Commit Graph

221 Commits

Author SHA1 Message Date
whitequark c19bd1da7c Make elaborated reasons for sending a TCP packet more precise. 2017-12-15 06:31:50 +00:00
whitequark 1c7ce85004 Log an elaborated reason for sending a TCP packet.
Otherwise, it is hard to debug e.g. ACK loops.
2017-12-15 05:24:17 +00:00
whitequark 6c83936872 Extract socket handle into a new SocketMeta structure.
This paves the way for adding more metadata apart from handles,
such as caches and so on.

This commit also removes SocketHandle::EMPTY in favor of
SocketHandle::default() since they are functionally identical.
2017-11-22 03:50:09 +00:00
Kai Lüke 2a52234d53 Don't panic on TCP sequence number wrapping in debug mode. 2017-11-13 18:03:36 +00:00
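The kind of arithmetic 2a52234d53 concerns can be sketched as below. This is a minimal illustration, assuming a hypothetical `SeqNumber` newtype over `u32`; it is not the library's own type. In debug builds a plain `+` on integers panics on overflow, whereas `wrapping_add` and `wrapping_sub` implement the modulo-2^32 arithmetic that TCP sequence numbers need.

```rust
use std::ops::{Add, Sub};

/// Hypothetical 32-bit TCP sequence number (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SeqNumber(pub u32);

impl Add<u32> for SeqNumber {
    type Output = SeqNumber;

    fn add(self, rhs: u32) -> SeqNumber {
        // wrapping_add never panics, even when the result passes u32::MAX.
        SeqNumber(self.0.wrapping_add(rhs))
    }
}

impl Sub for SeqNumber {
    type Output = i32;

    fn sub(self, rhs: SeqNumber) -> i32 {
        // Signed distance between two sequence numbers, modulo 2^32.
        self.0.wrapping_sub(rhs.0) as i32
    }
}
```

With this sketch, `SeqNumber(u32::MAX) + 2` yields `SeqNumber(1)` instead of panicking.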
Kai Lüke 3db035bbac Preserve retransmission timer for Dup-ACKs
Duplicate ACKs should not replace the retransmission timer,
but if not in retransmission, still set the keep-alive timer
as normal ACKs do.
2017-11-13 06:26:26 +00:00
Dan Robertson adb5014780 Add ICMP sockets
- Add support for ICMP sockets
- Add tests for ICMP sockets
- Rename proto-<type> features to socket-<type>
- Update documentation
2017-11-09 00:08:24 +00:00
whitequark fe6b04a29a Rework TcpSocket::{send,recv} to remove need for precomputing size.
Now, these functions give you the largest contiguous slice they can
grab, and you return however much you took from it.
2017-10-31 19:24:54 +00:00
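The calling convention described in fe6b04a29a (the callee offers the largest contiguous slice, the caller reports how much it used) can be sketched with a small ring buffer. The `Ring` type and its method names are hypothetical and chosen for illustration; they are not taken from the commit.

```rust
/// Hypothetical byte ring buffer (illustration only).
pub struct Ring {
    storage: Vec<u8>,
    read_at: usize,
    length: usize,
}

impl Ring {
    pub fn new(capacity: usize) -> Ring {
        Ring { storage: vec![0; capacity], read_at: 0, length: 0 }
    }

    pub fn len(&self) -> usize {
        self.length
    }

    /// Offer `f` the largest contiguous free slice; `f` returns how many
    /// octets it wrote, and only that many are committed.
    pub fn enqueue_with<F>(&mut self, f: F) -> usize
    where
        F: FnOnce(&mut [u8]) -> usize,
    {
        let capacity = self.storage.len();
        if capacity == 0 {
            return 0;
        }
        let write_at = (self.read_at + self.length) % capacity;
        let free = capacity - self.length;
        // The slice must not run past the end of the storage.
        let contiguous = free.min(capacity - write_at);
        let written = f(&mut self.storage[write_at..write_at + contiguous]);
        assert!(written <= contiguous);
        self.length += written;
        written
    }

    /// Offer `f` the largest contiguous filled slice; `f` returns how many
    /// octets it consumed.
    pub fn dequeue_with<F>(&mut self, f: F) -> usize
    where
        F: FnOnce(&[u8]) -> usize,
    {
        let capacity = self.storage.len();
        if capacity == 0 {
            return 0;
        }
        let contiguous = self.length.min(capacity - self.read_at);
        let taken = f(&self.storage[self.read_at..self.read_at + contiguous]);
        assert!(taken <= contiguous);
        self.read_at = (self.read_at + taken) % capacity;
        self.length -= taken;
        taken
    }
}
```

Because the closure sees the slice before deciding how much to take, the caller no longer has to compute a size up front.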
whitequark 1cd3625bf4 Small adjustments to TTL code; style, docs, and tests. NFCI. 2017-10-24 23:12:55 +00:00
Dan Robertson eef65d2676 Implement set_ttl for Tcp and Udp sockets
- Add the ttl member to the IpRepr
- Add the ttl member along with setters and getters to the tcp and udp
  socket types
- Add unit tests for the new set_ttl parameter
- Update usage of IpRepr to include the ttl value
2017-10-24 23:02:18 +00:00
whitequark 52600cd521 Replace "socket debug identifiers" with just socket handles.
This is basically a rename that now calls an apple an apple,
except user code can no longer change it. It's not obvious if
user code getting the socket handle from the socket is very useful,
but it's not harmful either, so why not.
2017-10-05 03:02:41 +00:00
Egor Karavaev 5be76b8dc1 Add support for IPv4 default gateway. 2017-10-03 15:17:29 +00:00
Steffen Butzer f9093d15aa phy: introduce hardware based checksum settings, rename DeviceLimits
this contains a rename of occurrences of
DeviceLimits -> DeviceCapabilities.
2017-10-02 21:40:08 +00:00
whitequark 9c3534e452 A few cosmetic changes to TCP socket trace messages.
No functional change.
2017-09-24 13:39:23 +00:00
whitequark 32d720831a Completely redo the logic of TCP socket polling.
The previous implementation made no sense. It is obvious that
poll_at() should use the same mechanisms to decide whether dispatch()
should be called as dispatch() itself uses to decide whether to send
anything.

This fixes numerous busy looping issues that arise if the return
value of poll() is used for waiting.
2017-09-24 13:20:56 +00:00
whitequark 96b284a30f Fix two issues that may cause TCP sockets to be polled too late.
1. Apart from non-empty transmit buffer, a state which transmits
   a FIN flag should also be considered. Otherwise, closing a socket
   with an empty transmit buffer may retransmit the FIN flag forever.
2. Timeout poll requests should only be overridden by timer poll
   requests when the latter is earlier.
2017-09-24 11:04:55 +00:00
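The second point of 96b284a30f amounts to keeping the earlier of two optional deadlines. A minimal sketch, assuming deadlines are millisecond timestamps and `None` means "no request" (the function name and representation are hypothetical):

```rust
/// Return the earlier of two optional deadlines, in milliseconds.
/// `None` means that no poll was requested by that source.
pub fn earliest(a: Option<u64>, b: Option<u64>) -> Option<u64> {
    match (a, b) {
        // Both present: a later request must not override an earlier one.
        (Some(a), Some(b)) => Some(a.min(b)),
        // Only one present: use it.
        (Some(t), None) | (None, Some(t)) => Some(t),
        (None, None) => None,
    }
}
```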
whitequark 8db752eb7a Fix a typo in documentation. 2017-09-24 08:23:46 +00:00
whitequark 711c05f7b1 Make TCP more RFC 5681 compliant wrt immediate ACKs. 2017-09-22 18:57:47 +00:00
whitequark a5a9bb44c9 Clarify inline TCP documentation. 2017-09-22 18:40:23 +00:00
whitequark b90cee43bb Fix a warning. 2017-09-22 17:38:58 +00:00
whitequark 8166037a91 Handle buffer wraparound in TCP reassembly code. 2017-09-22 17:10:51 +00:00
whitequark 5aae15aaba Fix an issue where TCP packets would have zero IP payload length. 2017-09-22 10:14:26 +00:00
whitequark a9d6a9b7b8 Fix a few warnings. 2017-09-22 10:06:40 +00:00
whitequark 7a2cd7842d Implement reassembly of out-of-order TCP segments. 2017-09-22 10:06:40 +00:00
whitequark b6e4e23868 Clean up PSH flag handling. 2017-09-22 08:15:29 +00:00
whitequark d7da46fd3b Do not clear remote_last_ts in TcpSocket::set_timeout().
After this change, if a 1s timeout is set on a socket that received
no packet for 2s, it will be instantly aborted, which seems
more reasonable. Also, this makes the `self.remote_last_ts.is_none()`
branch in TcpSocket::dispatch() actually behave as described.
2017-09-22 06:16:58 +00:00
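The behavior described in d7da46fd3b can be sketched as a pure function. This is an illustration under assumptions: timestamps are in milliseconds, and a missing timestamp or missing timeout means the socket never times out. The name `timed_out` and its signature are hypothetical.

```rust
/// Decide whether a socket has timed out.
/// `remote_last_ms` is when the peer was last heard from, if ever;
/// `timeout_ms` is the configured timeout, if any.
pub fn timed_out(now_ms: u64, remote_last_ms: Option<u64>, timeout_ms: Option<u64>) -> bool {
    match (remote_last_ms, timeout_ms) {
        // The deadline is measured from the last packet received, not from
        // the moment the timeout was configured.
        (Some(last), Some(timeout)) => now_ms >= last.saturating_add(timeout),
        _ => false,
    }
}
```

With the last packet at 0 ms and the clock at 2000 ms, setting a 1000 ms timeout makes `timed_out` return `true` immediately, matching the example in the commit message.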
whitequark f23bfe2014 Clean up and document TCP MSS calculations. 2017-09-22 06:02:03 +00:00
whitequark 9281733209 Implement TCP timeouts. 2017-09-18 11:05:40 +00:00
whitequark fe80bca19d Implement TCP keep-alive. 2017-09-16 10:54:59 +00:00
whitequark be0dcb145c Fix warnings. 2017-09-15 06:05:41 +00:00
whitequark e8788be3a0 RingBuffer::{empty,full}→is_{empty,full}.
Query methods in Rust conventionally start with the "is" prefix.
2017-09-08 23:23:40 +00:00
whitequark 5ffce2c1a6 Remove Error::Rejected.
This wasn't an actual error, just a poorly designed communication
mechanism between sockets and whatever lies on the layer below them.
2017-09-08 00:59:46 +00:00
Egor Karavaev 8404fe908c Factor out TcpSocket::accepts. 2017-09-08 00:57:42 +00:00
whitequark 9b242c7099 Send a challenge ACK in response to an unacceptable TCP ACK. 2017-09-08 00:11:44 +00:00
whitequark 1102bd94e7 Refactor the "random access" ring buffer interface. 2017-09-07 23:47:42 +00:00
whitequark 5dc0353b2a Refactor the "continuous" ring buffer interface.
This also makes TcpSocket::{send,recv}_slice slightly more efficient
in case when the slice wraps around the corresponding buffer,
halving the necessary amount of calls.
2017-09-07 21:17:31 +00:00
whitequark a9719f4a13 Merge the TCP ring buffer and generic ring buffer.
This adds a few methods to RingBuffer that don't quite fit into
its interface (the slice ones), but we can fix that later.
2017-09-07 00:09:34 +00:00
whitequark 27a23ed3c3 Don't instantly ACK every received TCP segment.
This is prohibited by RFC 1122. Instead, aggregate them, and respond
once in the egress function.

However, still instantly send challenge ACKs to trigger fast
retransmit.
2017-09-05 23:23:50 +00:00
whitequark a71542b2f2 Allow querying the size of the TCP transmit and receive buffers.
This may be useful to e.g. ensure the complete transmit buffer
is flushed before proceeding.
2017-08-31 13:22:20 +00:00
whitequark 13dbe09fba TCP socket debug messages "sending <flags>" should be at DEBUG level.
They're not exceptional.
2017-08-31 12:39:05 +00:00
whitequark 80e510c9e2 Dispatch a TCP ACK every time window increases at all.
Commit 21282fdb was not completely sufficient because e.g. receiving
one octet and then blocking meant that an ACK with window length
of one is then sent, and this isn't motivating the other TCP stacks
all that much.

In any case, preemptively notifying the peer of window size increase
can only be beneficial, modulo network congestion.
2017-08-31 00:14:29 +00:00
whitequark 5e48567297 Send a TCP ACK after window increases from zero to non-zero. 2017-08-30 23:09:31 +00:00
whitequark 6324d3384a More rigorously treat the TcpSocket::remote_last_ack field.
Zero is a valid sequence number, treating it as an absence of value
isn't any good. This is unlikely to cause any real harm but it
just isn't good practice, nor does it make for understandable code.
2017-08-30 22:40:07 +00:00
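The point of 6324d3384a, that zero is a valid sequence number and cannot double as "no value", can be illustrated with `Option`. The `AckTracker` type below is hypothetical and only loosely modeled on the field named in the commit.

```rust
/// Hypothetical tracker for the last ACK number we transmitted.
pub struct AckTracker {
    // `None` means nothing was sent yet; `Some(0)` is a real ACK number.
    last_ack: Option<u32>,
}

impl AckTracker {
    pub fn new() -> AckTracker {
        AckTracker { last_ack: None }
    }

    /// Remember the ACK number that was just transmitted.
    pub fn record(&mut self, ack: u32) {
        self.last_ack = Some(ack);
    }

    /// True if `current` has not been acknowledged to the peer yet.
    pub fn needs_ack(&self, current: u32) -> bool {
        self.last_ack != Some(current)
    }
}
```

With a zero sentinel, a fresh tracker would wrongly report that ACK number 0 had already been sent; with `Option` the two cases stay distinct.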
whitequark 41ceeea9ac Fix the TCP SEQ acceptability check.
It has nothing to do with the last ACK transmitted.
2017-08-30 22:26:35 +00:00
whitequark c2bc20e9bf Move the TCP receive window clamping hack downwards in stack.
Otherwise, our response ACKs did not get the clamping treatment,
and severe packet loss resulted.

Also, explain why it's needed and how it works.
2017-08-30 14:00:14 +00:00
whitequark 7b3574e6ee Rework TCP retransmit logic to be much more robust.
Before this commit, if the amount of data in the buffer caused it
to be split among many outgoing packets, and retransmit timer
was active, the socket would behave very erratically and flood
the peer.
2017-08-30 10:12:53 +00:00
whitequark 996389d653 Compute soft deadline in poll() and use nonblocking sockets.
Before this commit, anything that touched RawSocket or TapInterface
worked partly by accident and partly because of a horrible crutch
that resulted in massive latencies as well as inevitable packet loss
every time an ARP request had to be issued. Also, there was no way
to use poll() other than by continuously calling it in a busy loop.

After this commit, poll() indicates when the earliest timer expires,
and so the caller can sleep until that moment (or until packets
arrive).

Note that there is a subtle problem remaining: every time poll()
is called, every socket with a pending outbound packet whose
IP address doesn't correspond to a MAC address will send a new
ARP request, resulting in potentially a whole lot of such requests.
ARP rate limiting is a separate topic though.
2017-08-29 19:47:11 +00:00
whitequark 7cd7bd4683 Reset the timer transitioning from TCP FIN-WAIT-1 to FIN-WAIT-2.
We don't transmit anything in FIN-WAIT-2, so we don't need the timer
running, or we'll get spurious log messages about retransmissions.
This also makes logic cleaner, although with no functional change.
2017-08-29 19:47:11 +00:00
whitequark f84b234a38 Add a test for TCP sockets returning ACKs even with zero window. 2017-08-28 06:01:21 +00:00
whitequark 34db543cac Factor out packet parsing from Socket::process.
Not only is it incredibly wasteful, but this information is
in any case already necessary within the EthernetInterface::process_*
methods.
2017-08-28 05:48:16 +00:00
whitequark 379bc60924 Get rid of unused arguments in Socket::{process,dispatch}.
I've left those "for consistency" but it just implies data
dependencies where there are none, and bloats signatures.
2017-08-28 04:14:25 +00:00
whitequark 760174048d Get rid of IpPayload and indirection in Socket::dispatch.
This was just completely pointless, and only served to obfuscate
the data path and make testing harder.
2017-08-28 03:56:34 +00:00
whitequark 1d01189278 Use FnOnce, not FnMut, in Socket::dispatch() functions.
There was never any reason to use FnMut and this significantly
simplifies the job of the borrow checker.
2017-08-28 00:59:33 +00:00
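Why `FnOnce` is the less restrictive bound for callers, as 1d01189278 argues, can be seen in a small sketch. The `Packet` type and the `dispatch` signature here are hypothetical, not the library's; the only assumption is that the callback is invoked at most once.

```rust
/// Hypothetical outgoing packet (illustration only).
pub struct Packet {
    pub payload: Vec<u8>,
}

/// Call `emit` at most once, and only if there is something to send.
pub fn dispatch<F>(has_data: bool, emit: F) -> Option<usize>
where
    F: FnOnce(Packet) -> usize,
{
    if !has_data {
        return None;
    }
    Some(emit(Packet { payload: vec![1, 2, 3] }))
}
```

Since `emit` is `FnOnce`, a caller may pass a closure that moves a captured value out of its environment, which an `FnMut` bound would reject.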
whitequark 4878f0f1d6 Fix a TCP retransmit loop after transition from SYN-SENT to ESTABLISHED. 2017-08-25 06:47:12 +00:00
whitequark 3035ef07fb Fix a bug that caused TCP packets with PSH bit to be dropped. 2017-08-25 06:27:00 +00:00
whitequark afdf73ffef TcpRepr::push → TcpControl::Psh.
This is done for simplification. FIN implies PSH, RST doesn't have
any meaning with PSH, and SYN|PSH only makes sense in the context
of TCP Fast Open, where any data in the original SYN already
implies PSH.
2017-08-25 06:05:17 +00:00
whitequark 5396687d52 Immediately ACK payload in response to TCP packets that have any.
This will lower the average software RTT by about a factor of two.
2017-08-25 05:50:43 +00:00
whitequark 5b2de544c8 Radically simplify and optimize TCP packet dispatch.
This commit completely reworks packet dispatch in TCP sockets,
and brings significant improvements to processing as well.

In particular:
  * Challenge ACKs now do not reset retransmit timer; instead,
    TcpSocket::process directly returns a TcpRepr without altering
    any internal state at all.
  * Retransmit and close (aka TIME-WAIT) timers are unified
    and restructured into an enum that actually matches semantics
    of the timers.
  * If a packet cannot be emitted, no internal state is changed.
  * The dispatch of RST packets in case of connection abort
    is brought in line with dispatch of all other packets.
  * Packet dispatch now follows a series of steps with clean
    separation of concerns, like packet processing:
      1. If we should retransmit, update state to assume that
         all in-flight packets are lost.
      2. Prepare the packet that would be sent next, considering
         the in-flight packets, if any.
      3. Check if the packet contains anything new, or it's the same
         as the one already in flight. If it is, bail.
      4. Finalize and try to actually transmit the packet.
         If we can't do that, bail.
      5. Update the internal state to reflect that the packet
         we've just sent is in flight.
2017-08-25 03:53:31 +00:00
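The five dispatch steps listed in 5b2de544c8 can be sketched on a heavily simplified sender. Everything here is an assumption made for illustration: the `Sender` type, its fields, and the `emit` callback that returns `false` when the packet cannot be transmitted. Flags, timers and sequence numbers are left out.

```rust
/// Hypothetical, heavily simplified sender state (illustration only).
pub struct Sender {
    pub tx: Vec<u8>,          // unacknowledged data, oldest first
    pub in_flight: usize,     // octets of `tx` already sent, awaiting ACK
    pub retransmit_due: bool, // set when the retransmit timer expired
    pub mss: usize,           // maximum payload per segment
}

impl Sender {
    /// Returns the payload that was emitted, if any.
    pub fn dispatch<F>(&mut self, emit: F) -> Option<Vec<u8>>
    where
        F: FnOnce(&[u8]) -> bool,
    {
        // 1. If we should retransmit, assume all in-flight data is lost.
        //    A local copy is used so a failed emit changes no state.
        let in_flight = if self.retransmit_due { 0 } else { self.in_flight };
        // 2. Prepare the segment that would be sent next.
        let start = in_flight.min(self.tx.len());
        let end = (start + self.mss).min(self.tx.len());
        let segment = &self.tx[start..end];
        // 3. Nothing new to send: bail.
        if segment.is_empty() {
            return None;
        }
        // 4. Try to transmit; if that fails, bail.
        if !emit(segment) {
            return None;
        }
        // 5. Record that the segment is now in flight.
        let sent = segment.to_vec();
        self.in_flight = end;
        self.retransmit_due = false;
        Some(sent)
    }
}
```

Keeping all state updates in the last step is what makes the "no internal state is changed" guarantee easy to check in this sketch.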
whitequark a0d359fc53 Make TCP packets not matching socket state return Error::Dropped.
Error::Malformed is only for internally inconsistent packets,
like SYN|FIN.
2017-08-25 03:52:30 +00:00
whitequark 9d0084171f Rework responses to TCP packets and factor in RST replies to TcpSocket. 2017-08-22 22:32:05 +00:00
whitequark f9c6c0249b Listening TCP sockets should reject, not drop, ACKs. 2017-08-21 07:29:32 +00:00
whitequark 39f328b80f Add TcpSocket::{peek,peek_slice}. 2017-08-21 07:28:38 +00:00
whitequark e0d8fcfb72 Emit exceptional events with the DEBUG log level, not TRACE. 2017-08-01 11:21:58 +00:00
whitequark db5ecb353a Only return Err(Rejected) from TcpSocket::process on endpoint mismatch.
Otherwise, a future SocketSet that tries to route packets by using
IP/port numbers instead of brute force would result in different
semantics.
2017-08-01 11:15:12 +00:00
whitequark 7825bc6070 Implement fmt::Write for TcpSocket. 2017-07-30 06:59:01 +00:00
whitequark ae903e8841 Add UdpSocket::is_open, similar to TcpSocket::is_open in function.
Fixes #31.
2017-07-30 01:17:58 +00:00
whitequark 02f005a2bb Put the debug_id field first in sockets.
This has been annoying me far too long.
2017-07-28 11:55:59 +00:00
whitequark ad12573f62 Rework and test UDP sockets.
Before, errors such as packets not fitting into a buffer would have
resulted in panics, and errors such as unbound sockets were
simply ignored.
2017-07-27 22:30:01 +00:00
whitequark 8d8a4ea583 Get rid of Result<_, ()>.
The use of this type has several drawbacks:
  * It does not allow distinguishing between different error
    conditions. In fact, we wrongly conflated some of them
    before this commit.
  * It does not allow propagation via ? and requires manual use
    of map_err, which is especially tiresome for downstream code.
  * It prevents us from expanding the set of error conditions
    even if right now we have only one.
  * It prevents us from blanket using Result<T> everywhere
    (a nitpick at most).

Instead, use Result<T, Error> everywhere, and differentiate error
conditions where applicable.
2017-07-27 13:55:47 +00:00
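The drawbacks listed in 8d8a4ea583 are easiest to see next to the alternative. The sketch below uses a hypothetical two-variant `Error` enum and hypothetical helper functions; it only illustrates that a shared error type lets `?` propagate failures and keeps different conditions distinguishable.

```rust
/// Hypothetical error type (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Error {
    /// The input ended before the announced length.
    Truncated,
    /// The output buffer is too small.
    Exhausted,
}

/// Read a one-octet length prefix.
pub fn parse_len(buf: &[u8]) -> Result<usize, Error> {
    let first = buf.first().ok_or(Error::Truncated)?;
    Ok(*first as usize)
}

/// Copy the length-prefixed payload of `buf` into `out`.
pub fn copy_payload(buf: &[u8], out: &mut [u8]) -> Result<usize, Error> {
    // `?` propagates the callee's error unchanged; no map_err is needed.
    let len = parse_len(buf)?;
    let payload = buf.get(1..1 + len).ok_or(Error::Truncated)?;
    let dst = out.get_mut(..len).ok_or(Error::Exhausted)?;
    dst.copy_from_slice(payload);
    Ok(len)
}
```

With `Result<_, ()>`, the truncated-input and full-buffer cases would be indistinguishable to the caller.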
whitequark 8a2432dcd7 Rework error handling in TcpSocket::connect. 2017-07-27 12:27:33 +00:00
whitequark 24bb0eab9d Update a stale docstring. 2017-07-27 11:26:39 +00:00
whitequark a3423b35f4 as_unspecified → to_unspecified 2017-07-27 11:26:07 +00:00
whitequark 38afc64f61 Accept TCP FIN packets in SYN-RECEIVED state. 2017-07-23 23:51:56 +00:00
whitequark 1c41f2d7fa Fix determination of local address from incoming packets.
We've advertised this capability before in examples, but it did not
actually work.
2017-07-23 23:07:55 +00:00
whitequark bdfc47d633 Log correct delay when performing TCP retransmit.
Before, the delay was erroneously multiplied by 2 and also did not
take processing delay into account.
2017-07-23 05:09:38 +00:00
whitequark 5556f09351 Annotate all simple getters with #[inline]. 2017-07-04 18:46:36 +00:00
whitequark 3f9805b2c1 Remove *Socket::{process,dispatch} from public interface.
These no longer have to be public, since our required Rust version
has pub(crate).
2017-06-30 20:55:46 +00:00
whitequark 938fb99070 In {Tcp,Udp}Socket::process, make protocol check an assertion.
We filter sockets by type upstream of the process() calls.
2017-06-30 19:17:14 +00:00
whitequark cbf6e5cdbc Try to trigger fast retransmit when we detect a missing TCP segment.
The changes in this commit affect the following scenario:
  * Remote end sends octets 1..2, they are delivered and buffered
    on local end;
  * Remote end sends octets 3..4, they are lost;
  * Remote end sends octets 5..6, they are delivered but cannot
    be buffered on local end because we don't perform reassembly.

Before this commit, we would silently drop the segment with octets
5..6, relying on retransmission timer on the remote end. This works,
but can result in a severe decrease in throughput. After this commit,
we send a duplicate ACK, which may trigger fast retransmit, if
implemented by the congestion control algorithm on the remote end.
2017-06-26 08:44:07 +00:00
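The scenario in cbf6e5cdbc can be sketched as a receiver-side decision. This assumes a receiver without reassembly that tracks only the next expected sequence number; the `Action` enum and `on_segment` function are hypothetical names used for illustration.

```rust
/// What a receiver without reassembly does with an incoming segment.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// In-order segment: buffer it and acknowledge past its end.
    Accept { ack: u32 },
    /// Out-of-order segment: drop it and repeat the last ACK.
    DuplicateAck { ack: u32 },
}

/// `expected` is the next sequence number we are waiting for.
pub fn on_segment(expected: u32, seq: u32, len: u32) -> Action {
    if seq == expected {
        Action::Accept { ack: expected.wrapping_add(len) }
    } else {
        // Repeating the ACK tells the peer which octet is still missing.
        Action::DuplicateAck { ack: expected }
    }
}
```

For the octets in the commit message: after 1..2 are accepted the receiver expects 3, so the segment starting at 5 produces a duplicate ACK for 3.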
whitequark 050dd788c1 Set TCP PSH flag when sending the last buffered data. 2017-06-25 23:12:30 +00:00
whitequark b86d22701d Expose PSH flag in TcpRepr. 2017-06-25 08:20:25 +00:00
whitequark db75f70fa3 Fix a few typos in TCP socket code (NFC). 2017-06-25 08:05:37 +00:00
whitequark 74823b0dff try! → ? 2017-06-24 16:34:32 +00:00
whitequark 8b27330c8b Do not attempt to validate length of packets being emitted.
This is a form of an uninitialized read bug; although safe, it caused
panics. In short, transmit buffers received from the network stack
should be considered uninitialized (in practice they will often
contain previously transmitted packets or parts thereof). Wrapping
them with the only method we had (e.g. Ipv4Packet) treated the buffer
as if it contained a valid incoming packet, which can easily fail
with Error::Truncated.

This commit splits every `fn new(buffer: T) -> Result<Self, Error>`
method on a `Packet` into three smaller ones:
  * `fn check_len(&self) -> Result<(), Error>`, purely a validator;
  * `fn new(T) -> Self`, purely a wrapper;
  * `fn new_checked(T) -> Result<Self, Error>`, a validating wrapper.

This makes it easy to process ingress packets (using `new_checked`),
egress packets (using `new`), and, if needed, maintain the invariants
at any point during packet construction (using `check_len`).

Fixes #17.
2017-06-24 11:42:32 +00:00
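The three-way split described in 8b27330c8b can be sketched on a generic wrapper. The method names follow the commit message; the `Packet` body, the `HEADER_LEN` constant and the single-variant `Error` enum are assumptions made for illustration.

```rust
/// Hypothetical error type (illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    Truncated,
}

/// Hypothetical fixed header size of the wrapped packet format.
const HEADER_LEN: usize = 4;

/// A thin wrapper over a byte buffer.
pub struct Packet<T: AsRef<[u8]>> {
    buffer: T,
}

impl<T: AsRef<[u8]>> Packet<T> {
    /// Purely a wrapper: suitable for egress buffers whose contents
    /// are not yet meaningful.
    pub fn new(buffer: T) -> Packet<T> {
        Packet { buffer }
    }

    /// A validating wrapper: suitable for ingress packets.
    pub fn new_checked(buffer: T) -> Result<Packet<T>, Error> {
        let packet = Self::new(buffer);
        packet.check_len()?;
        Ok(packet)
    }

    /// Purely a validator: can be called at any point during construction.
    pub fn check_len(&self) -> Result<(), Error> {
        if self.buffer.as_ref().len() < HEADER_LEN {
            Err(Error::Truncated)
        } else {
            Ok(())
        }
    }
}
```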
whitequark f29b610801 Remove accidentally committed change. 2017-05-29 17:15:35 +00:00
whitequark 91ef5c60c3 Add packet shaping to the fault injector. 2017-05-29 10:53:30 +00:00
whitequark 73bb4b8593 Implement TCP TIME-WAIT timeouts. 2017-04-21 16:01:49 +00:00
whitequark 555825e49e Clamp TCP receive window to MSS multiplied by maximum burst size.
This is a conservative bound; if we don't have enough buffers
to receive more than four segments, clearly we shouldn't advertise
our ability to.

It however will only work reliably with exactly one TCP connection
continuously receiving; for two, another window adjustment mechanism
will be needed for reliable reception.
2017-03-07 11:21:49 +00:00
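The bound described in 555825e49e is a simple minimum. A minimal sketch, assuming the free receive buffer space, the MSS and the maximum burst size are all available as plain integers; the function name and parameters are hypothetical.

```rust
/// Clamp the advertised receive window to `mss * max_burst`.
pub fn clamp_window(free: usize, mss: usize, max_burst: usize) -> u16 {
    // Never advertise more than the device can buffer in one burst.
    let bound = mss.saturating_mul(max_burst);
    // The (unscaled) TCP window field is 16 bits wide.
    free.min(bound).min(u16::MAX as usize) as u16
}
```

With an MSS of 1460 and a burst size of 4, a receiver with 65535 free octets would advertise 5840.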
whitequark 1d46ccf432 fn Device::mtu() -> usize → Device::limits() -> DeviceLimits 2017-03-07 11:10:26 +00:00
whitequark c73298f01e Improve handling of TCP ACK packets in FIN-* states.
Previously, sockets could get stuck in the CLOSING state, after
the sequence described in the new test_mutual_close_with_data_2.
The root cause was that some state machine transitions got
folded into ACK handling.

Now, all transitions are handled in the same match statement,
and ACK handling is broken up around it.
2017-03-07 10:17:30 +00:00
whitequark 0836bc949a Fix condition for acceptable RST|ACK in TCP SYN-SENT state. 2017-03-05 05:31:12 +00:00
whitequark cd894460f5 Implement the TCP SYN-SENT state. 2017-03-05 03:53:04 +00:00
whitequark e36e05905c Fix an incorrect tracing message. 2017-03-05 03:16:15 +00:00
whitequark 255d69d63b Fix the TCP MSS calculation. 2017-01-31 11:39:33 +00:00
whitequark da4900a02b Fix an incorrect payload length when sending TCP MSS option. 2017-01-27 03:35:22 +00:00
whitequark d6b7623c37 Receive the TCP MSS option and act on it. 2017-01-27 03:06:52 +00:00
whitequark bc1d65ea89 Send the TCP MSS option. 2017-01-27 02:56:27 +00:00
whitequark 077513fda6 Add support for TCP MSS option in TCP representation. 2017-01-27 02:56:27 +00:00
whitequark 4267ad2635 Fix an inaccurate comment. 2017-01-26 22:04:05 +00:00
whitequark 9b8671ce15 Fix the TCP FIN emission with queued data rolling over TX buffer. 2017-01-25 06:20:57 +00:00
whitequark 9fbb5cfff1 Fix the TCP ACK handling in FIN-WAIT-1 state with queued data. 2017-01-25 06:01:58 +00:00