This brings mean RPC time from ~45ms to ~2ms.
The cause of the slowness without buffering is, primarily, that lwip
is severely pessimized by small writes, whether with Nagle on or off.
(In fact, disabling Nagle makes it function *better* on many small
writes, which begs the question of what's the point of having Nagle
there in the first place.) In practical terms, the slowness appears
only when writing a 4-byte buffer (the synchronization segment);
writing buffers of other sizes does not trigger the problem.
This all is extremely confusing and the fix is partly palliative,
but since it seems to work reliably and we're migrating off lwip
I think it is unwise to spend any more time debugging this.
It takes ~4ms to print an empty log line because of how slow
the UART is. This makes the log timestamps useless for debugging
performance problems.
After this commit, it takes ~75us to print an empty log line instead,
which pessimizes test_rpc_timing by less than 2ms with tracing
enabled.
This enables constant propagation optimisations, as verified by
the included test case. This is only a first stop-gap measure, though;
we should support optimisation based on kernel invariants on a more
fine-grained level.
Previously, the phi emitted for the pointer parameter to recv_rpc was –
rather confusingly – called "size", and the pseudo-code in the comment
had bit-rotted.
Signed-off-by: David Nadlinger <code@klickverbot.at>
session.rs has code like:
while !rpc_queue::empty() {
try!(process_kern_queued_rpc(stream, &mut session))
}
// A
if mailbox::receive() != 0 {
try!(process_kern_message(waiter, Some(stream), &mut session));
}
If both an async and a mailbox RPC (async or large sync) are posted
at point A then they will be processed out of order.
This commit fixes the issue by flushing the async RPC queue before
posting any RPC to the mailbox.