See also commit feed91d; that commit fixed the test_rpc_timing test,
but caused frequent hangs elsewhere, which were also caused by buggy
Nagle implementation. Just disable this entirely, as with our
explicit buffering it provides no benefit anyway.
This brings mean RPC time from ~45ms to ~2ms.
The cause of the slowness without buffering is, primarily, that lwip
is severely pessimized by small writes, whether with Nagle on or off.
(In fact, disabling Nagle makes it function *better* on many small
writes, which begs the question of what's the point of having Nagle
there in the first place.) In practical terms, the slowness appears
only when writing a 4-byte buffer (the synchronization segment);
writing buffers of other sizes does not trigger the problem.
This all is extremely confusing and the fix is partly palliative,
but since it seems to work reliably and we're migrating off lwip
I think it is unwise to spend any more time debugging this.
It takes ~4ms to print an empty log line because of how slow
the UART is. This makes the log timestamps useless for debugging
performance problems.
After this commit, it takes ~75us to print an empty log line instead,
which pessimizes test_rpc_timing by less than 2ms with tracing
enabled.
This enables constant propagation optimisations, as verified by
the included test case. This is only a first stop-gap measure, though;
we should support optimisation based on kernel invariants on a more
fine-grained level.
Previously, the phi emitted for the pointer parameter to recv_rpc was –
rather confusingly – called "size", and the pseudo-code in the comment
had bit-rotted.
Signed-off-by: David Nadlinger <code@klickverbot.at>