The CPU is allowed to access normal memory writes out-of-order. Here
the write to the OWN flag in the DMA descriptor (normal memory) was
placed after the DMA tail pointer advance (in device memory, so not
reorderable). This meant the ethernet DMA engine stalled as it saw
a descriptor it did not own, and only restarted and sent the packet
when the next packet was released.
This fix will work as long as the CPU data cache is disabled. If we
want to enable the cache, the simplest method would be to mark SRAM3
as uncacheable via the MPU.