Fix SPI: shorten timing & simplify API #11

Merged
sb10q merged 9 commits from fix-spi into master 2021-06-23 13:29:36 +08:00

This PR will improve timing on the SPI bus by achieving the following goals:

  1. Shorten the overall duration of SPI transaction.
  2. Reduce the complexity of user-defined parameters in the driver API.

Goal 1: Shorten the SPI timing

The following changes are relevant to reducing time required to perform any SPI transaction:

  1. Impose stricter NSS timing: as long as we provide a margin when calling the delay function (e.g. as simple as rounding up the calculation result for the CPU cycle count), it is safe to match the delay duration in the code with the value on the datasheet as close as possible.
  2. Use fixed-length SPI opcodes: as I commented on an issue (link), ENC424J600 provides a specific set of fixed-length SPI instructions which does not require CS_n deassertion in-between, saving time delayed for meeting the timing requirements to toggle CS_n. As a result, 8 types of SPI commands can finish earlier.

Goal 2: Simplify the API

This goal is mainly a direct response to my questions raised in #9. The following changes are relevant to simplifying the driver API:

  1. Impose cortex_m::asm::delay() for ns-delays: it is obvious that the embedded_hal SPI feature simply does not check that the SCK clock has stopped before the transfer function returns (c.f. issue #264 of embedded-hal repo), so we must impose delay in-between SPI transfer calls and CS_n assertions/deassertions to observe the SPI spec and the electrical timing behaviour. To precisely control the timing, this PR adds a crate feature cortex-m-cpu to enable "instruction cycle"-based delays for Cortex-M platforms. Further discussion in the next section.
  2. Eliminate delay_ns function pointer: no matter the platform supports cortex-m or not, one of the main caveats of having the user to supply delay_ns is that we cannot validate the function, and, more importantly, it complicates the API.
  3. Keep embedded_hal DelayUs for device reset: this is specific to ENC424J600, and all the user needs to do is to pass a mutable reference of a DelayUs implementation to the reset function call, often just once per appplication. The timescale is μs and not ns.

Discussion

Point 1: ns-delays are unlikely to appear on external embedded_hal-based libraries

As long as we stick with the nanosecond timescale for CS_n assertion/deassertion, I believe we do not need to rely on external definitions or implementations of accurate delays in this timescale. In my reasoning, every application is free to choose a system clock at lower megahertzes while adopting our driver, giving little motivation for the maintainers of embedded_hal libaries to include an additional DelayNs trait for portability.

By reading some of their discussions (e.g. issue #63 of embedded-hal repo), I have also gathered that most use cases of nanosecond delays could be worked around with interrupts or tick timers, which are hardware-specific. So, instead of borrowing an external portable feature from the HAL or elsewhere, we should simply define our own nanosecond delay feature here, which is based on instruction cycles on Cortex-M.

There is also a specific case of the AsmDelay crate for Cortex-M. It is capable of implementing a theoretical DelayNs, but I have my scepticism because it currently takes into account the fact that cortex_m::asm::delay() could take 2 CPU cycles on M7 rather than 1, but that does not seem to describe the behaviour of all Cortex-M cores. On M4 platforms such as Booster, delay() would take 3-4 cycles so the current implementation of AsmDelay would produce even longer and less accurate delays, which violates my goal 1. I can go and discuss this with the maintainers.

Point 2: CS_n delays are only implemented for Cortex-M targets and optionally

In this PR, we only care about producing ns-delays on Cortex-M platforms. In reality, embedded-hal is used on many other CPUs as well, so my PR has some degree of extensibility for other platforms in the future, and they are likely to be based on consumption of CPU cycles on their ISA -- if we assume that the CPU clock is at hundreds of MHz. No delay pointers should be required to extend ns-delays on other CPUs.

Besides, in my opinion, by letting the user choose to enable the cortex-m-cpu crate feature, if the CPU clock is too low to produce ns-delays any way, they can omit the ns-delays since the time between SPI transfer() calls and CS_n assertions/deassertions would be in the order of microseconds.

This PR will improve timing on the SPI bus by achieving the following goals: 1. Shorten the overall duration of SPI transaction. 2. Reduce the complexity of user-defined parameters in the driver API. ## Goal 1: Shorten the SPI timing The following changes are relevant to reducing time required to perform any SPI transaction: 1. **Impose stricter NSS timing:** as long as we provide a margin when calling the delay function (e.g. as simple as rounding up the calculation result for the CPU cycle count), it is safe to match the delay duration in the code with the value on the datasheet as close as possible. 2. **Use fixed-length SPI opcodes:** as I commented on an issue ([link](https://git.m-labs.hk/M-Labs/ENC424J600/issues/9#issuecomment-2125)), ENC424J600 provides a specific set of fixed-length SPI instructions which does not require CS_n deassertion in-between, saving time delayed for meeting the timing requirements to toggle CS_n. As a result, [8 types of SPI commands](https://git.m-labs.hk/M-Labs/ENC424J600/src/commit/fbcc3778d27cfbeec7a1395c9b13a00c8a26af9a/src/spi.rs#L18-L27) can finish earlier. ## Goal 2: Simplify the API This goal is mainly a direct response to my questions raised in #9. The following changes are relevant to simplifying the driver API: 1. **Impose `cortex_m::asm::delay()` for ns-delays:** it is obvious that the embedded_hal SPI feature simply does not check that the SCK clock has stopped before the transfer function returns (c.f. [issue #264 of embedded-hal repo](https://github.com/rust-embedded/embedded-hal/issues/264)), so we must impose delay in-between SPI transfer calls and CS_n assertions/deassertions to observe the SPI spec and the electrical timing behaviour. To precisely control the timing, this PR adds a crate feature `cortex-m-cpu` to enable "instruction cycle"-based delays for Cortex-M platforms. Further discussion in the next section. 2. **Eliminate `delay_ns` function pointer:** no matter the platform supports cortex-m or not, one of the main caveats of having the user to supply `delay_ns` is that we cannot validate the function, and, more importantly, it complicates the API. 3. **Keep embedded_hal `DelayUs` for device reset:** this is specific to ENC424J600, and all the user needs to do is to pass a mutable reference of a `DelayUs` implementation to the reset function call, often just once per appplication. The timescale is μs and not ns. ## Discussion ### Point 1: ns-delays are unlikely to appear on external embedded_hal-based libraries As long as we stick with the nanosecond timescale for CS_n assertion/deassertion, I believe we do not need to rely on external definitions or implementations of accurate delays in this timescale. In my reasoning, every application is free to choose a system clock at lower megahertzes while adopting our driver, giving little motivation for the maintainers of embedded_hal libaries to include an additional `DelayNs` trait for portability. By reading some of their discussions (e.g. [issue #63 of embedded-hal repo](https://github.com/rust-embedded/embedded-hal/issues/63)), I have also gathered that most use cases of nanosecond delays could be worked around with interrupts or tick timers, which are hardware-specific. So, instead of borrowing an external portable feature from the HAL or elsewhere, we should simply define our own nanosecond delay feature here, which is based on instruction cycles on Cortex-M. There is also a specific case of the `AsmDelay` crate for Cortex-M. It is capable of implementing a theoretical `DelayNs`, but I have my scepticism because it currently takes into account the fact that [`cortex_m::asm::delay()` could take 2 CPU cycles on M7](https://github.com/rust-embedded/cortex-m/blob/7b66016c1d680f78f56403c056dbe6ca3ebc0e23/asm/inline.rs#L55-L58) rather than 1, but that does not seem to describe the behaviour of all Cortex-M cores. On M4 platforms such as Booster, `delay()` would take 3-4 cycles so the current implementation of `AsmDelay` would produce even longer and less accurate delays, which violates my goal 1. I can go and discuss this with the maintainers. ### Point 2: CS_n delays are only implemented for Cortex-M targets and optionally In this PR, we only care about producing ns-delays on Cortex-M platforms. In reality, embedded-hal is used on many other CPUs as well, so my PR has some degree of extensibility for other platforms in the future, and they are likely to be based on consumption of CPU cycles on their ISA -- if we assume that the CPU clock is at hundreds of MHz. No delay pointers should be required to extend ns-delays on other CPUs. Besides, in my opinion, by letting the user choose to enable the `cortex-m-cpu` crate feature, if the CPU clock is too low to produce ns-delays any way, they can omit the ns-delays since the time between SPI `transfer()` calls and CS_n assertions/deassertions would be in the order of microseconds.
harry added 9 commits 2021-06-23 12:57:25 +08:00
sb10q merged commit fbcc3778d2 into master 2021-06-23 13:29:36 +08:00
Sign in to join this conversation.
No reviewers
No Label
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: renet/ENC424J600#11
There is no content yet.