Fix SPI: shorten timing & simplify API #11

harry · 2021-06-23T12:57:25+08:00

harry commented

2021-06-23 12:57:25 +08:00

This PR will improve timing on the SPI bus by achieving the following goals:

Shorten the overall duration of SPI transaction.
Reduce the complexity of user-defined parameters in the driver API.

Goal 1: Shorten the SPI timing

The following changes are relevant to reducing time required to perform any SPI transaction:

Impose stricter NSS timing: as long as we provide a margin when calling the delay function (e.g. as simple as rounding up the calculation result for the CPU cycle count), it is safe to match the delay duration in the code with the value on the datasheet as close as possible.
Use fixed-length SPI opcodes: as I commented on an issue (link), ENC424J600 provides a specific set of fixed-length SPI instructions which does not require CS_n deassertion in-between, saving time delayed for meeting the timing requirements to toggle CS_n. As a result, 8 types of SPI commands can finish earlier.

Goal 2: Simplify the API

This goal is mainly a direct response to my questions raised in #9. The following changes are relevant to simplifying the driver API:

Impose cortex_m::asm::delay() for ns-delays: it is obvious that the embedded_hal SPI feature simply does not check that the SCK clock has stopped before the transfer function returns (c.f. issue #264 of embedded-hal repo), so we must impose delay in-between SPI transfer calls and CS_n assertions/deassertions to observe the SPI spec and the electrical timing behaviour. To precisely control the timing, this PR adds a crate feature cortex-m-cpu to enable "instruction cycle"-based delays for Cortex-M platforms. Further discussion in the next section.
Eliminate delay_ns function pointer: no matter the platform supports cortex-m or not, one of the main caveats of having the user to supply delay_ns is that we cannot validate the function, and, more importantly, it complicates the API.
Keep embedded_hal DelayUs for device reset: this is specific to ENC424J600, and all the user needs to do is to pass a mutable reference of a DelayUs implementation to the reset function call, often just once per appplication. The timescale is μs and not ns.

Discussion

Point 1: ns-delays are unlikely to appear on external embedded_hal-based libraries

As long as we stick with the nanosecond timescale for CS_n assertion/deassertion, I believe we do not need to rely on external definitions or implementations of accurate delays in this timescale. In my reasoning, every application is free to choose a system clock at lower megahertzes while adopting our driver, giving little motivation for the maintainers of embedded_hal libaries to include an additional DelayNs trait for portability.

By reading some of their discussions (e.g. issue #63 of embedded-hal repo), I have also gathered that most use cases of nanosecond delays could be worked around with interrupts or tick timers, which are hardware-specific. So, instead of borrowing an external portable feature from the HAL or elsewhere, we should simply define our own nanosecond delay feature here, which is based on instruction cycles on Cortex-M.

There is also a specific case of the AsmDelay crate for Cortex-M. It is capable of implementing a theoretical DelayNs, but I have my scepticism because it currently takes into account the fact that cortex_m::asm::delay() could take 2 CPU cycles on M7 rather than 1, but that does not seem to describe the behaviour of all Cortex-M cores. On M4 platforms such as Booster, delay() would take 3-4 cycles so the current implementation of AsmDelay would produce even longer and less accurate delays, which violates my goal 1. I can go and discuss this with the maintainers.

Point 2: CS_n delays are only implemented for Cortex-M targets and optionally

In this PR, we only care about producing ns-delays on Cortex-M platforms. In reality, embedded-hal is used on many other CPUs as well, so my PR has some degree of extensibility for other platforms in the future, and they are likely to be based on consumption of CPU cycles on their ISA -- if we assume that the CPU clock is at hundreds of MHz. No delay pointers should be required to extend ns-delays on other CPUs.

Besides, in my opinion, by letting the user choose to enable the cortex-m-cpu crate feature, if the CPU clock is too low to produce ns-delays any way, they can omit the ns-delays since the time between SPI transfer() calls and CS_n assertions/deassertions would be in the order of microseconds.

This PR will improve timing on the SPI bus by achieving the following goals: 1. Shorten the overall duration of SPI transaction. 2. Reduce the complexity of user-defined parameters in the driver API. ## Goal 1: Shorten the SPI timing The following changes are relevant to reducing time required to perform any SPI transaction: 1. **Impose stricter NSS timing:** as long as we provide a margin when calling the delay function (e.g. as simple as rounding up the calculation result for the CPU cycle count), it is safe to match the delay duration in the code with the value on the datasheet as close as possible. 2. **Use fixed-length SPI opcodes:** as I commented on an issue ([link](https://git.m-labs.hk/M-Labs/ENC424J600/issues/9#issuecomment-2125)), ENC424J600 provides a specific set of fixed-length SPI instructions which does not require CS_n deassertion in-between, saving time delayed for meeting the timing requirements to toggle CS_n. As a result, [8 types of SPI commands](https://git.m-labs.hk/M-Labs/ENC424J600/src/commit/fbcc3778d27cfbeec7a1395c9b13a00c8a26af9a/src/spi.rs#L18-L27) can finish earlier. ## Goal 2: Simplify the API This goal is mainly a direct response to my questions raised in #9. The following changes are relevant to simplifying the driver API: 1. **Impose `cortex_m::asm::delay()` for ns-delays:** it is obvious that the embedded_hal SPI feature simply does not check that the SCK clock has stopped before the transfer function returns (c.f. [issue #264 of embedded-hal repo](https://github.com/rust-embedded/embedded-hal/issues/264)), so we must impose delay in-between SPI transfer calls and CS_n assertions/deassertions to observe the SPI spec and the electrical timing behaviour. To precisely control the timing, this PR adds a crate feature `cortex-m-cpu` to enable "instruction cycle"-based delays for Cortex-M platforms. Further discussion in the next section. 2. **Eliminate `delay_ns` function pointer:** no matter the platform supports cortex-m or not, one of the main caveats of having the user to supply `delay_ns` is that we cannot validate the function, and, more importantly, it complicates the API. 3. **Keep embedded_hal `DelayUs` for device reset:** this is specific to ENC424J600, and all the user needs to do is to pass a mutable reference of a `DelayUs` implementation to the reset function call, often just once per appplication. The timescale is μs and not ns. ## Discussion ### Point 1: ns-delays are unlikely to appear on external embedded_hal-based libraries As long as we stick with the nanosecond timescale for CS_n assertion/deassertion, I believe we do not need to rely on external definitions or implementations of accurate delays in this timescale. In my reasoning, every application is free to choose a system clock at lower megahertzes while adopting our driver, giving little motivation for the maintainers of embedded_hal libaries to include an additional `DelayNs` trait for portability. By reading some of their discussions (e.g. [issue #63 of embedded-hal repo](https://github.com/rust-embedded/embedded-hal/issues/63)), I have also gathered that most use cases of nanosecond delays could be worked around with interrupts or tick timers, which are hardware-specific. So, instead of borrowing an external portable feature from the HAL or elsewhere, we should simply define our own nanosecond delay feature here, which is based on instruction cycles on Cortex-M. There is also a specific case of the `AsmDelay` crate for Cortex-M. It is capable of implementing a theoretical `DelayNs`, but I have my scepticism because it currently takes into account the fact that [`cortex_m::asm::delay()` could take 2 CPU cycles on M7](https://github.com/rust-embedded/cortex-m/blob/7b66016c1d680f78f56403c056dbe6ca3ebc0e23/asm/inline.rs#L55-L58) rather than 1, but that does not seem to describe the behaviour of all Cortex-M cores. On M4 platforms such as Booster, `delay()` would take 3-4 cycles so the current implementation of `AsmDelay` would produce even longer and less accurate delays, which violates my goal 1. I can go and discuss this with the maintainers. ### Point 2: CS_n delays are only implemented for Cortex-M targets and optionally In this PR, we only care about producing ns-delays on Cortex-M platforms. In reality, embedded-hal is used on many other CPUs as well, so my PR has some degree of extensibility for other platforms in the future, and they are likely to be based on consumption of CPU cycles on their ISA -- if we assume that the CPU clock is at hundreds of MHz. No delay pointers should be required to extend ns-delays on other CPUs. Besides, in my opinion, by letting the user choose to enable the `cortex-m-cpu` crate feature, if the CPU clock is too low to produce ns-delays any way, they can omit the ns-delays since the time between SPI `transfer()` calls and CS_n assertions/deassertions would be in the order of microseconds.

harry added 9 commits 2021-06-23 12:57:25 +08:00

27ba42c4fb spi: Introduce certain 1 & 3-byte opcodes to replace reg read/writes

d05d7f91e2 spi: Simplify all reg reads/writes as rw_n()

35b7924431 spi: Add back NSS high delay conditionally based on opcode type

ec20970a50 spi: Impose stricter NSS timing

9de8d77a24 Restore embedded_hal DelayUs for device init

999ca5f08a spi: Replace delay_ns func pointer with delay on cortex-m

* SPI NSS nanosecond delays now only takes place on Cortex-M CPUs

d8b1132b8a examples: Replace delay_ns with cortex-m delay

bb6824b944 nal: Remove delay_ns func pointer

fbcc3778d2 spi: Always delay for NSS setup time

* Fixes the lack of setup time delay of an fixed-byte command that immediately follows an N-byte command.
* Extra 50ns will be inserted in-between fixed-byte commands but should be acceptable.

sb10q merged commit fbcc3778d2 into master

2021-06-23 13:29:36 +08:00

harry referenced this pull request

2021-06-25 11:56:59 +08:00

Can SPI chip select timing be ensured without delay_ns? #9

Sign in to join this conversation.

No reviewers

No Label

No Milestone

No project

No Assignees

1 Participants

Notifications

Due Date

The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: renet/ENC424J600#11