Merge pull request #208 from vertigo-designs/feature/io-docs
Adding documentation, updating DAC output timing
commit
d447501c47
@@ -14,6 +14,9 @@ break DefaultHandler
break HardFault
break rust_begin_unwind

source ../../PyCortexMDebug/cmdebug/svd_gdb.py
svd_load ~/Downloads/STM32H743x.svd

load
# tbreak cortex_m_rt::reset_handler
monitor reset halt
89
src/adc.rs
@@ -1,22 +1,77 @@
///! Stabilizer ADC management interface
///!
///! The Stabilizer ADCs utilize three DMA channels each: one to trigger sampling, one to collect
///! samples, and one to clear the EOT flag between samples. The SPI interfaces are configured
///! for receiver-only operation. A timer channel is
///! configured to generate a DMA write into the SPI CR1 register, which initiates a SPI transfer and
///! results in a single ADC sample read for both channels. A separate timer channel is configured to
///! occur immediately before the trigger channel, which initiates a write to the IFCR (flag-clear)
///! register to clear the EOT flag, which allows for a new transmission to be generated by the
///! trigger channel.
///! # Design
///!
///! In order to read multiple samples without interrupting the CPU, a separate DMA transfer is
///! configured to read from each of the ADC SPI RX FIFOs. Due to the design of the SPI peripheral,
///! these DMA transfers stall when no data is available in the FIFO. Thus, the DMA transfer only
///! completes after all samples have been read. When this occurs, a CPU interrupt is generated so
///! that software can process the acquired samples from both ADCs. Only one of the ADC DMA streams
///! is configured to generate an interrupt to handle both transfers, so it is necessary to ensure
///! both transfers are completed before reading the data. This is usually not significant for
///! busy-waiting because the transfers should complete at approximately the same time.
///! Stabilizer ADCs are connected to the MCU via a simplex, SPI-compatible interface. The ADCs
///! require a setup conversion time after asserting the CSn (convert) signal to generate the ADC
///! code from the sampled level. Once the setup time has elapsed, the ADC data is clocked out of
///! MISO. The internal setup time is managed by the SPI peripheral via a CSn setup time parameter
///! during SPI configuration, which allows offloading the management of the setup time to hardware.
///!
///! Because of the SPI-compatibility of the ADCs, a single SPI peripheral + DMA is used to automate
///! the collection of multiple ADC samples without requiring processing by the CPU, which reduces
///! overhead and provides the CPU with more time for processing-intensive tasks, like DSP.
///!
///! The automation of sample collection utilizes three DMA streams, the SPI peripheral, and two
///! timer compare channels for each ADC. One timer comparison channel is configured to generate a
///! comparison event every time the timer is equal to a specific value. Each comparison then
///! generates a DMA transfer event to write into the SPI CR1 register to initiate the transfer.
///! This allows the SPI interface to periodically read a single sample. The other timer comparison
///! channel is configured to generate a comparison event slightly before the first (~10 timer
///! cycles). This channel triggers a separate DMA stream to clear the EOT flag within the SPI
///! peripheral. The EOT flag must be cleared after each transfer or the SPI peripheral will not
///! properly complete the single conversion. Thus, by using two DMA streams and timer comparison
///! channels, the SPI can regularly acquire ADC samples.
///!
///! In order to collect the acquired ADC samples into a RAM buffer, a final DMA transfer is
///! configured to read from the SPI RX FIFO into RAM. The request for this transfer is connected to
///! the SPI RX data signal, so the SPI peripheral will request to move data into RAM whenever it is
///! available. When enough samples have been collected, a transfer-complete interrupt is generated
///! and the ADC samples are available for processing.
///!
///! The SPI peripheral internally has an 8- or 16-byte TX and RX FIFO, which corresponds to a 4- or
///! 8-sample buffer for incoming ADC samples. During the handling of the DMA transfer completion,
///! there is a small window while the buffers are swapped in which a sample could be lost. In
///! order to avoid this, the SPI RX FIFO is effectively used as a "sample overflow" region and can
///! buffer a number of samples until the next DMA transfer is configured. If a DMA transfer is
///! still not set up in time, the SPI peripheral will generate an input-overrun interrupt. This
///! interrupt then serves as a means of detecting if samples have been lost, which will occur
///! whenever data processing takes longer than the collection period.
///!
///!
///! ## Starting Data Collection
///!
///! Because the DMA data collection is automated via timer count comparisons and DMA transfers, the
///! ADCs can be initialized and configured, but will not begin sampling the external ADCs until the
///! sampling timer is enabled. As such, the sampling timer should be enabled after all
///! initialization has completed and immediately before the embedded processing loop begins.
///!
///!
///! ## Batch Sizing
///!
///! The ADCs collect a group of N samples, which is referred to as a batch. The size of the batch
///! is configured by the user at compile-time to allow for a custom-tailored implementation. Larger
///! batch sizes generally provide for lower overhead and more processing time per sample, but come
///! at the expense of increased input -> output latency.
///!
///!
///! # Note
///!
///! While there are two ADCs, only a single ADC is configured to generate transfer-complete
///! interrupts. This is done because it is assumed that the ADCs will always be sampled
///! simultaneously. If only a single ADC is used, it must always be ADC0, as ADC1 will not generate
///! transfer-complete interrupts.
///!
///! There is a very small amount of latency between sampling of ADCs due to bus matrix priority. As
///! such, one of the ADCs will be sampled marginally earlier than the other because the DMA
///! requests are generated simultaneously. This can be avoided by providing a known offset to the
///! sample DMA requests, which can be done by setting e.g. ADC0's comparison to a counter
///! value of 0 and ADC1's comparison to a counter value of 1.
///!
///! In this implementation, single buffer mode DMA transfers are used because the SPI RX FIFO can
///! be used as a means to both detect and buffer ADC samples during the buffer swap-over. Because
///! of this, double-buffered mode does not offer any advantages over single-buffered mode (unless
///! double-buffered mode offers less overhead due to the DMA disable/enable procedure).
use super::{
hal, timers, DMAReq, DmaConfig, MemoryToPeripheral, PeripheralToMemory,
Priority, TargetAddress, Transfer, SAMPLE_BUFFER_SIZE,
@@ -191,7 +246,7 @@ macro_rules! adc_input {
// Generate DMA events when an output compare of the timer hits the specified
// value.
trigger_channel.listen_dma();
trigger_channel.to_output_compare(2);
trigger_channel.to_output_compare(2 + $index);

// The trigger stream constantly writes to the SPI CR1 using a static word
// (which is a static value to enable the SPI transfer). Thus, neither the
|
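The batch-size trade-off and the SPI RX FIFO "sample overflow" window described in the ADC module docs can be put into numbers. The following is a minimal sketch and not code from this commit: the function names are hypothetical, time is expressed in timer ticks, and the slack estimate is deliberately conservative (it assumes an overrun as soon as the FIFO is full).

```rust
/// Ticks needed to collect one batch of `batch_size` samples when one sample
/// is triggered every `sample_period_ticks` timer ticks.
pub const fn batch_period_ticks(sample_period_ticks: u32, batch_size: u32) -> u32 {
    sample_period_ticks * batch_size
}

/// Number of 16-bit ADC samples that fit into an SPI RX FIFO of `fifo_bytes` bytes.
pub const fn fifo_samples(fifo_bytes: u32) -> u32 {
    fifo_bytes / 2
}

/// Conservative estimate of how long software may take to re-arm the data DMA
/// transfer after a batch completes before samples are lost: the FIFO absorbs
/// one sample per sample period until it is full.
pub const fn rearm_slack_ticks(sample_period_ticks: u32, fifo_bytes: u32) -> u32 {
    fifo_samples(fifo_bytes) * sample_period_ticks
}
```

With an assumed sample period of 200 ticks and a batch of 8 samples, a batch takes 1600 ticks to collect, and an 8-byte FIFO (4 samples) leaves roughly 800 ticks to configure the next transfer. A larger batch lengthens the first figure (more latency) without changing the second.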
126
src/dac.rs
@@ -1,8 +1,55 @@
///! Stabilizer DAC management interface
///!
///! The Stabilizer DACs utilize a DMA channel to generate output updates. A timer channel is
///! configured to generate a DMA write into the SPI TXFIFO, which initiates a SPI transfer and
///! results in a DAC update for both channels.
///! # Design
///!
///! Stabilizer DACs are connected to the MCU via a simplex, SPI-compatible interface. Each DAC
///! accepts a 16-bit output code.
///!
///! In order to maximize CPU processing time, the DAC code updates are offloaded to hardware using
///! a timer compare channel, DMA stream, and the DAC SPI interface.
///!
///! The timer comparison channel is configured to generate a DMA request whenever the comparison
///! occurs. Thus, whenever a comparison happens, a single DAC code can be written to the output. By
///! configuring a DMA stream for a number of successive DAC codes, hardware can regularly update
///! the DAC without requiring the CPU.
///!
///! In order to ensure alignment between the ADC sample batches and DAC output code batches, a DAC
///! output batch is always exactly 3 batches after the ADC batch that generated it.
///!
///! The DMA transfer for the DAC output codes utilizes a double-buffer mode to avoid losing any
///! transfer events generated by the timer (for example, when 2 update cycles occur before the DMA
///! transfer completion is handled). In this mode, by the time DMA swaps buffers, there is always
///! a valid buffer in the "next-transfer" double-buffer location for the DMA transfer. Once a
///! transfer completes, software then has exactly one batch duration to fill the next buffer
///! before its transfer begins. If software does not meet this deadline, old data will be
///! repeatedly generated on the output and the output will be shifted by one batch.
///!
///! ## Multiple Samples to Single DAC Codes
///!
///! For some applications, it may be desirable to generate a single DAC code from multiple ADC
///! samples. In order to maintain timing characteristics between ADC samples and DAC code outputs,
///! applications are required to generate one DAC code for each ADC sample. To accommodate mapping
///! multiple inputs to a single output, the output code can be repeated a number of times in the
///! output buffer corresponding with the number of input samples that were used to generate it.
///!
///!
///! # Note
///!
///! There is a very small amount of latency between updating the two DACs due to bus matrix
///! priority. As such, one of the DACs will be updated marginally earlier than the other because
///! the DMA requests are generated simultaneously. This can be avoided by providing a known offset
///! to the DMA requests, which can be done by setting e.g. DAC0's comparison to a
///! counter value of 2 and DAC1's comparison to a counter value of 3. This will have the effect of
///! generating the DAC updates with a known latency of 1 timer tick to each other and prevent the
///! DMAs from racing for the bus. As implemented, the natural priority of the DMA channels
///! arbitrates which transfer occurs first.
///!
///!
///! # Limitations
///!
///! While double-buffered mode is used for DMA to avoid lost DAC-update events, there is no check
///! for re-use of a previously provided DAC output buffer. It is assumed that the DMA request is
///! served promptly after the transfer completes.
use super::{
hal, timers, DMAReq, DmaConfig, MemoryToPeripheral, TargetAddress,
Transfer, SAMPLE_BUFFER_SIZE,
@@ -13,8 +60,8 @@ use super::{
// processed). Note that the contents of AXI SRAM is uninitialized, so the buffer contents on
// startup are undefined. The dimensions are `ADC_BUF[adc_index][ping_pong_index][sample_index]`.
#[link_section = ".axisram.buffers"]
static mut DAC_BUF: [[[u16; SAMPLE_BUFFER_SIZE]; 2]; 2] =
[[[0; SAMPLE_BUFFER_SIZE]; 2]; 2];
static mut DAC_BUF: [[[u16; SAMPLE_BUFFER_SIZE]; 3]; 2] =
[[[0; SAMPLE_BUFFER_SIZE]; 3]; 2];

macro_rules! dac_output {
($name:ident, $index:literal, $data_stream:ident,
@@ -32,6 +79,16 @@ macro_rules! dac_output {
) -> Self {
Self { _channel, spi }
}

/// Start the SPI and begin operating in a DMA-driven transfer mode.
pub fn start_dma(&mut self) {
// Allow the SPI FIFOs to operate using only DMA data channels.
self.spi.enable_dma_tx();

// Enable SPI and start it in infinite transaction mode.
self.spi.inner().cr1.modify(|_, w| w.spe().set_bit());
self.spi.inner().cr1.modify(|_, w| w.cstart().started());
}
}

// Note(unsafe): This is safe because the DMA request line is logically owned by this module.
@@ -60,7 +117,6 @@ macro_rules! dac_output {
MemoryToPeripheral,
&'static mut [u16; SAMPLE_BUFFER_SIZE],
>,
first_transfer: bool,
}

impl $name {
@@ -78,11 +134,12 @@ macro_rules! dac_output {
// Generate DMA events when an output compare of the timer hitting zero (timer roll over)
// occurs.
trigger_channel.listen_dma();
trigger_channel.to_output_compare(0);
trigger_channel.to_output_compare(4 + $index);

// The stream constantly writes to the TX FIFO to write new update codes.
let trigger_config = DmaConfig::default()
.memory_increment(true)
.double_buffer(true)
.peripheral_increment(false);

// Listen for any potential SPI error signals, which may indicate that we are not generating
@@ -90,64 +147,53 @@ macro_rules! dac_output {
let mut spi = spi.disable();
spi.listen(hal::spi::Event::Error);

// Allow the SPI FIFOs to operate using only DMA data channels.
spi.enable_dma_tx();

// Enable SPI and start it in infinite transaction mode.
spi.inner().cr1.modify(|_, w| w.spe().set_bit());
spi.inner().cr1.modify(|_, w| w.cstart().started());
// AXISRAM is uninitialized. As such, we manually zero-initialize it here before
// starting the transfer.
// Note(unsafe): We currently own all DAC_BUF[index] buffers and are not using them
// elsewhere, so it is safe to access them here.
for buf in unsafe { DAC_BUF[$index].iter_mut() } {
for byte in buf.iter_mut() {
*byte = 0;
}
}

// Construct the trigger stream to write from memory to the peripheral.
let transfer: Transfer<_, _, MemoryToPeripheral, _> =
let mut transfer: Transfer<_, _, MemoryToPeripheral, _> =
Transfer::init(
stream,
$spi::new(trigger_channel, spi),
// Note(unsafe): This buffer is only used once and provided for the DMA transfer.
unsafe { &mut DAC_BUF[$index][0] },
None,
// Note(unsafe): This buffer is only used once and provided for the DMA transfer.
unsafe { Some(&mut DAC_BUF[$index][1]) },
trigger_config,
);

transfer.start(|spi| spi.start_dma());

Self {
transfer,
// Note(unsafe): This buffer is only used once and provided for the next DMA transfer.
next_buffer: unsafe { Some(&mut DAC_BUF[$index][1]) },
first_transfer: true,
next_buffer: unsafe { Some(&mut DAC_BUF[$index][2]) },
}
}

/// Acquire the next output buffer to populate it with DAC codes.
pub fn acquire_buffer(
&mut self,
) -> &'static mut [u16; SAMPLE_BUFFER_SIZE] {
self.next_buffer.take().unwrap()
}
pub fn acquire_buffer(&mut self) -> &mut [u16; SAMPLE_BUFFER_SIZE] {
// Note: If a device hangs up, check that this conditional is passing correctly, as
// there are no time-out checks here in the interest of execution speed.
while !self.transfer.get_transfer_complete_flag() {}

/// Enqueue the next buffer for transmission to the DAC.
///
/// # Args
/// * `data` - The next data to write to the DAC.
pub fn release_buffer(
&mut self,
next_buffer: &'static mut [u16; SAMPLE_BUFFER_SIZE],
) {
// If the last transfer was not complete, we didn't write all our previous DAC codes.
// Wait for all the DAC codes to get written as well.
if self.first_transfer {
self.first_transfer = false
} else {
// Note: If a device hangs up, check that this conditional is passing correctly, as
// there are no time-out checks here in the interest of execution speed.
while !self.transfer.get_transfer_complete_flag() {}
}
let next_buffer = self.next_buffer.take().unwrap();

// Start the next transfer.
self.transfer.clear_interrupts();
let (prev_buffer, _, _) =
self.transfer.next_transfer(next_buffer).unwrap();

// .unwrap_none() https://github.com/rust-lang/rust/issues/62633
self.next_buffer.replace(prev_buffer);

self.next_buffer.as_mut().unwrap()
}
}
};
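The three-buffer rotation that the new `acquire_buffer` performs (wait for transfer completion, hand the previously filled buffer to the DMA, take back the completed one) can be modelled without any hardware. The sketch below is a simplified model under stated assumptions, not the HAL's `Transfer` API: `BufferRotation` and its fields are hypothetical names, and the buffers are represented only by their index into `DAC_BUF[channel]`.

```rust
/// Simplified model of which of the three per-channel buffers is where.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufferRotation {
    /// Buffer the DMA is currently streaming to the DAC.
    pub active: usize,
    /// Buffer waiting in the "next-transfer" double-buffer slot.
    pub queued: usize,
    /// Buffer currently owned by software.
    pub software: usize,
}

impl BufferRotation {
    /// Initial assignment as in the constructor of the diff: buffers 0 and 1
    /// are given to the DMA transfer, buffer 2 is kept by software.
    pub fn new() -> Self {
        Self { active: 0, queued: 1, software: 2 }
    }

    /// Model one transfer-complete event followed by an `acquire_buffer` call.
    /// Returns the index of the buffer software may fill next.
    pub fn acquire(&mut self) -> usize {
        let completed = self.active;
        // On completion the DMA switches to the queued buffer.
        self.active = self.queued;
        // Software hands over the buffer it filled during the previous cycle.
        self.queued = self.software;
        // Software receives the completed buffer to fill with new codes.
        self.software = completed;
        self.software
    }
}
```

In this model a buffer returned by `acquire` is queued on the next call and becomes the active output buffer on the call after that, which is consistent with the fixed batch offset between ADC input and DAC output described in the module docs.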
|
||||
|
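The "Multiple Samples to Single DAC Codes" rule from the DAC module docs (one DAC code per ADC sample, with a shared code repeated across its input group) can be sketched as follows. This is an illustration only: `fill_repeated` is a hypothetical helper, and the mean is merely a stand-in for whatever processing an application applies to a group of samples.

```rust
/// Fill `output` with one DAC code per input sample. Each consecutive group
/// of `group` input samples is reduced to a single code (here: the integer
/// mean, as a placeholder), which is then repeated for every sample of that
/// group so that input and output batches keep the same length and timing.
pub fn fill_repeated(input: &[u16], output: &mut [u16], group: usize) {
    assert!(group > 0, "group size must be non-zero");
    assert_eq!(input.len(), output.len(), "one DAC code per ADC sample");

    for (samples, codes) in input.chunks(group).zip(output.chunks_mut(group)) {
        let sum: u32 = samples.iter().map(|&s| u32::from(s)).sum();
        let code = (sum / samples.len() as u32) as u16;
        codes.fill(code);
    }
}
```

For a batch `[10, 20, 30, 50]` and a group size of 2, the output buffer becomes `[15, 15, 40, 40]`: two codes, each written twice.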
21
src/main.rs
@@ -949,6 +949,22 @@ const APP: () = {
}
}

/// Main DSP processing routine for Stabilizer.
///
/// # Note
/// Processing time for the DSP application code is bounded by the following constraints:
///
/// DSP application code starts after the ADC has generated a batch of samples and must be
/// completed by the time the next batch of ADC samples has been acquired (plus the FIFO buffer
/// time). If this constraint is not met, firmware will panic due to an ADC input overrun.
///
/// The DSP application code must also fill out the next DAC output buffer in time such that the
/// DAC can switch to it when it has completed the current buffer. If this constraint is not met
/// it's possible that old DAC codes will be generated on the output and the output samples will
/// be delayed by 1 batch.
///
/// Because the ADC and DAC operate at the same rate, these two constraints actually implement
/// the same time bounds: meeting one means the other is also met.
#[task(binds=DMA1_STR4, resources=[pounder_stamper, adcs, dacs, iir_state, iir_ch, dds_output, input_stamper], priority=2)]
fn process(c: process::Context) {
if let Some(stamper) = c.resources.pounder_stamper {
@@ -960,6 +976,7 @@ const APP: () = {
c.resources.adcs.0.acquire_buffer(),
c.resources.adcs.1.acquire_buffer(),
];

let dac_samples = [
c.resources.dacs.0.acquire_buffer(),
c.resources.dacs.1.acquire_buffer(),
@@ -993,10 +1010,6 @@ const APP: () = {

builder.write_profile();
}

let [dac0, dac1] = dac_samples;
c.resources.dacs.0.release_buffer(dac0);
c.resources.dacs.1.release_buffer(dac1);
}

#[idle(resources=[net_interface, pounder, mac_addr, eth_mac, iir_state, iir_ch, afes])]
@@ -1,4 +1,57 @@
///! The DdsOutput is used as an output stream to the pounder DDS.
///!
///! # Design
///!
///! The DDS stream interface is a means of quickly updating pounder DDS (direct digital synthesis)
///! outputs of the AD9959 DDS chip. The DDS communicates via a quad-SPI interface and a single
///! IO-update output pin.
///!
///! In order to update the DDS interface, the frequency tuning word, amplitude control word, and
///! the phase offset word for a channel can be modified to change the frequency, amplitude, or
///! phase on any of the 4 available output channels. Changes do not propagate to DDS outputs until
///! the IO-update pin is toggled high to activate the new configurations. This allows multiple
///! channels or parameters to be updated and then take effect simultaneously.
///!
///! In this implementation, the phase, frequency, or amplitude can be updated for any single
///! collection of outputs simultaneously. This is done by serializing the register writes to the
///! DDS into a single buffer of data and then writing the data over QSPI to the DDS.
///!
///! In order to minimize software overhead, data is written directly into the QSPI output FIFO. In
///! order to accomplish this most efficiently, serialized data is written as 32-bit words to
///! minimize the number of bus cycles necessary to write to the peripheral FIFO. A consequence of
///! this is that additional unneeded register writes may be appended to align a transfer to 32-bit
///! word sizes.
///!
///! In order to pulse the IO-update signal, the high-resolution timer output is used. The timer is
///! configured to assert the IO-update signal after a predefined delay and then de-assert the
///! signal after a predefined assertion duration. This allows for the actual QSPI transfer and
///! IO-update toggle to be completed asynchronously to the rest of software processing - that is,
///! software can schedule the DDS updates and then continue data processing. DDS updates then take
///! place in the future when the IO-update is toggled by hardware.
///!
///!
///! # Limitations
///!
///! The QSPI output FIFO is used as an intermediate buffer for holding pending QSPI writes. Because
///! of this, the implementation only supports up to 16 serialized bytes (the QSPI FIFO holds four
///! 32-bit words) in a single update.
///!
///! There is currently no synchronization between completion of the QSPI data write and the
///! IO-update signal. It is currently assumed that the QSPI transfer will always complete within a
///! predefined delay (the pre-programmed IO-update timer delay).
///!
///!
///! # Future Improvement
///!
///! In the future, it would be possible to utilize a DMA transfer to complete the QSPI transfer.
///! Once the QSPI transfer completes, it could trigger the IO-update timer so that the IO-update
///! is asserted automatically and asynchronously. This would allow for arbitrary profile sizes
///! and ensure that the IO-update is in sync with the QSPI transfer.
///!
///! Currently, serialization is performed on each processing cycle. If there is a
///! compile-time-known register update sequence needed for the application, the serialization
///! process can be done once and then register values can be written into a pre-computed serialized
///! buffer to avoid the software overhead of much of the serialization process.
use super::QspiInterface;
use crate::hrtimer::HighResTimerE;
use ad9959::{Channel, DdsConfig, ProfileSerializer};
|
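The word-alignment and FIFO-size limits described in the DDS module docs amount to simple arithmetic. The sketch below is a minimal illustration under stated assumptions: `QSPI_FIFO_BYTES`, `padded_len` and `fifo_words` are hypothetical names rather than part of the `ad9959` crate, and the 16-byte limit is taken from the documentation above. It only computes sizes and does not decide what the padding bytes contain.

```rust
/// Capacity of the QSPI output FIFO as stated in the module docs
/// (four 32-bit words).
pub const QSPI_FIFO_BYTES: usize = 16;

/// Round a serialized length in bytes up to a whole number of 32-bit words.
pub const fn padded_len(len: usize) -> usize {
    (len + 3) & !3
}

/// Number of 32-bit FIFO writes needed for a serialized update of `len`
/// bytes, or `None` if the padded update does not fit into the FIFO.
pub fn fifo_words(len: usize) -> Option<usize> {
    let padded = padded_len(len);
    if padded <= QSPI_FIFO_BYTES {
        Some(padded / 4)
    } else {
        None
    }
}
```

For example, a 5-byte serialized write is padded to 8 bytes and takes two FIFO writes, while anything longer than 16 bytes is rejected in this model.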