Merge remote-tracking branch 'origin/master' into lockin-bin

* origin/master:
  Updating timer compare offsets
  Removing dac isr clear
  Adding information about DSP timing specifications
  Specifying consequences of failing to meet timing
  Updating dependencies
  Apply suggestions from code review
  Adding documentation about double-buffered mode to DACs
  Updating DACs to utilize DBM
  Adding safety documentation
  Adding docs
  Updating DAC output format, adding DDS stream docs
  Adding documentation for ADCs and DACs
master
Robert Jördens 2021-01-19 11:12:50 +01:00
commit 720b143132
5 changed files with 231 additions and 60 deletions

View File

@ -14,6 +14,9 @@ break DefaultHandler
break HardFault
break rust_begin_unwind
source ../../PyCortexMDebug/cmdebug/svd_gdb.py
svd_load ~/Downloads/STM32H743x.svd
load
# tbreak cortex_m_rt::reset_handler
monitor reset halt

View File

@ -1,22 +1,77 @@
///! Stabilizer ADC management interface
///!
///! The Stabilizer ADCs utilize three DMA channels each: one to trigger sampling, one to collect
///! samples, and one to clear the EOT flag betwen samples. The SPI interfaces are configured
///! for receiver-only operation. A timer channel is
///! configured to generate a DMA write into the SPI CR1 register, which initiates a SPI transfer and
///! results in a single ADC sample read for both channels. A separate timer channel is configured to
///! occur immediately before the trigger channel, which initiates a write to the IFCR (flag-clear)
///! register to clear the EOT flag, which allows for a new transmission to be generated by the
///! trigger channel.
///! # Design
///!
///! In order to read multiple samples without interrupting the CPU, a separate DMA transfer is
///! configured to read from each of the ADC SPI RX FIFOs. Due to the design of the SPI peripheral,
///! these DMA transfers stall when no data is available in the FIFO. Thus, the DMA transfer only
///! completes after all samples have been read. When this occurs, a CPU interrupt is generated so
///! that software can process the acquired samples from both ADCs. Only one of the ADC DMA streams
///! is configured to generate an interrupt to handle both transfers, so it is necessary to ensure
///! both transfers are completed before reading the data. This is usually not significant for
///! busy-waiting because the transfers should complete at approximately the same time.
///! Stabilizer ADCs are connected to the MCU via a simplex, SPI-compatible interface. The ADCs
///! require a setup conversion time after asserting the CSn (convert) signal to generate the ADC
///! code from the sampled level. Once the setup time has elapsed, the ADC data is clocked out of
///! MISO. The internal setup time is managed by the SPI peripheral via a CSn setup time parameter
///! during SPI configuration, which allows offloading the management of the setup time to hardware.
///!
///! Because of the SPI-compatibility of the ADCs, a single SPI peripheral + DMA is used to automate
///! the collection of multiple ADC samples without requiring processing by the CPU, which reduces
///! overhead and provides the CPU with more time for processing-intensive tasks, like DSP.
///!
///! The automation of sample collection utilizes three DMA streams, the SPI peripheral, and two
///! timer compare channel for each ADC. One timer comparison channel is configured to generate a
///! comparison event every time the timer is equal to a specific value. Each comparison then
///! generates a DMA transfer event to write into the SPI CR1 register to initiate the transfer.
///! This allows the SPI interface to periodically read a single sample. The other timer comparison
///! channel is configured to generate a comparison event slightly before the first (~10 timer
///! cycles). This channel triggers a separate DMA stream to clear the EOT flag within the SPI
///! peripheral. The EOT flag must be cleared after each transfer or the SPI peripheral will not
///! properly complete the single conversion. Thus, by using two DMA streams and timer comparison
///! channels, the SPI can regularly acquire ADC samples.
///!
///! In order to collect the acquired ADC samples into a RAM buffer, a final DMA transfer is
///! configured to read from the SPI RX FIFO into RAM. The request for this transfer is connected to
///! the SPI RX data signal, so the SPI peripheral will request to move data into RAM whenever it is
///! available. When enough samples have been collected, a transfer-complete interrupt is generated
///! and the ADC samples are available for processing.
///!
///! The SPI peripheral internally has an 8- or 16-byte TX and RX FIFO, which corresponds to a 4- or
///! 8-sample buffer for incoming ADC samples. During the handling of the DMA transfer completion,
///! there is a small window where buffers are swapped over where it's possible that a sample could
///! be lost. In order to avoid this, the SPI RX FIFO is effectively used as a "sample overflow"
///! region and can buffer a number of samples until the next DMA transfer is configured. If a DMA
///! transfer is still not set in time, the SPI peripheral will generate an input-overrun interrupt.
///! This interrupt then serves as a means of detecting if samples have been lost, which will occur
///! whenever data processing takes longer than the collection period.
///!
///!
///! ## Starting Data Collection
///!
///! Because the DMA data collection is automated via timer count comparisons and DMA transfers, the
///! ADCs can be initialized and configured, but will not begin sampling the external ADCs until the
///! sampling timer is enabled. As such, the sampling timer should be enabled after all
///! initialization has completed and immediately before the embedded processing loop begins.
///!
///!
///! ## Batch Sizing
///!
///! The ADCs collect a group of N samples, which is referred to as a batch. The size of the batch
///! is configured by the user at compile-time to allow for a custom-tailored implementation. Larger
///! batch sizes generally provide for lower overhead and more processing time per sample, but come
///! at the expense of increased input -> output latency.
///!
///!
///! # Note
///!
///! While there are two ADCs, only a single ADC is configured to generate transfer-complete
///! interrupts. This is done because it is assumed that the ADCs will always be sampled
///! simultaneously. If only a single ADC is used, it must always be ADC0, as ADC1 will not generate
///! transfer-complete interrupts.
///!
///! There is a very small amount of latency between sampling of ADCs due to bus matrix priority. As
///! such, one of the ADCs will be sampled marginally earlier before the other because the DMA
///! requests are generated simultaneously. This can be avoided by providing a known offset to the
///! sample DMA requests, which can be completed by setting e.g. ADC0's comparison to a counter
///! value of 0 and ADC1's comparison to a counter value of 1.
///!
///! In this implementation, single buffer mode DMA transfers are used because the SPI RX FIFO can
///! be used as a means to both detect and buffer ADC samples during the buffer swap-over. Because
///! of this, double-buffered mode does not offer any advantages over single-buffered mode (unless
///! double-buffered mode offers less overhead due to the DMA disable/enable procedure).
use super::{
hal, timers, DMAReq, DmaConfig, MemoryToPeripheral, PeripheralToMemory,
Priority, TargetAddress, Transfer, SAMPLE_BUFFER_SIZE,
@ -191,7 +246,7 @@ macro_rules! adc_input {
// Generate DMA events when an output compare of the timer hits the specified
// value.
trigger_channel.listen_dma();
trigger_channel.to_output_compare(2);
trigger_channel.to_output_compare(2 + $index);
// The trigger stream constantly writes to the SPI CR1 using a static word
// (which is a static value to enable the SPI transfer). Thus, neither the

View File

@ -1,8 +1,55 @@
///! Stabilizer DAC management interface
///!
///! The Stabilizer DAC utilize a DMA channel to generate output updates. A timer channel is
///! configured to generate a DMA write into the SPI TXFIFO, which initiates a SPI transfer and
///! results in DAC update for both channels.
///! # Design
///!
///! Stabilizer DACs are connected to the MCU via a simplex, SPI-compatible interface. Each DAC
///! accepts a 16-bit output code.
///!
///! In order to maximize CPU processing time, the DAC code updates are offloaded to hardware using
///! a timer compare channel, DMA stream, and the DAC SPI interface.
///!
///! The timer comparison channel is configured to generate a DMA request whenever the comparison
///! occurs. Thus, whenever a comparison happens, a single DAC code can be written to the output. By
///! configuring a DMA stream for a number of successive DAC codes, hardware can regularly update
///! the DAC without requiring the CPU.
///!
///! In order to ensure alignment between the ADC sample batches and DAC output code batches, a DAC
///! output batch is always exactly 3 batches after the ADC batch that generated it.
///!
///! The DMA transfer for the DAC output codes utilizes a double-buffer mode to avoid losing any
///! transfer events generated by the timer (for example, when 2 update cycles occur before the DMA
///! transfer completion is handled). In this mode, by the time DMA swaps buffers, there is always a valid buffer in the
///! "next-transfer" double-buffer location for the DMA transfer. Once a transfer completes,
///! software then has exactly one batch duration to fill the next buffer before its
///! transfer begins. If software does not meet this deadline, old data will be repeatedly generated
///! on the output and output will be shifted by one batch.
///!
///! ## Multiple Samples to Single DAC Codes
///!
///! For some applications, it may be desirable to generate a single DAC code from multiple ADC
///! samples. In order to maintain timing characteristics between ADC samples and DAC code outputs,
///! applications are required to generate one DAC code for each ADC sample. To accomodate mapping
///! multiple inputs to a single output, the output code can be repeated a number of times in the
///! output buffer corresponding with the number of input samples that were used to generate it.
///!
///!
///! # Note
///!
///! There is a very small amount of latency between updating the two DACs due to bus matrix
///! priority. As such, one of the DACs will be updated marginally earlier before the other because
///! the DMA requests are generated simultaneously. This can be avoided by providing a known offset
///! to other DMA requests, which can be completed by setting e.g. DAC0's comparison to a
///! counter value of 2 and DAC1's comparison to a counter value of 3. This will have the effect of
///! generating the DAC updates with a known latency of 1 timer tick to each other and prevent the
///! DMAs from racing for the bus. As implemented, the DMA channels utilize natural priority of the
///! DMA channels to arbitrate which transfer occurs first.
///!
///!
///! # Limitations
///!
///! While double-buffered mode is used for DMA to avoid lost DAC-update events, there is no check
///! for re-use of a previously provided DAC output buffer. It is assumed that the DMA request is
///! served promptly after the transfer completes.
use super::{
hal, timers, DMAReq, DmaConfig, MemoryToPeripheral, TargetAddress,
Transfer, SAMPLE_BUFFER_SIZE,
@ -13,8 +60,8 @@ use super::{
// processed). Note that the contents of AXI SRAM is uninitialized, so the buffer contents on
// startup are undefined. The dimensions are `ADC_BUF[adc_index][ping_pong_index][sample_index]`.
#[link_section = ".axisram.buffers"]
static mut DAC_BUF: [[[u16; SAMPLE_BUFFER_SIZE]; 2]; 2] =
[[[0; SAMPLE_BUFFER_SIZE]; 2]; 2];
static mut DAC_BUF: [[[u16; SAMPLE_BUFFER_SIZE]; 3]; 2] =
[[[0; SAMPLE_BUFFER_SIZE]; 3]; 2];
macro_rules! dac_output {
($name:ident, $index:literal, $data_stream:ident,
@ -32,6 +79,16 @@ macro_rules! dac_output {
) -> Self {
Self { _channel, spi }
}
/// Start the SPI and begin operating in a DMA-driven transfer mode.
pub fn start_dma(&mut self) {
// Allow the SPI FIFOs to operate using only DMA data channels.
self.spi.enable_dma_tx();
// Enable SPI and start it in infinite transaction mode.
self.spi.inner().cr1.modify(|_, w| w.spe().set_bit());
self.spi.inner().cr1.modify(|_, w| w.cstart().started());
}
}
// Note(unsafe): This is safe because the DMA request line is logically owned by this module.
@ -60,7 +117,6 @@ macro_rules! dac_output {
MemoryToPeripheral,
&'static mut [u16; SAMPLE_BUFFER_SIZE],
>,
first_transfer: bool,
}
impl $name {
@ -78,11 +134,12 @@ macro_rules! dac_output {
// Generate DMA events when an output compare of the timer hitting zero (timer roll over)
// occurs.
trigger_channel.listen_dma();
trigger_channel.to_output_compare(0);
trigger_channel.to_output_compare(4 + $index);
// The stream constantly writes to the TX FIFO to write new update codes.
let trigger_config = DmaConfig::default()
.memory_increment(true)
.double_buffer(true)
.peripheral_increment(false);
// Listen for any potential SPI error signals, which may indicate that we are not generating
@ -90,64 +147,53 @@ macro_rules! dac_output {
let mut spi = spi.disable();
spi.listen(hal::spi::Event::Error);
// Allow the SPI FIFOs to operate using only DMA data channels.
spi.enable_dma_tx();
// Enable SPI and start it in infinite transaction mode.
spi.inner().cr1.modify(|_, w| w.spe().set_bit());
spi.inner().cr1.modify(|_, w| w.cstart().started());
// AXISRAM is uninitialized. As such, we manually zero-initialize it here before
// starting the transfer.
// Note(unsafe): We currently own all DAC_BUF[index] buffers and are not using them
// elsewhere, so it is safe to access them here.
for buf in unsafe { DAC_BUF[$index].iter_mut() } {
for byte in buf.iter_mut() {
*byte = 0;
}
}
// Construct the trigger stream to write from memory to the peripheral.
let transfer: Transfer<_, _, MemoryToPeripheral, _> =
let mut transfer: Transfer<_, _, MemoryToPeripheral, _> =
Transfer::init(
stream,
$spi::new(trigger_channel, spi),
// Note(unsafe): This buffer is only used once and provided for the DMA transfer.
unsafe { &mut DAC_BUF[$index][0] },
None,
// Note(unsafe): This buffer is only used once and provided for the DMA transfer.
unsafe { Some(&mut DAC_BUF[$index][1]) },
trigger_config,
);
transfer.start(|spi| spi.start_dma());
Self {
transfer,
// Note(unsafe): This buffer is only used once and provided for the next DMA transfer.
next_buffer: unsafe { Some(&mut DAC_BUF[$index][1]) },
first_transfer: true,
next_buffer: unsafe { Some(&mut DAC_BUF[$index][2]) },
}
}
/// Acquire the next output buffer to populate it with DAC codes.
pub fn acquire_buffer(
&mut self,
) -> &'static mut [u16; SAMPLE_BUFFER_SIZE] {
self.next_buffer.take().unwrap()
}
pub fn acquire_buffer(&mut self) -> &mut [u16; SAMPLE_BUFFER_SIZE] {
// Note: If a device hangs up, check that this conditional is passing correctly, as
// there is no time-out checks here in the interest of execution speed.
while !self.transfer.get_transfer_complete_flag() {}
/// Enqueue the next buffer for transmission to the DAC.
///
/// # Args
/// * `data` - The next data to write to the DAC.
pub fn release_buffer(
&mut self,
next_buffer: &'static mut [u16; SAMPLE_BUFFER_SIZE],
) {
// If the last transfer was not complete, we didn't write all our previous DAC codes.
// Wait for all the DAC codes to get written as well.
if self.first_transfer {
self.first_transfer = false
} else {
// Note: If a device hangs up, check that this conditional is passing correctly, as
// there is no time-out checks here in the interest of execution speed.
while !self.transfer.get_transfer_complete_flag() {}
}
let next_buffer = self.next_buffer.take().unwrap();
// Start the next transfer.
self.transfer.clear_interrupts();
let (prev_buffer, _, _) =
self.transfer.next_transfer(next_buffer).unwrap();
// .unwrap_none() https://github.com/rust-lang/rust/issues/62633
self.next_buffer.replace(prev_buffer);
self.next_buffer.as_mut().unwrap()
}
}
};

View File

@ -977,6 +977,22 @@ const APP: () = {
}
}
/// Main DSP processing routine for Stabilizer.
///
/// # Note
/// Processing time for the DSP application code is bounded by the following constraints:
///
/// DSP application code starts after the ADC has generated a batch of samples and must be
/// completed by the time the next batch of ADC samples has been acquired (plus the FIFO buffer
/// time). If this constraint is not met, firmware will panic due to an ADC input overrun.
///
/// The DSP application code must also fill out the next DAC output buffer in time such that the
/// DAC can switch to it when it has completed the current buffer. If this constraint is not met
/// it's possible that old DAC codes will be generated on the output and the output samples will
/// be delayed by 1 batch.
///
/// Because the ADC and DAC operate at the same rate, these two constraints actually implement
/// the same time bounds, meeting one also means the other is also met.
#[task(binds=DMA1_STR4, resources=[pounder_stamper, adcs, dacs, iir_state, iir_ch, dds_output, input_stamper, timestamp_handler, iir_lockin, iir_state_lockin], priority=2)]
fn process(c: process::Context) {
if let Some(stamper) = c.resources.pounder_stamper {
@ -988,6 +1004,7 @@ const APP: () = {
c.resources.adcs.0.acquire_buffer(),
c.resources.adcs.1.acquire_buffer(),
];
let dac_samples = [
c.resources.dacs.0.acquire_buffer(),
c.resources.dacs.1.acquire_buffer(),
@ -1053,9 +1070,6 @@ const APP: () = {
builder.write_profile();
}
c.resources.dacs.0.release_buffer(dac0);
c.resources.dacs.1.release_buffer(dac1);
}
#[idle(resources=[net_interface, pounder, mac_addr, eth_mac, iir_state, iir_ch, afes])]

View File

@ -1,4 +1,57 @@
///! The DdsOutput is used as an output stream to the pounder DDS.
///!
///! # Design
///!
///! The DDS stream interface is a means of quickly updating pounder DDS (direct digital synthesis)
///! outputs of the AD9959 DDS chip. The DDS communicates via a quad-SPI interface and a single
///! IO-update output pin.
///!
///! In order to update the DDS interface, the frequency tuning word, amplitude control word, and
///! the phase offset word for a channel can be modified to change the frequency, amplitude, or
///! phase on any of the 4 available output channels. Changes do not propagate to DDS outputs until
///! the IO-update pin is toggled high to activate the new configurations. This allows multiple
///! channels or parameters to be updated and then effects can take place simultaneously.
///!
///! In this implementation, the phase, frequency, or amplitude can be updated for any single
///! collection of outputs simultaneously. This is done by serializing the register writes to the
///! DDS into a single buffer of data and then writing the data over QSPI to the DDS.
///!
///! In order to minimize software overhead, data is written directly into the QSPI output FIFO. In
///! order to accomplish this most efficiently, serialized data is written as 32-bit words to
///! minimize the number of bus cycles necessary to write to the peripheral FIFO. A consequence of
///! this is that additional unneeded register writes may be appended to align a transfer to 32-bit
///! word sizes.
///!
///! In order to pulse the IO-update signal, the high-resolution timer output is used. The timer is
///! configured to assert the IO-update signal after a predefined delay and then de-assert the
///! signal after a predefined assertion duration. This allows for the actual QSPI transfer and
///! IO-update toggle to be completed asynchronously to the rest of software processing - that is,
///! software can schedule the DDS updates and then continue data processing. DDS updates then take
///! place in the future when the IO-update is toggled by hardware.
///!
///!
///! # Limitations
///!
///! The QSPI output FIFO is used as an intermediate buffer for holding pending QSPI writes. Because
///! of this, the implementation only supports up to 16 serialized bytes (the QSPI FIFO is 4 32-bit
///! words wide) in a single update.
///!
///! There is currently no synchronization between completion of the QSPI data write and the
///! IO-update signal. It is currently assumed that the QSPI transfer will always complete within a
///! predefined delay (the pre-programmed IO-update timer delay).
///!
///!
///! # Future Improvement
///!
///! In the future, it would be possible to utilize a DMA transfer to complete the QSPI transfer.
///! Once the QSPI transfer completed, this could trigger the IO-update timer to start to
///! asynchronously complete IO-update automatically. This would allow for arbitrary profile sizes
///! and ensure that IO-update was in-sync with the QSPI transfer.
///!
///! Currently, serialization is performed on each processing cycle. If there is a
///! compile-time-known register update sequence needed for the application, the serialization
///! process can be done once and then register values can be written into a pre-computed serialized
///! buffer to avoid the software overhead of much of the serialization process.
use super::QspiInterface;
use crate::hrtimer::HighResTimerE;
use ad9959::{Channel, DdsConfig, ProfileSerializer};