loop servo time step / PID autotune #11

Closed
opened 2020-12-05 07:59:29 +08:00 by jbqubit · 19 comments

There exist simple heuristics for setting PID parameters based on observation of loop oscillation; see for example the [Ziegler-Nichols method](https://en.wikipedia.org/wiki/Ziegler%E2%80%93Nichols_method). Applying this type of method requires knowledge of how the loop filter treats time. It would be helpful if the relationship between Ku and Tu were determined and added to README.md.

The time unit is the AD7172's sampling interval. I have omitted factoring in a time delta because the PID input is this hardware ADC, with supposedly very regular read intervals. Tell me if that is not practical and I'll implement it.

Just having the units for the gain coefficients (kp/ki/kd) documented would be helpful (Hz, s, …).

What is the ADC sample period?

The period in the equations I'm referring to is the servo loop period. Does the firmware include a mechanism for strictly controlling this period?

> Just having the units for the gain coefficients (kp/ki/kd) documented would be helpful (Hz, s, …).

| Coefficient | Unit |
| --- | --- |
| kp | 1 |
| ki | 10 Hz |
| kd | 0.1 s |

> What is the ADC sample period?

[10 Hz](https://git.m-labs.hk/M-Labs/thermostat/src/branch/master/src/ad7172/adc.rs#L93), i.e. a 100 ms sample period.

> The period in the equations I’m referring to is the servo loop period.

Conversion and output happen in the same main-loop iteration that reads the ADC sample for a channel.

> Does the firmware include a mechanism for strictly controlling this period?

No, which is according to specification. If required, we can make the rate configurable and add a more precise time delta to the calculations.
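One way to read the units listed above: if the integral and derivative terms are accumulated once per ADC sample with no time delta, then a per-sample integral gain of 1 corresponds to 10 Hz and a per-sample derivative gain of 1 corresponds to 0.1 s at the default 10 Hz rate. The sketch below shows that conversion. The function names and the per-sample PID form are assumptions made for illustration, not taken from the firmware.

```python
SAMPLE_RATE_HZ = 10.0  # default ADC output rate quoted in this thread


def per_sample_to_physical(kp, ki_per_sample, kd_per_sample,
                           sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert gains applied once per sample into physical units.

    Assumes a controller of the form
        u[n] = kp*e[n] + ki_per_sample*sum(e) + kd_per_sample*(e[n] - e[n-1])
    with no explicit time delta (hypothetical form, for illustration).
    Returns (kp [dimensionless], ki [Hz], kd [s]).
    """
    ki_hz = ki_per_sample * sample_rate_hz   # one sum step spans 1/rate seconds
    kd_s = kd_per_sample / sample_rate_hz    # one difference spans 1/rate seconds
    return kp, ki_hz, kd_s


def physical_to_per_sample(kp, ki_hz, kd_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Inverse of per_sample_to_physical."""
    return kp, ki_hz / sample_rate_hz, kd_s * sample_rate_hz


if __name__ == "__main__":
    print(per_sample_to_physical(1.0, 1.0, 1.0))
```

With this reading, changing the ADC output rate changes the effective ki and kd unless the per-sample gains are rescaled.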

At present, if there's an edit to the firmware code that computes feedback, the loop filter period may change. This would then require edits to kp, ki, kd as used in the lab. It would be better to choose a well-defined loop filter period and enforce timing by some mechanism.

> Does the firmware include a mechanism for strictly controlling this period?

You need to be more careful with language here since you're talking across domains of expertise, so it’s easy for things to get lost in translation. Better would be to ask what the current method for controlling the period is, what the jitter is, etc.

Anyway, IIRC currently the loop is controlled by the ADC, which has internal hardware timing. The period there is determined by the post-filter, which can be altered by the user, but defaults to 10 Hz (so NB that @astro’s answer to the question about units is not strictly true since IIRC it doesn’t normalize for the sampling period; better would be to rescale the coefficients in the python driver to make the gain independent of the sampling period).

The CPU then uses software polling (not an interrupt) to check whether ADC samples are ready, runs a simple temperature conversion + control loop, and writes to the SPI DAC (currently bit-banged). So there is some jitter introduced to the loop by all of this. However, do we care? For a 10 Hz sample rate, the loop update period is 100 ms. So even if all the software processing has 1 ms of latency (which is *huge*) it won’t make a perceptible difference.

So, it depends on what you mean by “strictly” here, but I would say that yes, to the extent that we care, this is strictly controlled.

BTW I would definitely not consider this firmware production ready yet. Last time I looked over it the firmware did not do what I expected in various places (e.g. the zero-current calibration looked wrong). This isn’t yet at the point of being small bugs that need fixing, but rather misunderstandings over how the circuit/firmware should operate which will lead to surprising results.

IMHO it would not be a good use of anyone’s time to try to debug/improve this by filing issues about how the observed behaviour deviates from expectations; that will just lead to time wasted on XY problems. The firmware needs a full line-by-line design review from someone who understands how the circuit + firmware are supposed to operate, and until that’s happened I would discourage people from trying to use it. (To be clear, the firmware is nicely written and it’s mostly there; there are just some misunderstandings which will cause pain. The changes to fix these will be small measured in terms of lines of code, but the review needed to find them will take a lot of time from someone with the right skill base...)

> So there is some jitter introduced to the loop by all of this….however, do we care?

Agreed that ms-scale changes in loop filter period don't matter for controlling the temperature of devices with inherently long time constants. However, shifting to a clock-determined period (rather than one set by loop execution length) can avoid annoying complexity in future if more computation is added inside the loop.

> IMHO it would not be a good use of anyone’s time to try to debug/improve this by filing issues about how the observed behaviour deviates from expectations; that will just lead to time wasted on XY problems.

@jbqubit as explained above the period is already defined by the ADC’s internal clock.

No reasonable amount of computation is ever going to take long enough to matter on thermal timescales. And, in any case, this affects the input-output latency (phase shifts), not the loop period, so it would not be helped by changing any clocking.

The jitter here is mainly due to software polling rather than interrupts, so IIRC (but I’d need to check the most recent sources/@astro correct me if I’m wrong) Ethernet traffic can cause some non-determinism in the loop, but not something that’s relevant on the timescales we care about.

Apologies if this comes across a bit blunt, but please don't push for changes that would increase the design complexity until you've really engaged with how the design works.
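The latency-versus-period distinction above can be put in numbers. A pure input-output delay adds a phase lag of 360° × frequency × delay and leaves the sample period unchanged. The sketch below evaluates this for the 1 ms worst-case latency and 10 Hz rate mentioned earlier in the thread. The function name and the two example frequencies are assumptions chosen for illustration.

```python
def delay_phase_lag_deg(delay_s, freq_hz):
    """Phase lag, in degrees, that a pure time delay adds at a given frequency."""
    return 360.0 * freq_hz * delay_s


LATENCY_S = 1e-3        # pessimistic software latency quoted in the thread
SAMPLE_RATE_HZ = 10.0   # default ADC output rate quoted in the thread

if __name__ == "__main__":
    # Highest frequency the sampled loop can represent (Nyquist).
    print(delay_phase_lag_deg(LATENCY_S, SAMPLE_RATE_HZ / 2.0), "deg at 5 Hz")
    # A slower, more thermally plausible frequency (illustrative value).
    print(delay_phase_lag_deg(LATENCY_S, 0.1), "deg at 0.1 Hz")
```

Even at the Nyquist frequency the added lag is under two degrees, and thermal dynamics sit far below that frequency.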

> but please don’t push for changes that would increase the design complexity until you’ve really engaged with how the design works.

Agreed. Asking questions helps clarify design intent and spot differences between intent and implementation. :)

Anyway, I think the discussion has resulted in several actionable items that will improve firmware and documentation of interplay between loop filter parameters and timestep.

> Agreed. Asking questions helps clarify design intent and spot difference between intent and implementation. :)

Well, that's about my point.

> However, shifting to a clock-determined (vs loop execution length) period can avoid future annoying complexity if additional computation is added to the in-loop computation.

Isn't a question. Nor is

> Better to choose a well-defined loop filter period and enforce timing by some mechanism.

This would have been a more productive conversation if it had begun with a question like "how is this supposed to work"...

Is this resolved by #22?

Implementing the [Ziegler-Nichols method](https://en.wikipedia.org/wiki/Ziegler%E2%80%93Nichols_method) requires knowledge of the time step. So please add a comment about the units of the PID coefficients. Your table from earlier in this issue listing "Coefficient" and "Unit" was helpful.

Since the coefficients are now normalized by the time_delta, the units are 1 s (kd) and 1 Hz (ki).
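To illustrate what normalizing by the time delta buys, here is a minimal textbook PID sketch in which the integral is scaled by the elapsed time and the derivative is divided by it. It is not the firmware's implementation. The class, method, and helper names are hypothetical, and the gains are arbitrary example values.

```python
class Pid:
    """Textbook PID with gains in physical units.

    kp is dimensionless, ki is in Hz (1/s), kd is in seconds.
    """

    def __init__(self, kp, ki, kd):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0     # time integral of the error
        self.last_error = None  # no derivative term on the first sample

    def update(self, setpoint, measurement, time_delta):
        """Return the output for a sample taken time_delta seconds after the last."""
        error = setpoint - measurement
        self.integral += error * time_delta
        if self.last_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.last_error) / time_delta
        self.last_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_constant_error(sample_rate_hz, duration_s=1.0):
    """Hold the error at 1.0 for duration_s and return the final output."""
    pid = Pid(kp=1.0, ki=2.0, kd=0.5)
    time_delta = 1.0 / sample_rate_hz
    output = 0.0
    for _ in range(round(duration_s * sample_rate_hz)):
        output = pid.update(setpoint=1.0, measurement=0.0, time_delta=time_delta)
    return output


if __name__ == "__main__":
    # The same gains give (nearly) the same output at different sample rates.
    print(run_constant_error(10.0), run_constant_error(20.0))
```

Because the gains carry physical units, values derived from Ku and Tu measured in seconds can be used directly, whatever the ADC output rate.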

Did you try out the PID selection algorithm using your test setup? #32

Do these methods actually work well in our case, where heating is faster (more efficient) than cooling? @hartytp

sb10q changed title from loop servo time step to loop servo time step / PID autotune 2020-12-28 17:49:56 +08:00
topquark12 was assigned by sb10q 2020-12-28 17:50:01 +08:00

Here are some preliminary results of autotuning on our particular setup. The https://github.com/hirschmann/pid-autotune library was adapted to interface directly with the thermostat over the network and obtain autotuned parameters with minimal user input.

I have attached the results using my manually tuned PID parameters and the autotuned PID parameters from the Ziegler-Nichols rule.

Paying closer attention to the time scale, the manually tuned control loop has a faster response and overshoots less, especially when cooling.

However, both loops are very stable once the temperature has settled. There are also a few other autotuning rules, some of which might work better than what I have shown so far.

I'll show more results after running a few more tests.
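The thread does not describe how Ku and Tu are measured. A common approach, assumed here, is the relay method: the output is switched between two levels around a bias, the temperature settles into a steady oscillation, and the ultimate gain is estimated as Ku = 4d / (πa), with d the relay amplitude and a the oscillation amplitude. A minimal sketch with hypothetical function names and made-up numbers:

```python
import math


def ultimate_gain(relay_amplitude, oscillation_amplitude):
    """Estimate Ku from a relay experiment: Ku = 4*d / (pi*a).

    relay_amplitude (d): half the difference between the two output levels.
    oscillation_amplitude (a): half the peak-to-peak swing of the
    measured value once the oscillation is steady.
    """
    return 4.0 * relay_amplitude / (math.pi * oscillation_amplitude)


def ultimate_period(peak_times_s):
    """Estimate Tu as the mean spacing between successive peaks, in seconds."""
    intervals = [later - earlier
                 for earlier, later in zip(peak_times_s, peak_times_s[1:])]
    return sum(intervals) / len(intervals)


if __name__ == "__main__":
    # Made-up example: +/-0.5 A relay, +/-0.2 K oscillation, peaks every 12 s.
    print(ultimate_gain(0.5, 0.2), ultimate_period([3.0, 15.0, 27.0, 39.0]))
```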

I have attached a snapshot showing the temperature stability of our test setup, with PID parameters directly from the autotune tool.

The previous results were obtained on a setup with 3 stacked TEC elements: the one directly beneath the block being cooled was controlled by the thermostat, and the bottom 2 were connected to a constant voltage source. This result was obtained on a setup with changes suggested by @sb10q, where all three TECs are connected to one thermostat channel, with the bottom TEC in parallel with the top two in series.

The parameters were calculated with the Pessen Integral Rule; temperature stability is within ±1.5 mK over 5 minutes.

The autotuning utility was merged into master in commit 73dd6d9154, providing automatically calculated Ku and Tu and a set of recommended PID parameters.

Documentation on the usage of this tool and pytec will be added soon.
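For reference, this is how Ku and Tu map to PID gains under the two rules named in this thread. The sketch uses the commonly published table values: classic Ziegler-Nichols has Kp = 0.6 Ku, Ti = Tu/2, Td = Tu/8, and the Pessen Integral Rule has Kp = 0.7 Ku, Ti = 0.4 Tu, Td = 0.15 Tu. The merged utility may use different constants. The function and dictionary names are hypothetical.

```python
# (Kp/Ku, Ti/Tu, Td/Tu) from the commonly published tuning table.
RULES = {
    "ziegler-nichols": (0.6, 0.5, 0.125),
    "pessen-integral": (0.7, 0.4, 0.15),
}


def tune(ku, tu, rule="ziegler-nichols"):
    """Return PID gains for ultimate gain ku and ultimate period tu (seconds).

    The result uses the parallel form with kp dimensionless
    (relative to ku's units), ki in 1/s, and kd in s.
    """
    kp_factor, ti_factor, td_factor = RULES[rule]
    kp = kp_factor * ku
    integral_time = ti_factor * tu
    derivative_time = td_factor * tu
    return {"kp": kp, "ki": kp / integral_time, "kd": kp * derivative_time}


if __name__ == "__main__":
    # Made-up Ku and Tu, for illustration only.
    for rule in RULES:
        print(rule, tune(ku=2.0, tu=30.0, rule=rule))
```

The gains come out in units of seconds and 1/s, which is why the tuning rules need the controller's time step to be documented.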

Reference: M-Labs/thermostat#11