Kasli-SoC v1.0.1: DRTIO failure on some boards #216

Closed
opened 2023-02-17 14:35:15 +08:00 by mwojcik · 1 comment

I noticed this while working on clocking changes. Assuming my changes were at fault, I kept debugging, but eventually grabbed another board and it worked, suggesting a potential hardware issue.

Quick diagnostics with:

  • SD card with known good DRTIO master firmware (release-7, recently shipped in a crate that tested OK)
  • Kasli 1.1 permanent satellite (known good, used for other tests)
  • copper cable (not affected by #213)

Test procedure:

  • insert SD card into Kasli-SoC, switch boot mode to SD
  • connect Kasli-SoC to satellite, SFP0 to SFP0
  • check the device logs to verify that it booted correctly, found the link, and established communication with the satellite.

Tried three Kasli-SoCs.

Symptoms with Kasli-SoC #0244/22 0023:

[     4.675166s]  INFO(runtime::rtio_mgt::drtio): [DEST#0] destination is up
[    10.710981s]  INFO(runtime::rtio_mgt::drtio): [LINK#0] link RX became up, pinging
[    30.118942s] ERROR(runtime::rtio_mgt::drtio): [LINK#0] ping failed
[    30.325941s]  INFO(runtime::rtio_mgt::drtio): [LINK#0] link RX became up, pinging
[    50.332901s] ERROR(runtime::rtio_mgt::drtio): [LINK#0] ping failed
[    50.539901s]  INFO(runtime::rtio_mgt::drtio): [LINK#0] link RX became up, pinging

(goes on forever)

#0244/22 0009:

[     4.511170s]  INFO(runtime::rtio_mgt::drtio): [DEST#0] destination is up

(does not even detect link)

#0244/22 0013:

[     5.137179s]  INFO(runtime::rtio_mgt::drtio): [DEST#0] destination is up
[     5.343992s]  INFO(runtime::rtio_mgt::drtio): [LINK#0] link RX became up, pinging
[    11.751977s]  INFO(runtime::rtio_mgt::drtio): [LINK#0] remote replied after 32 packets
[    11.788801s]  INFO(runtime::rtio_mgt::drtio): [LINK#0] link initialization completed
[    11.796585s]  INFO(runtime::rtio_mgt::drtio): [DEST#1] destination is up
[    11.803196s]  INFO(runtime::rtio_mgt::drtio): [DEST#1] buffer space is 128

(OK)
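
The three outcomes above can be told apart mechanically from the master's log. The following is a minimal sketch, not part of ARTIQ: the `classify` helper, its return labels, and the regular expression are assumptions made for illustration, based only on the log lines quoted in this issue.

```python
import re

# Hypothetical helper (not an ARTIQ API): matches lines such as
# "[    30.118942s] ERROR(runtime::rtio_mgt::drtio): [LINK#0] ping failed"
LINE_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)s\]\s+(?P<level>\w+)\((?P<module>[\w:]+)\): "
    r"\[(?P<tag>\w+#\d+)\] (?P<msg>.*)"
)


def classify(log: str, link: str = "LINK#0") -> str:
    """Classify a DRTIO master log excerpt for one link.

    Returns one of the illustrative labels:
      "ok"          - link initialization completed
      "ping-failed" - link RX came up but at least one ping failed
      "pinging"     - link RX came up, no result logged yet
      "no-link"     - no message for this link at all
    """
    saw_rx_up = False
    saw_ping_failed = False
    for line in log.splitlines():
        match = LINE_RE.search(line)
        if match is None or match.group("tag") != link:
            continue  # unrelated line, or a message about another link/destination
        msg = match.group("msg")
        if msg.startswith("link initialization completed"):
            return "ok"
        if msg.startswith("link RX became up"):
            saw_rx_up = True
        elif msg.startswith("ping failed"):
            saw_ping_failed = True
    if saw_ping_failed:
        return "ping-failed"
    if saw_rx_up:
        return "pinging"
    return "no-link"
```

Fed the excerpts above, this sketch would label board 0023 as `"ping-failed"`, board 0009 as `"no-link"`, and board 0013 as `"ok"`.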


@den512 looked at them and they will be returned. This is a random hardware failure rather than a systematic problem.

Reference: M-Labs/artiq-zynq#216