KeyError on serving exception #283

Closed
opened 2022-04-27 17:37:07 +08:00 by mwojcik · 7 comments
Member

When testing two standalone Urukuls at the same time (I was trying SUServo anyway), I got the following. Perhaps not all exception messages have been ported?

*** Testing Urukul DDSes.
urukul0_cpld: initializing CPLD...
urukul0_cpld: testing attenuator digital control...
urukul0_cpld: done
urukul1_cpld: initializing CPLD...
Traceback (most recent call last):
  File "/nix/store/bvjmxskk4z6qjawc3zamhi26bc7sbz2j-python3.9-artiq-8.0.unknown.beta/bin/.artiq_sinara_tester-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/frontend/artiq_sinara_tester.py", line 808, in main
    experiment.run(tests)
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/frontend/artiq_sinara_tester.py", line 759, in run
    getattr(self, f"test_{name}")()
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/frontend/artiq_sinara_tester.py", line 306, in test_urukuls
    self.init_urukul(cpld)
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/language/core.py", line 79, in run_on_core
    self.core.run(fake_method, args, kwargs)
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/coredevice/core.py", line 92, in run
    self.comm.serve(embedding_map, symbolizer)
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/coredevice/comm_kernel.py", line 821, in serve
    self._serve_exception(embedding_map, symbolizer)
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/coredevice/comm_kernel.py", line 765, in _serve_exception
    name = embedding_map.retrieve_str(self._read_int32())
  File "/nix/store/m3lyd2cgcai2wl4019i2m00n0apsr3ds-python3-3.9.12-env/lib/python3.9/site-packages/artiq/language/embedding_map.py", line 60, in retrieve_str
    return self.string_map[key]
KeyError: 11

Owner

Seems ValueError needs to be added to EXCEPTION_ID_LOOKUP in eh_artiq.rs and artiq/language/embedding_map.py.
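
The `KeyError: 11` in the traceback above is consistent with the core device sending a name ID that the host-side string table never registered. A minimal sketch of that failure mode, assuming a table that hands out IDs in insertion order; `StringTableSketch`, `lookup_or_report`, and the placeholder names are hypothetical and are not the actual contents of embedding_map.py:

```python
class StringTableSketch:
    """Hypothetical stand-in for a host-side string table (not the real class)."""

    def __init__(self, preallocated=()):
        self.string_map = {}   # id -> string
        self.reverse_map = {}  # string -> id
        for name in preallocated:
            self.store_str(name)

    def store_str(self, s):
        # Reuse the existing ID if the string is already known.
        if s in self.reverse_map:
            return self.reverse_map[s]
        key = len(self.string_map)
        self.string_map[key] = s
        self.reverse_map[s] = key
        return key

    def retrieve_str(self, key):
        # Raises KeyError when the ID was never registered on the host.
        return self.string_map[key]


def lookup_or_report(table, key):
    """Look up a name ID, returning a readable placeholder for unknown IDs."""
    try:
        return table.retrieve_str(key)
    except KeyError:
        return f"<unknown exception name id {key}>"


# Placeholder names, assumed for illustration only.
table = StringTableSketch(["ExampleErrorA", "ExampleErrorB"])
print(lookup_or_report(table, 0))
print(lookup_or_report(table, 11))
```

If the firmware and the host disagree on which names are preallocated, any ID known only to the firmware fails in `retrieve_str` exactly as in the report.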

Owner

comm_kernel has those:

if exn_type in (ZeroDivisionError, ValueError, IndexError, RuntimeError) or \
                        hasattr(exn, "artiq_builtin"):

Should also be kept consistent...
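
One way to keep these places consistent is a small check that compares the exception names each location knows about. The sketch below is only illustrative: `missing_per_location` is a hypothetical helper, and the "firmware lookup" and "host string table" lists are assumed contents that mirror the situation described in this thread, not copies of the real tables. Only the comm_kernel tuple is taken from the snippet above.

```python
def missing_per_location(locations):
    """For each location, list the exception names it lacks
    relative to the union of names across all locations."""
    all_names = set().union(*locations.values())
    return {
        where: sorted(all_names - set(names))
        for where, names in locations.items()
        if all_names - set(names)
    }


example = {
    # From the comm_kernel snippet quoted above.
    "comm_kernel tuple": ["ZeroDivisionError", "ValueError",
                          "IndexError", "RuntimeError"],
    # Assumed contents, for illustration only.
    "firmware lookup": ["ZeroDivisionError", "IndexError", "RuntimeError"],
    "host string table": ["ZeroDivisionError", "IndexError", "RuntimeError"],
}

for where, missing in missing_per_location(example).items():
    print(f"{where} is missing: {', '.join(missing)}")
```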

sb10q added this to the Alpha milestone 2022-04-27 17:49:27 +08:00
Collaborator

May I ask how I can reproduce this issue? I tried to reproduce it to see whether this branch (https://git.m-labs.hk/M-Labs/nac3/src/branch/272-key-err-rpc-fix) helps, but I cannot reproduce it with artiq_sinara_tester -o urukuls, with this change:

diff --git a/artiq/coredevice/urukul.py b/artiq/coredevice/urukul.py
index d4e04ca7..0c230189 100644
--- a/artiq/coredevice/urukul.py
+++ b/artiq/coredevice/urukul.py
@@ -227,7 +227,7 @@ class CPLD:
         self.bus.set_config_mu(SPI_CONFIG | SPI_END | SPI_INPUT, 24,
                                SPIT_CFG_RD, CS_CFG)
         self.bus.write(self.cfg_reg << 8)
-        return self.bus.read()
+        return 0

     @kernel
     def init(self, blind: bool = False):

Without the modification, the initialization just gets stuck:

****** Sinara system tester ******

Mismatch between gateware (7.8047.16393ef.beta) and software (8.0.beta) versions
*** Testing Urukul DDSes.
urukul0_cpld: initializing CPLD...
Owner

> without the modification the initialization seems will just stuck:

On what system? Obviously you need a device with a Urukul properly set up.

Owner

And I suspect the easiest way to reproduce this is simply to raise ValueError, without touching Urukul.

Collaborator

> On what system? Obviously you need a device with a Urukul properly set up.

Thanks, I just tested on zeus using artiq_sinara_tester -o urukuls.

Yes, I think the unmodified test gets stuck because I do not have a Urukul device set up properly. The system gets stuck at return self.bus.read() (https://github.com/m-labs/artiq/blob/1b348dc0025272d88ec19d57669434b0210983dd/artiq/coredevice/urukul.py#L230).

With the modified code mentioned above (in which I tried to keep as much of the original code as possible) and the modification below (another attempt):

diff --git a/artiq/coredevice/urukul.py b/artiq/coredevice/urukul.py
index d4e04ca7..1a8aa4b6 100644
--- a/artiq/coredevice/urukul.py
+++ b/artiq/coredevice/urukul.py
@@ -227,7 +227,7 @@ class CPLD:
         self.bus.set_config_mu(SPI_CONFIG | SPI_END | SPI_INPUT, 24,
                                SPIT_CFG_RD, CS_CFG)
         self.bus.write(self.cfg_reg << 8)
-        return self.bus.read()
+        return self.att_to_mu(1.0)

     @kernel
     def init(self, blind: bool = False):

the ValueError is raised normally:

****** Sinara system tester ******

Mismatch between gateware (7.8047.16393ef.beta) and software (8.0.beta) versions
*** Testing Urukul DDSes.
urukul0_cpld: initializing CPLD...
Traceback (most recent call last):
  File "/nix/store/9a6m1l6x11xfjxpx23kdi21kyl42z1wa-python3.9-artiq-8.0.unknown.beta/bin/.artiq_sinara_tester-wrapped", line 9, in <module>
    sys.exit(main())
  File "/home/ychenfo/code/artiq/artiq/frontend/artiq_sinara_tester.py", line 808, in main
    experiment.run(tests)
  File "/home/ychenfo/code/artiq/artiq/frontend/artiq_sinara_tester.py", line 759, in run
    getattr(self, f"test_{name}")()
  File "/home/ychenfo/code/artiq/artiq/frontend/artiq_sinara_tester.py", line 306, in test_urukuls
    self.init_urukul(cpld)
  File "/home/ychenfo/code/artiq/artiq/language/core.py", line 79, in run_on_core
    return self.core.run(fake_method, args, kwargs)
  File "/home/ychenfo/code/artiq/artiq/coredevice/core.py", line 92, in run
    self.comm.serve(embedding_map, symbolizer)
  File "/home/ychenfo/code/artiq/artiq/coredevice/comm_kernel.py", line 821, in serve
    self._serve_exception(embedding_map, symbolizer)
  File "/home/ychenfo/code/artiq/artiq/coredevice/comm_kernel.py", line 803, in _serve_exception
    raise python_exn
ValueError: Urukul proto_rev mismatch

Simply raising a ValueError directly does not reproduce the bug either:

from artiq.experiment import *
from artiq.coredevice.core import Core

@nac3
class RPCFail(EnvExperiment):
    core: KernelInvariant[Core]

    def build(self):
        self.setattr_device("core")

    @kernel
    def run(self):
        if False:
            raise ValueError("s")
        else:
            self.f()
            raise ValueError("p")

    @kernel
    def f(self):
        if False:
            raise ValueError("s")
        else:
            raise ValueError("p")

gives:

ychenfo@zeus ~/c/a/a/e/n/rpc_key_err (nac3)> artiq_run ./exception_err.py                                   

WARNING:artiq.coredevice.comm_kernel:Mismatch between gateware (7.8047.16393ef.beta) and software (8.0.beta) versions
Core Device Traceback:
Traceback (most recent call first):
  File "exception_err.py", line 26, column 14, in artiq_run_exception_err.RPCFail.f.0
    raise ValueError("p")
     ^
  File "<nac3_synthesized_modinit>", line 2, in __modinit__ (RA=+0xe4)
    <unknown>
  File "exception_err.py", line 18, in artiq_run_exception_err.RPCFail.run.0 (RA=+0x174)
    self.f()
  File "exception_err.py", line 26, in artiq_run_exception_err.RPCFail.f.0 (RA=+0x164)
    raise ValueError("p")
ValueError(0): p

End of Core Device Traceback

Traceback (most recent call last):
  File "/nix/store/9a6m1l6x11xfjxpx23kdi21kyl42z1wa-python3.9-artiq-8.0.unknown.beta/bin/.artiq_run-wrapped", line 9, in <module>
    sys.exit(main())
  File "/home/ychenfo/code/artiq/artiq/frontend/artiq_run.py", line 171, in main
    return run(with_file=True)
  File "/home/ychenfo/code/artiq/artiq/frontend/artiq_run.py", line 157, in run
    raise exn
  File "/home/ychenfo/code/artiq/artiq/frontend/artiq_run.py", line 152, in run
    exp_inst.run()
  File "/home/ychenfo/code/artiq/artiq/language/core.py", line 79, in run_on_core
    return self.core.run(fake_method, args, kwargs)
  File "/home/ychenfo/code/artiq/artiq/coredevice/core.py", line 92, in run
    self.comm.serve(embedding_map, symbolizer)
  File "/home/ychenfo/code/artiq/artiq/coredevice/comm_kernel.py", line 821, in serve
    self._serve_exception(embedding_map, symbolizer)
  File "/home/ychenfo/code/artiq/artiq/coredevice/comm_kernel.py", line 803, in _serve_exception
    raise python_exn
ValueError: p
Owner

> Thanks, I just tested in zeus using artiq_sinara_tester -o urukuls.

Using what device db and what core device?
You need a device db for Kasli or Kasli-SoC, with the corresponding Urukul entries, and the core device hardware set up accordingly. KC705 and ZC706 do not support Urukul.
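
A quick way to see whether a device db has Urukul entries at all is to scan it for CPLD devices. This is a rough sketch under assumptions: `urukul_cplds` is a hypothetical helper, and `example_db` is a stripped-down illustration with empty `arguments`, not a working device db for any real Kasli.

```python
def urukul_cplds(device_db):
    """Return the names of device db entries that describe an Urukul CPLD."""
    return sorted(
        name
        for name, entry in device_db.items()
        # Aliases in a device db are plain strings, so skip non-dict entries.
        if isinstance(entry, dict)
        and entry.get("module") == "artiq.coredevice.urukul"
        and entry.get("class") == "CPLD"
    )


# Illustrative entries only; real entries carry hardware-specific arguments.
example_db = {
    "core": {
        "type": "local",
        "module": "artiq.coredevice.core",
        "class": "Core",
        "arguments": {},
    },
    "urukul0_cpld": {
        "type": "local",
        "module": "artiq.coredevice.urukul",
        "class": "CPLD",
        "arguments": {},
    },
}

print(urukul_cplds(example_db))
```

An empty result would suggest the device db has no Urukul entries. A non-empty result says nothing about whether the hardware is actually connected.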

sb10q closed this issue 2023-11-01 13:28:51 +08:00
Reference: M-Labs/nac3#283