Commit Graph

52 Commits (b89584632225f8cf10e6557b001d80eee45f93b4)

Author SHA1 Message Date
Michael Birtwell b895846322 Improve exception reports when exception can't be reconstructed
Artiq assumes that all exceptions raised by the kernel can be constructed with
a single string argument. This isn't always the case. Especially for
exceptions that originated in python and were propagated to the kernel over
rpc.

With out this change a mosek solver failure looks like:
```
ERROR    root:logging_tools.py:41 Terminating with exception (TypeError: __init__() missing 1 required positional argument: 'msg')
Traceback (most recent call last):
  File "/home/mb/.cache/pypoetry/virtualenvs/ion-transport-1-b41LI0-py3.8/lib/python3.8/site-packages/artiq/master/worker_impl.py", line 540, in main
    exp_inst.run()
  File "/home/mb/.cache/pypoetry/virtualenvs/ion-transport-1-b41LI0-py3.8/lib/python3.8/site-packages/artiq/test_tools/experiment.py", line 82, in wrapper
    meth()
  File "/home/mb/.cache/pypoetry/virtualenvs/ion-transport-1-b41LI0-py3.8/lib/python3.8/site-packages/artiq/language/core.py", line 54, in run_on_core
    return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
  File "/home/mb/.cache/pypoetry/virtualenvs/ion-transport-1-b41LI0-py3.8/lib/python3.8/site-packages/artiq/coredevice/core.py", line 152, in run
    self.comm.serve(embedding_map, symbolizer, demangler)
  File "/home/mb/.cache/pypoetry/virtualenvs/ion-transport-1-b41LI0-py3.8/lib/python3.8/site-packages/artiq/coredevice/comm_kernel.py", line 720, in serve
    self._serve_exception(embedding_map, symbolizer, demangler)
  File "/home/mb/.cache/pypoetry/virtualenvs/ion-transport-1-b41LI0-py3.8/lib/python3.8/site-packages/artiq/coredevice/comm_kernel.py", line 699, in _serve_exception
    python_exn = python_exn_type(
TypeError: __init__() missing 1 required positional argument: 'msg'
```

With this change we get:
```
ERROR    root:logging_tools.py:41 Terminating with exception (RuntimeError: Exception type=<class 'mosek.Error'>, which couldn't be reconstructed (__init__() missing 1 required positional argument: 'msg'))
Core Device Traceback:
Traceback (most recent call first):
  File "/home/mb/oxionics/ion-transport/tests/test_end_to_end.py", line 280, in get_transport
    return self.seq.solve()
  File "/home/mb/oxionics/ion-transport/tests/test_end_to_end.py", line 288, in artiq_worker_test_end_to_end.TransportTestScan.run(..., ...) (RA=+0x2e4)
    self.seq.record(self.get_transport(1e-6 + 1e-7 * x))
mosek.Error(27): rescode.err_license_expired(1001): The license has expired.

End of Core Device Traceback
Traceback (most recent call last):
  File "/home/mb/oxionics/artiq/artiq/master/worker_impl.py", line 540, in main
    exp_inst.run()
  File "/home/mb/oxionics/artiq/artiq/test_tools/experiment.py", line 82, in wrapper
    meth()
  File "/home/mb/oxionics/artiq/artiq/language/core.py", line 54, in run_on_core
    return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
  File "/home/mb/oxionics/artiq/artiq/coredevice/core.py", line 152, in run
    self.comm.serve(embedding_map, symbolizer, demangler)
  File "/home/mb/oxionics/artiq/artiq/coredevice/comm_kernel.py", line 732, in serve
    self._serve_exception(embedding_map, symbolizer, demangler)
  File "/home/mb/oxionics/artiq/artiq/coredevice/comm_kernel.py", line 714, in _serve_exception
    raise python_exn
RuntimeError: Exception type=<class 'mosek.Error'>, which couldn't be reconstructed (__init__() missing 1 required positional argument: 'msg')
```

Signed-off-by: Michael Birtwell <michael.birtwell@oxionics.com>
2022-09-26 20:25:13 +08:00
Michael Birtwell 69c4026d2b Fix returning tuples of lists of arrays from RPCs
When serialising a list of objects `_send_rpc_value` makes a copy of the
upcoming tags to pass repeatedly to the recursive call. Then uses
`_skip_rpc_value` to skip over the tags that should have been processed.
This didn't handle numpy arrays so, after processing a list of arrays it
got out of sync and failed.

Signed-off-by: Michael Birtwell <michael.birtwell@oxionics.com>
2022-06-15 00:08:49 +08:00
Michael Birtwell cab9d90d01 Use sipyco.keepalive
Remove the implementation of setting keepalive settings on sockets and use
the implementation from sipyco instead.

Signed-off-by: Michael Birtwell <michael.birtwell@oxionics.com>
2022-04-12 08:53:35 +08:00
Sebastien Bourdeauducq 1a26eb8cf2 coredevice: only print version mismatch warning when relevant 2022-03-16 21:21:43 +08:00
pca006132 ba34700798 coredevice: report nested exceptions 2022-01-26 07:16:54 +08:00
pca006132 4644e105b1 compiler: modified exception representation
Exception name is replaced by exception ID, which requires no
allocation. Other strings in the exception can now be 'host-only'
strings, which is represented by a CSlice with len = usize::MAX and
ptr = key, to avoid the need for allocation when raising exceptions
through RPC.
2022-01-26 07:16:54 +08:00
Steve Fan 3f812c4c2c
comm_kernel: fix RPC exception handling (#1801) 2022-01-12 15:23:37 +08:00
Steve Fan de5892a00a
comm_kernel: check if elements are within bounds for RPC list (#1824) 2022-01-11 17:16:45 +08:00
Steve Fan d7dd75e833 comm_kernel: fix off-by-one error for numeric value range check 2022-01-11 10:13:42 +08:00
Steve Fan 4a6bea479a
Host report for async error upon kernel termination (#1791)
Closes #1644
2021-12-04 13:33:24 +08:00
pca006132 4c743cf8af revert busy polling 2021-05-23 14:07:11 +08:00
pca006132 1e9a131386 coredevice.comm_kernel: performance improvement
reduced latency by busy polling, and improved byte list performance.
2021-05-23 13:30:00 +08:00
Peter Drmota 47bf5d36af coredevice.comm_kernel: Fix unpacking of lists of numpy.int64
test.coredevice.test_embedding: Add tests for list of numpy.int64
2021-04-21 15:46:58 +01:00
David Nadlinger 888696f588 coredevice: Fix RPC typing for bool lists/arrays
GitHub: Fixes #1635.
2021-03-20 00:03:10 +00:00
David Nadlinger f9872bb7b8 coredevice: Handle prematurely closed sockets in comm_kernel receive loop
recv() returns 0 instead of data if the socket has already
been closed. This is translated into a zero-length list on
the Python layer. Previously, the code would enter an
infinite loop if the socket was closed while attempting
to receive data.
2021-01-26 18:10:49 +08:00
David Nadlinger f1fd42ea98 coredevice: Re-enable TCP keepalive
This partially reverts commit b5e1bd3fa2,
which had removed keepalive. This, however, led to experiments
hanging forever if the core device had dropped the connection
(e.g. to a kernel CPU panic, or the device being rebooted).

The chosen keepalive settings are fairly conservative (with the
10 s timeout) to avoid any possible interaction with smoltcp's
3 s ARP try interval (see GitHub issue #1150), even though this
should be a non-issue now due to the larger ARP cache.
2021-01-26 18:10:49 +08:00
pca006132 8148fdb8a7
use device endian for core device protocols (#1591) 2021-01-22 16:33:21 +08:00
pca006132 ec72eeda46 coredevice: use device endian for kernel and RPC 2021-01-21 09:07:48 +01:00
Sebastien Bourdeauducq 1ce505c547 coredevice: remove obsolete watchdog code (#1458) 2020-12-08 13:25:39 +08:00
pca006132 6195b1d3a0 rpc: fixed _write_bool
Closes #1519
2020-09-04 13:49:22 +08:00
pca006132 b2572003ac RPC: optimization by caching
This reduced the calls needed for socket send/recv.
2020-08-28 14:58:34 +08:00
Sebastien Bourdeauducq 7cf974a6a7 comm_kernel: fix typo 2020-08-28 12:25:23 +08:00
pca006132 7181ff66a6 compiler: improved rpc performance for list and array
1. Removed duplicated tags before each elements.
2. Use numpy functions to speedup parsing.
2020-08-26 14:17:06 +08:00
David Nadlinger 8783ba2072 compiler/firmware: RPCs for ndarrays 2020-08-09 17:08:43 +01:00
David Nadlinger 9af6e5747d compiler: Factor rpc_tag() out of llvm_ir_generator 2020-08-09 03:54:41 +01:00
Sebastien Bourdeauducq e8b73876ab comm_kernel: add Zynq runtime identifier 2020-04-12 17:25:14 +08:00
Sebastien Bourdeauducq de57039e6e comm_kernel: cleanup 2020-04-12 16:02:36 +08:00
Sebastien Bourdeauducq 9dc24f255e comm_kernel: remove dead code 2020-04-12 15:06:46 +08:00
Sebastien Bourdeauducq b5e1bd3fa2 coredevice: simplify/cleanup network connection code
This removes:
* host-side keepalive, which turns out not to be required
* custom connection timeout (the default is OK)
* SSH tunneling support (doesn't seem to be actually used anywhere)
2019-12-23 19:53:49 +08:00
David Nadlinger af31c6ea21 coredevice: Don't use `is` to compare with integer literal
This works on CPython, but is not guaranteed to do so, and
produces a warning since 3.8 (see https://bugs.python.org/issue34850).
2019-12-22 05:46:41 +00:00
David Nadlinger fc95183e04 coredevice: Fix host-side serialization of (nested) lists
Test case to follow.
2019-03-31 17:10:27 +01:00
whitequark 49682d0159 Improve Python 3.7 compatibility.
async is now a full (non-contextual) keyword.

There are two more instances:
   - artiq/frontend/artiq_client.py
   - artiq/devices/thorlabs_tcube/driver.py

It is not immediately clear how to fix those, so they are left for
later work.
2019-01-12 13:17:59 +00:00
whitequark a74958f01f ksupport: raise RuntimeError on reraise with no inflight exception.
Fixes #1123.
2018-08-07 05:53:13 +00:00
Sebastien Bourdeauducq b27fa8964b add variant in identifier string
Also add without-sawg suffixes on Sayma.

Closes #1060
Closes #1059
2018-07-15 17:21:17 +08:00
Sebastien Bourdeauducq b10d3ee4b4 make RTIO clock switch optional and simplify
Kasli no longer has an internal RTIO clock.
Switching clocks dynamically is no longer supported.
2018-05-18 17:41:34 +08:00
whitequark 4d06c1d84b coredevice.CommKernel: modernize. NFC. 2018-05-16 14:40:14 +00:00
whitequark ca93b94aea firmware: move config requests to management protocol.
They were only in session protocol because of historical reasons.
2018-05-16 14:32:49 +00:00
whitequark a39f8d6634 artiq_devtool: use CommMgmt instead of shelling out to artiq_coremgmt.
(Which no longer exists.)

This also fixes `artiq_devtool hotswap` to work without
an `artiq_devtool connect` running in background.
2018-05-16 14:08:32 +00:00
whitequark a2bc12da68 Only print gateware/software mismatch warning once per process. 2017-12-26 21:13:46 +00:00
whitequark 31b52ff1b3 coredevice: do not desync kernel TCP stream on RPCReturnValueError.
And propagate the error upwards instead of trying to squeeze it
through the core device in futility.
2017-06-25 04:47:32 +00:00
whitequark fdb24ef139 coredevice: truncate overlong exception messages.
If we have a really long one (megabytes) then this may exhaust
the heap of the core device and crash it.
2017-06-25 04:39:08 +00:00
whitequark 284382b1f5 compiler: add support for bytearray values in RPC (#714). 2017-06-09 07:15:25 +00:00
whitequark 66a683f583 compiler: add support for bytes values in RPC (#714). 2017-06-09 07:10:48 +00:00
Sebastien Bourdeauducq ebb4660207 coredevice: compare software and gateware versions strictly 2017-06-07 02:11:54 +08:00
Sebastien Bourdeauducq bea7a47349 coredevice: compare gateware and software versions on release numbers only. Closes #738 2017-06-05 13:52:24 +08:00
Sebastien Bourdeauducq 5ccca74a3f fold comm device into core device 2017-05-22 15:45:45 +08:00
whitequark 948ed6fb0a Extract core device management interface from session interface (#691). 2017-03-29 03:36:21 +00:00
whitequark b643847da5 Fix a misleading message for non-clean kernel termination. 2017-03-28 14:22:20 +00:00
whitequark 795a3e24d8 artiq_coreboot: block until hotswap image is actually received. 2017-03-07 11:36:28 +00:00
whitequark b0cdef3211 firmware: implement hotswap through artiq_coreboot tool. 2017-03-06 16:01:28 +00:00