Sebastien Bourdeauducq
3fd6962bd2
use sipyco ( #585 )
2019-11-10 15:55:17 +08:00
Chris Ballance
8659c769cb
master/language: add methods to set experiment pipeline/priority/flush defaults
2019-03-12 10:54:15 +01:00
David Nadlinger
0dab7ecd73
master: Include RID in worker exception messages
...
This helps when debugging unexpected shutdown problems
after the fact.
2019-01-20 19:45:50 +00:00
Sebastien Bourdeauducq
387688354c
master: optimize repository scan, closes #546
2016-09-09 19:19:01 +08:00
Sebastien Bourdeauducq
4c8a8357b0
worker: increase send_timeout (Windows can be really slow)
2016-07-03 12:18:34 +08:00
Sebastien Bourdeauducq
aa61c29efb
transfer Python builtin exceptions over pc_rpc and master/worker
2016-04-04 22:02:42 +08:00
Robert Jördens
fef72506e4
ctlmgr/gui/master: start subprocesses in new pgroup
...
This only makes a difference on POSIX. It prevents subprocesses
from receiving the signals that the parent receives. For ctlmgr
and master it cuts down on spam on the console (KeyboardInterrupt
tracebacks from all controllers) and enforces that proper
termination is followed.
This does not help if the parent gets SIGKILL (subprocesses
may linger).
2016-02-18 23:51:12 +01:00
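To illustrate the process-group change described in fef72506e4, here is a minimal sketch. It assumes a POSIX system and uses `subprocess.Popen` with `start_new_session=True`, which is one way to take a child out of the parent's process group. The helper name `spawn_in_new_group` is hypothetical, and the commit itself may use a different mechanism.

```python
import os
import subprocess
import sys


def spawn_in_new_group(args):
    """Start a child in its own session and process group (POSIX only).

    Hypothetical helper for illustration. Signals sent to the parent's
    foreground process group (e.g. SIGINT from Ctrl-C in a terminal) are
    then not delivered to the child, so the parent has to terminate it
    explicitly.
    """
    return subprocess.Popen(args, start_new_session=True)


if __name__ == "__main__":
    child = spawn_in_new_group(
        [sys.executable, "-c", "import time; time.sleep(1)"])
    # The two process group IDs differ when the child was detached.
    print("parent pgid:", os.getpgrp(), "child pgid:", os.getpgid(child.pid))
    child.wait()
```

As the commit message notes, this does not protect against the parent receiving SIGKILL: nothing is left to terminate the child, so it may linger.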
Sebastien Bourdeauducq
155c2ec2ef
ctlmgr,worker: set PYTHONUNBUFFERED for subprocesses
2016-02-18 12:41:08 +01:00
Sebastien Bourdeauducq
6196aaf2f5
master/worker: increase timeouts. Windows VMs can be really slow.
2016-02-16 09:44:50 +01:00
Robert Jördens
53e5d0a7bb
worker: flake8 style cleanup
2016-02-02 15:32:40 -07:00
Robert Jördens
55006119c8
subprocesses: unify termination logic
2016-02-02 15:32:36 -07:00
Sebastien Bourdeauducq
5076c85ed6
worker: Windows VMs are slow, increase send_timeout
2016-01-27 21:15:22 +01:00
Sebastien Bourdeauducq
5aa4de8e89
refactor logging and implement in worker
2016-01-26 20:31:42 +01:00
Sebastien Bourdeauducq
a583a923d8
worker: use pipe_ipc (no log)
2016-01-26 14:59:36 +01:00
Sebastien Bourdeauducq
ae19f1c75d
master: add filename in worker log entries. Closes #226
2016-01-23 21:43:24 -05:00
Sebastien Bourdeauducq
cc6b808bf8
master: finer control of worker exception reporting. Closes #233
2016-01-23 21:23:02 -05:00
whitequark
6bf48e60ba
worker: make parent errors readable in log.
2016-01-16 02:06:40 +00:00
Sebastien Bourdeauducq
8467013160
master,gui: support recomputation+reset of arguments
2015-12-06 17:27:15 +08:00
Sebastien Bourdeauducq
32c95f24d0
worker: reduce some logging levels
2015-10-29 09:34:41 +08:00
Sebastien Bourdeauducq
0d53f7ab0d
ignore ProcessLookupError when killing subprocesses. Closes #167
2015-10-28 20:57:28 +08:00
Sebastien Bourdeauducq
1ada15ae5d
master: simplify worker/parent RPC
2015-10-28 17:35:57 +08:00
Sebastien Bourdeauducq
d13b368a65
build logging into worker
2015-10-20 18:11:50 +08:00
Sebastien Bourdeauducq
1d14975bd5
worker: cleaner termination on exception in user code, improve unittest
2015-10-13 01:11:57 +08:00
Sebastien Bourdeauducq
139072d402
Graceful experiment termination. Closes #76
2015-10-06 13:50:00 +08:00
Sebastien Bourdeauducq
b3584bc190
language,master,run: support raw access to DDB from experiments. Closes #123
2015-10-04 18:29:39 +08:00
Sebastien Bourdeauducq
f552d62b69
use Python 3.5 coroutines
2015-10-03 19:28:57 +08:00
Sebastien Bourdeauducq
125503139e
remove workaround for Python bug in asyncio process.wait(). Requires Python 3.5. Closes #58
2015-10-03 14:33:18 +08:00
Sebastien Bourdeauducq
7ed8fe57fa
Git support
2015-08-07 15:51:56 +08:00
Sebastien Bourdeauducq
8402f1cdcd
master,gui: basic log support
2015-07-22 05:13:50 +08:00
Sebastien Bourdeauducq
9ed4dcd7d1
repository: load experiments in worker, list arguments
2015-07-15 10:54:44 +02:00
Sebastien Bourdeauducq
7770ab64f2
worker: factor timeouts
2015-07-14 23:43:08 +02:00
Sebastien Bourdeauducq
96a5d73c81
worker: split build stage from prepare
2015-07-09 13:18:12 +02:00
Sebastien Bourdeauducq
c71fe29792
simplify unit system and use floats by default
2015-06-26 16:34:37 +02:00
Sebastien Bourdeauducq
a6a476593e
worker: wait for process termination
...
This prevents stray SIGCHLDs from crashing the program e.g. if the asyncio event loop is closed before the process actually terminates.
2015-06-05 00:37:26 +08:00
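The idea in a6a476593e (wait for the process to terminate before the event loop goes away) can be sketched roughly as follows. This is an assumption-laden illustration using only the standard `asyncio` subprocess API. The function names `terminate_and_reap` and `demo` and the 5 second timeout are hypothetical and are not taken from the worker code.

```python
import asyncio
import sys


async def terminate_and_reap(process, timeout=5.0):
    """Kill a subprocess and wait until it has actually exited.

    Hypothetical helper: waiting on the process before the event loop is
    closed ensures the child has been reaped while the loop is still alive.
    """
    try:
        process.kill()
    except ProcessLookupError:
        # The process already exited; there is nothing left to kill.
        pass
    return await asyncio.wait_for(process.wait(), timeout)


async def demo():
    process = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import time; time.sleep(60)")
    return await terminate_and_reap(process)


if __name__ == "__main__":
    # asyncio.run() closes the loop only after demo() has finished,
    # i.e. after the child has been waited for.
    print("return code:", asyncio.run(demo()))
```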
Sebastien Bourdeauducq
c843c353d7
worker: remove useless process wait
2015-06-05 00:05:38 +08:00
Yann Sionneau
60bdf74137
tests: use try/finally to close event loop + wait for process to die after killing it
2015-06-04 13:40:13 +02:00
Sebastien Bourdeauducq
78f9268277
worker: add note about correct use of close()
2015-06-04 11:30:34 +08:00
Sebastien Bourdeauducq
fc449509b8
scheduler: pass priority to experiments
2015-05-24 20:37:47 +08:00
Sebastien Bourdeauducq
b74b8d5826
Scheduling TNG
2015-05-17 16:11:00 +08:00
Sebastien Bourdeauducq
43a05c783d
worker: split write_results action
2015-03-11 19:06:46 +01:00
Sebastien Bourdeauducq
d5795fd619
master: watchdog support
...
Introduces a watchdog context manager to use in the experiment code that
terminates the process with an error if it times out. The syntax is:
    with self.scheduler.watchdog(20*s):
        ...
Watchdog timers are implemented by the master process (and the worker
communicates the necessary information about them) so that they can be
enforced even if the worker crashes. They can be nested arbitrarily.
During yields, all watchdog timers for the yielding worker are
suspended [TODO]. Setting up watchdogs is not supported in kernels;
however, a kernel can be called within watchdog contexts (and terminating
the worker will terminate the kernel [TODO]).
It is possible to implement a heartbeat mechanism using a watchdog, e.g.:
    for i in range(...):
        with self.scheduler.watchdog(...):
            ....
Crashes/freezes within the iterator or the loop management would not be
detected, but they should be rare enough.
2015-03-11 16:43:14 +01:00
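The watchdog semantics described in d5795fd619 (deadline registered on entry, removed on exit, arbitrary nesting) can be modelled with a small toy. This is a single-process sketch under stated assumptions, not the master/worker implementation: the class `WatchdogRegistry`, its methods, and the injectable `clock` are all hypothetical, and nothing here terminates a process. It only tracks which watchdogs would have expired.

```python
import contextlib
import itertools
import time


class WatchdogRegistry:
    """Toy stand-in for master-side watchdog bookkeeping (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so tests can fake time
        self._deadlines = {}         # watchdog id -> absolute deadline
        self._ids = itertools.count()

    def create(self, timeout):
        wid = next(self._ids)
        self._deadlines[wid] = self._clock() + timeout
        return wid

    def delete(self, wid):
        del self._deadlines[wid]

    def expired(self):
        """Return the ids of all watchdogs whose deadline has passed."""
        now = self._clock()
        return sorted(w for w, d in self._deadlines.items() if d <= now)

    @contextlib.contextmanager
    def watchdog(self, timeout):
        # Register on entry and always unregister on exit, so nested
        # watchdogs are tracked independently of each other.
        wid = self.create(timeout)
        try:
            yield wid
        finally:
            self.delete(wid)


if __name__ == "__main__":
    registry = WatchdogRegistry()
    # Heartbeat pattern: each iteration gets a fresh deadline.
    for i in range(3):
        with registry.watchdog(1.0):
            time.sleep(0.01)
            assert registry.expired() == []
```

In the scheme the commit describes, the master would act on expired watchdogs by terminating the worker, which is why the bookkeeping lives outside the process being watched.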
Sebastien Bourdeauducq
f2134fa4b2
master,worker: split prepare/run/analyze
2015-03-09 23:34:09 +01:00
Sebastien Bourdeauducq
4c280d5fcc
master: use a new worker process for each experiment
2015-03-09 16:22:41 +01:00
Sebastien Bourdeauducq
ec1d082730
remove timeout from run_params (to be replaced by a better mechanism)
2015-03-09 10:51:32 +01:00
Sebastien Bourdeauducq
cc172699ea
master: use RID + unit class name for HDF5 filenames
2015-02-20 14:11:55 -07:00
Sebastien Bourdeauducq
4d21b78314
master,client,gui: factor timeout into run_params
2015-02-19 20:03:55 -07:00
Sebastien Bourdeauducq
c69c4d5ce9
master: expose scheduler API to experiments
2015-02-19 12:09:11 -07:00
Sebastien Bourdeauducq
3e22fe86b5
reorganize files as per discussion with Robert
2015-01-17 19:38:20 +08:00
Sebastien Bourdeauducq
070788a680
separate master modules
2015-01-14 12:16:49 +08:00