* Revert "Merge pull request #1544 from airwoodix/dataset-compression"
This reverts commit 311a818a49, reversing
changes made to 7ffe4dc2e3.
* fix accidental revert of f42bea06a8
This breaks the internal dataset representation used by applets
and when saving to disk (``dataset_db.pyon``).
See ``test/test_dataset_db.py`` and ``test/test_datasets.py``
for examples.
Signed-off-by: Etienne Wodey <wodey@iqo.uni-hannover.de>
Resolves error message shown.
The following error message is shown when worker_impl.py:199 is run:
```
WARNING:worker(RID,EXPERIMENT):py.warnings:/nix/store/77sw4p03cb7rdayx86agi4yqxh5wq46b-python3.7-artiq-5.7141.1b68906/lib/python3.7/site-packages/artiq/master/worker_impl.py:199: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
logging.warn(message)
```
See test case – previously, the highest-priority pending run would
be used to calculate the timeout, rather than the earliest one.
This probably managed to go undetected for that long as any unrelated
changes to the pipeline (e.g. new submissions, or experiments pausing)
would also cause _get_run() to be re-evaluated.
Previously, a significant risk of losing experimental results would
be associated with long-running experiments, as any stray exceptions
while run()ing the experiment – for instance, due to infrequent
network glitches or hardware reliability issue – would cause no
HDF5 file to be written. This was especially troublesome as long
experiments would suffer from a higher probability of unanticipated
failures, while at the same time being more costly to re-take in
terms of wall-clock time.
Unanticipated uncaught exceptions like that were enough of an issue
that several Oxford codebases had come up with their own half-baked
mitigation strategies, from swallowing all exceptions in run() by
convention, to always broadcasting all results to uniquely named
datasets such that the partial results could be recovered and written
to HDF5 by manually run recovery experiments.
This commit addresses the problem at its source, changing the worker
behaviour such that an HDF5 file is always written as soon as run()
starts.
From looking at the code, it wasn't obvious to me that this is
supposed to handle multiple calls to delete(). This is the case,
however, when for instance Scheduler.delete()ing a run, which
will then also be deleted again from AnalyzeStage.
This helps debugging the cause of TypeErrors arising from types
not handled by the HDF5 serializer, as the backtrace doesn't
otherwise include any useful information.
The docstrings are quite minimal still, but should already
help with navigating the different layers when getting
accustomed with the code base.
RIDCounter was moved to its own module, as it isn't really
related to the other classes (used from master only).
In larger experiments, it is quite natural for the same dataset
to be read from multiple unrelated components. The only situation
where multiple reads from an archived dataset are problematic is
when the valeu actually changes between reads. Hence, this commit
restricts the warning to the latter situation.