From a2f6d2ed55db3f249f47d5983b540275f8593765 Mon Sep 17 00:00:00 2001 From: architeuthis Date: Fri, 14 Jun 2024 17:38:25 +0800 Subject: [PATCH] doc: 'Management system' manual page copy+clarity edit --- doc/manual/getting_started_mgmt.rst | 13 ++- doc/manual/management_system.rst | 166 +++++++++++++--------------- 2 files changed, 83 insertions(+), 96 deletions(-) diff --git a/doc/manual/getting_started_mgmt.rst b/doc/manual/getting_started_mgmt.rst index 71bf8f38f..272f60eae 100644 --- a/doc/manual/getting_started_mgmt.rst +++ b/doc/manual/getting_started_mgmt.rst @@ -58,6 +58,9 @@ Now, start the dashboard with the following commands in another terminal: :: The dashboard should display the list of experiments from the repository folder in a dock called "Explorer". There should be only the experiment we created. Select it and click "Submit", then look at the "Log" dock for the output from this simple experiment. +.. note:: + You may note that experiments may be submitted with a due date, a priority level, a pipeline identifier, and other specific settings. Some of these are self-explanatory. Many are scheduling-related. For more information on experiment scheduling, especially when submitting longer experiments or submitting across multiple users, see :ref:`experiment-scheduling`. + Adding an argument ------------------ @@ -100,6 +103,8 @@ Trigger a repository rescan and click the button labeled "Recompute all argument While regular arguments are all requested simultaneously before submitting, interactive arguments can be requested at any point. In order to request multiple interactive arguments at once, place them within the same ``with`` block; see also the example ``interactive.py`` in the ``examples/no_hardware`` folder. +.. _master-setting-up-git: + Setting up Git integration -------------------------- @@ -160,9 +165,6 @@ Let's now make a modification to the experiment. In the source present in the wo .. 
note:: You may also use the "Open file outside repository" feature of the GUI, by right-clicking on the explorer. -.. note:: - Submitting an experiment from the repository using the ``artiq_client`` command-line tool is done using the ``-R`` flag. - Verify the log in the GUI. If you are happy with the result, commit the new version and push it into the master's repository: :: $ cd ~/artiq-work @@ -197,6 +199,9 @@ Commit, push and submit the experiment as before. Go to the "Datasets" dock of t Plotting in the ARTIQ dashboard is achieved by programs called "applets". Applets are independent programs that add simple GUI features and are run as separate processes (to achieve goals of modularity and resilience against poorly written applets). Users may write their own applets, or use those supplied with ARTIQ (in the ``artiq.applets`` module) that cover basic plotting. +.. seealso:: + For writing applets, see also the references provided on :ref:`the management system page` of this manual. + Applets are configured through their command line to select parameters such as the names of the datasets to plot. The list of command-line options can be retrieved using the ``-h`` option; for an example you can run ``python3 -m artiq.applets.plot_xy -h`` in a terminal. In our case, create a new applet from the XY template by right-clicking in the empty applet list, and edit the "Command" field so that it retrieves the ``parabola`` dataset (the line should be ``${artiq_applet}plot_xy parabola``). Run the experiment again, and observe how the points are added one by one to the plot. @@ -224,7 +229,7 @@ To start the controller manager (the master must already be running, and artiq-c $ artiq_ctlmgr -Controllers may be run on a different machine from the master, or even on multiple different machines, alleviating cabling issues and OS compatibility problems. In this case, communication with the master happens over the network. One controller manager is necessary per network node (i.e. 
machine) that runs controllers. Use the ``-s`` and ``--bind`` flags of ``artiq_ctlmgr`` to set IP addresses or hostnames to connect and bind to. +Controllers may be run on a different machine from the master, or even on multiple different machines, alleviating cabling issues and OS compatibility problems. In this case, communication with the master happens over the network. If multiple machines are running controllers, they must each run their own controller manager. Use the ``-s`` and ``--bind`` flags of ``artiq_ctlmgr`` to set IP addresses or hostnames to connect and bind to. Note, however, that the controller for the particular device you are trying to connect to must first exist and be part of a complete Network Device Support Package, or NDSP. :doc:`Some NDSPs are already available `. If your device is not on this list, the system is designed to make it quite possible to write your own. For this, see the :doc:`developing_a_ndsp` page. diff --git a/doc/manual/management_system.rst b/doc/manual/management_system.rst index 440b27d9d..9e2ab603c 100644 --- a/doc/manual/management_system.rst +++ b/doc/manual/management_system.rst @@ -1,140 +1,122 @@ Management system ================= -The management system described below is optional: experiments can be run one by one using :mod:`~artiq.frontend.artiq_run`, and the controllers can run stand-alone (without a controller manager). For their very first steps with ARTIQ or in simple or particular cases, users do not need to deploy the management system. +.. note:: + The ARTIQ management system as described here is optional. Experiments can be run one-by-one using :mod:`~artiq.frontend.artiq_run`, and controllers can be run without a controller manager. For their very first steps with ARTIQ or in simple or particular cases, users do not need to deploy the management system. For an introduction to the system and how to use it, see :doc:`getting_started_mgmt`. 
Components -********** +---------- Master ------- +^^^^^^ -The :ref:`master ` is responsible for managing the parameter and device databases, the experiment repository, scheduling and running experiments, archiving results, and distributing real-time results. +The :ref:`ARTIQ master ` is responsible for managing the parameter and device databases, the experiment repository, scheduling and running experiments, archiving results, and distributing real-time results. It is a headless component, and one or several clients (command-line or GUI) use the network to interact with it. It also uses the network to interact with any relevant controller managers. -The master is a headless component, and one or several clients (command-line or GUI) use the network to interact with it. - -Controller manager ------------------- - -Controller managers (started using the ``artiq_ctlmgr`` command that is part of the ``artiq-comtools`` package) are responsible for running and stopping controllers on a machine. There is one controller manager per network node that runs controllers. - -A controller manager connects to the master and uses the device database to determine what controllers need to be run. Changes in the device database are tracked by the manager and controllers are started and stopped accordingly. - -Controller managers use the local network address of the connection to the master to filter the device database and run only those controllers that are allocated to the current node. Hostname resolution is supported. - -.. warning:: With some network setups, the current machine's hostname without the domain name resolves to a localhost address (127.0.0.1 or even 127.0.1.1). If you wish to use controllers across a network, make sure that the hostname you provide resolves to an IP address visible on the network (e.g. try providing the full hostname including the domain name). 
+It should not be confused with the 'master' device in a DRTIO system, which is simply a designation for the particular core device acting as central node in a distributed configuration of ARTIQ. The two concepts are otherwise unrelated. Command-line client -------------------- +^^^^^^^^^^^^^^^^^^^ The :ref:`command-line client ` connects to the master and permits modification and monitoring of the databases, monitoring the experiment schedule and log, and submitting experiments. Dashboard ---------- +^^^^^^^^^ -The :ref:`dashboard ` connects to the master and is the main way of interacting with it. The main features of the dashboard are scheduling of experiments, setting of their arguments, examining the schedule, displaying real-time results, and debugging TTL and DDS channels in real time. +The :ref:`dashboard ` connects to the master and is the main method of interacting with it. The main features of the dashboard are scheduling of experiments, setting of their arguments, examining the schedule, displaying real-time results, and debugging TTL and DDS channels in real time. + +Controller manager +^^^^^^^^^^^^^^^^^^ + +The controller manager is provided in the ``artiq-comtools`` package and started with the ``artiq_ctlmgr`` command. It is responsible for running and stopping controllers on a machine. One controller manager must be run by each network node that runs controllers. + +A controller manager connects to the master and accesses the device database through it to determine what controllers need to be run. The local network address of the connection is used to filter for only those controllers allocated to the current node. Hostname resolution is supported. Changes to the device database are tracked and controllers will be stopped and started accordingly. + +.. warning:: + With some network setups, the current machine's hostname without the domain name resolves to a localhost address (127.0.0.1 or even 127.0.1.1). 
If you wish to use controllers across a network, make sure that the hostname you provide resolves to an IP address visible on the network (e.g. try providing the full hostname including the domain name). + +Git integration +--------------- + +The master may use a Git repository to store experiment source code. Using Git has many advantages. For example, each result file (HDF5) contains the commit ID corresponding to the exact source code it was produced by, which helps reproducibility. + +Although the master also supports non-bare repositories, it is recommended to use a bare repository (e.g. ``git init --bare``) to easily support push transactions from clients. + +You will want Git to notify the master every time the repository is pushed to (i.e. updated), so that the master knows to rescan the repository for new or changed experiments. This is most easily done with the ``post-receive`` hook, as described in :ref:`master-setting-up-git`. + +.. note:: + If you plan to run the ARTIQ system entirely on a single machine, you may also consider using a non-bare repository and the ``post-commit`` hook to trigger repository scans every time you commit changes (locally). In this case, note that the ARTIQ master never uses the repository's working directory, but only what is committed. More precisely, when scanning the repository, it fetches the last (atomically) completed commit at the time of the repository scan and checks it out in a temporary folder. This commit ID is used by default when subsequently submitting experiments. There is one temporary folder per commit ID currently referenced in the system, so concurrently running experiments from different repository revisions is fully supported by the master. + +The dashboard always runs experiments from the repository. By default the command-line client (``artiq_client submit``) runs experiments from the raw filesystem (which is useful for iterating rapidly without creating many disorganized commits). 
If you want to use the repository instead, simply pass the ``-R`` option. + +.. _experiment-scheduling: Experiment scheduling -********************* +--------------------- Basics ------- +^^^^^^ -To use hardware resources more efficiently, potentially compute-intensive pre-computation and analysis phases of other experiments are executed in parallel with the body of the current experiment that accesses the hardware. +To make more efficient use of hardware resources, experiments are generally split into three phases and pipelined, such that potentially compute-intensive pre-computation or analysis phases may be executed in parallel with the bodies of other experiments, which access hardware. -.. seealso:: These steps are implemented in :class:`~artiq.language.environment.Experiment`. However, user-written experiments should usually derive from (sub-class) :class:`artiq.language.environment.EnvExperiment`. +.. seealso:: + These steps are implemented in :class:`~artiq.language.environment.Experiment`. However, user-written experiments should usually derive from (sub-class) :class:`artiq.language.environment.EnvExperiment`. -Experiments are divided into three phases that are programmed by the user: +There are three stages of a standard experiment users may write code in: -1. The **preparation** stage, that pre-fetches and pre-computes any data that necessary to run the experiment. Users may implement this stage by overloading the :meth:`~artiq.language.environment.Experiment.prepare` method. It is not permitted to access hardware in this stage, as doing so may conflict with other experiments using the same devices. -2. The **running** stage, that corresponds to the body of the experiment, and typically accesses hardware. Users must implement this stage and overload the :meth:`~artiq.language.environment.Experiment.run` method. +1. The **preparation** stage, which pre-fetches and pre-computes any data that is necessary to run the experiment. 
Users may implement this stage by overloading the :meth:`~artiq.language.environment.Experiment.prepare` method. It is not permitted to access hardware in this stage, as doing so may conflict with other experiments using the same devices. +2. The **run** stage, which corresponds to the body of the experiment and generally accesses hardware. Users must implement this stage and overload the :meth:`~artiq.language.environment.Experiment.run` method. 3. The **analysis** stage, where raw results collected in the running stage are post-processed and may lead to updates of the parameter database. This stage may be implemented by overloading the :meth:`~artiq.language.environment.Experiment.analyze` method. -.. note:: Only the :meth:`~artiq.language.environment.Experiment.run` method implementation is mandatory; if the experiment does not fit into the pipelined scheduling model, it can leave one or both of the other methods empty (which is the default). +Only the :meth:`~artiq.language.environment.Experiment.run` method implementation is mandatory; if the experiment does not fit into the pipelined scheduling model, it can leave one or both of the other methods empty (which is the default). -The three phases of several experiments are then executed in a pipelined manner by the scheduler in the ARTIQ master: experiment A executes its preparation stage, then experiment A executes its running stage while experiment B executes its preparation stage, and so on. +Consecutive experiments are then executed in a pipelined manner by the ARTIQ master's scheduler: first experiment A runs its preparation stage, then experiment A executes its running stage while experiment B executes its preparation stage, and so on. .. note:: - The next experiment (B) may start :meth:`~artiq.language.environment.Experiment.run`\ ing before all events placed into (core device) RTIO buffers by the previous experiment (A) have been executed. 
These events can then execute while experiment B is :meth:`~artiq.language.environment.Experiment.run`\ ing. Using :meth:`~artiq.coredevice.core.Core.reset` clears the RTIO buffers, discarding pending events, including those left over from A. + The next experiment (B) may start its :meth:`~artiq.language.environment.Experiment.run` before all events placed into (core device) RTIO buffers by the previous experiment (A) have been executed. These events may then execute while experiment B's :meth:`~artiq.language.environment.Experiment.run` is already in progress. Using :meth:`~artiq.coredevice.core.Core.reset` in experiment B will clear the RTIO buffers, discarding pending events, including those left over from A. Interactions between events of different experiments can be avoided by preventing the :meth:`~artiq.language.environment.Experiment.run` method of experiment A from returning until all events have been executed. This is discussed in the section on RTIO :ref:`rtio-handover-synchronization`. Priorities and timed runs -------------------------- +^^^^^^^^^^^^^^^^^^^^^^^^^ -When determining what experiment to begin executing next (i.e. entering the preparation stage), the scheduling looks at the following factors, by decreasing order of precedence: +When determining what experiment should begin executing next (i.e. enter the preparation stage), the scheduling looks at the following factors, by decreasing order of precedence: -1. Experiments may be scheduled with a due date. If there is one and it is not reached yet, the experiment is not eligible for preparation. +1. Experiments may be scheduled with a due date. This is considered the *earliest possible* time of their execution (rather than a deadline, or latest possible -- ARTIQ makes no guarantees about experiments being started or completed before any specified time). If a due date is set and it has not yet been reached, the experiment is not eligible for preparation. 2. 
The integer priority value specified by the user. -3. The due date itself. The earlier the due date, the earlier the experiment is scheduled. -4. The run identifier (RID), an integer that is incremented at each experiment submission. This ensures that, all other things being equal, experiments are scheduled in the same order as they are submitted. +3. The due date itself. The experiment with the earliest (reached) due date is scheduled first. +4. The run identifier (RID), an integer that is incremented at each experiment submission. This ensures that, all else being equal, experiments are scheduled in the same order as they are submitted. + +Multiple pipelines +^^^^^^^^^^^^^^^^^^ + +Experiments must be placed into a pipeline at submission time, set by the "Pipeline" field. Multiple pipelines can operate in parallel within the same master. Pipelines are identified by their names, and are automatically created (when an experiment is scheduled with a pipeline name that does not yet exist) and destroyed (when they run empty). By default, all experiments are submitted into the same pipeline, ``main``. + +When using multiple pipelines it is the responsibility of the user to ensure that experiments scheduled in parallel will never conflict with those of another pipeline over resources (e.g. attempt to use the same devices simultaneously). Pauses ------- +^^^^^^ In the run stage, an experiment may yield to the scheduler by calling the ``pause()`` method of the scheduler. -If there are other experiments with higher priority (e.g. 
a high-priority experiment has been newly submitted, or reached its due date and become eligible for execution), the higher-priority experiments are executed first, and then ``pause()`` returns. If there are no such experiments, ``pause()`` returns immediately. To check whether ``pause()`` would in fact *not* return immediately, use :meth:`artiq.master.scheduler.Scheduler.check_pause`. The experiment must place the hardware in a safe state and disconnect from the core device (typically, by calling ``self.core.comm.close()`` from the kernel, which is equivalent to :meth:`artiq.coredevice.core.Core.close`) before calling ``pause()``. Accessing the ``pause()`` and :meth:`~artiq.master.scheduler.Scheduler.check_pause` methods is done through a virtual device called ``scheduler`` that is accessible to all experiments. The scheduler virtual device is requested like regular devices using :meth:`~artiq.language.environment.HasEnvironment.get_device` (``self.get_device()``) or :meth:`~artiq.language.environment.HasEnvironment.setattr_device` (``self.setattr_device()``). -:meth:`~artiq.master.scheduler.Scheduler.check_pause` can be called (via RPC) from a kernel, but ``pause()`` must not. - -Multiple pipelines ------------------- - -Multiple pipelines can operate in parallel inside the same master. It is the responsibility of the user to ensure that experiments scheduled in one pipeline will never conflict with those of another pipeline over resources (e.g. same devices). - -Pipelines are identified by their name, and are automatically created (when an experiment is scheduled with a pipeline name that does not exist) and destroyed (when they run empty). - - -Git integration -*************** - -The master may use a Git repository for the storage of experiment source code. Using Git has many advantages. For example, each result file (HDF5) contains the commit ID corresponding to the exact source code that produced it, which helps reproducibility. 
- -Even though the master also supports non-bare repositories, it is recommended to use a bare repository so that it can easily support push transactions from clients. Create it with e.g.: :: - - $ mkdir experiments - $ cd experiments - $ git init --bare - -You want Git to notify the master every time the repository is pushed to (updated), so that it is rescanned for experiments and e.g. the GUI controls and the experiment list are updated. - -Create a file named ``post-receive`` in the ``hooks`` folder (this folder has been created by the ``git`` command), containing the following: :: - - #!/bin/sh - artiq_client scan-repository - -Then set the execution permission on it: :: - - $ chmod 755 hooks/post-receive - -You may now run the master with the Git support enabled: :: - - $ artiq_master -g -r /path_to/experiments - -Push commits containing experiments to the bare repository using e.g. Git over SSH, and the new experiments should automatically appear in the dashboard. - -.. note:: If you plan to run the ARTIQ system entirely on a single machine, you may also consider using a non-bare repository and the ``post-commit`` hook to trigger repository scans every time you commit changes (locally). The ARTIQ master never uses the repository's working directory, but only what is committed. More precisely, when scanning the repository, it fetches the last (atomically) completed commit at that time of repository scan and checks it out in a temporary folder. This commit ID is used by default when subsequently submitting experiments. There is one temporary folder by commit ID currently referenced in the system, so concurrently running experiments from different repository revisions is fully supported by the master. - -The dashboard always runs experiments from the repository. The command-line client, by default, runs experiment from the raw filesystem (which is useful for iterating rapidly without creating many disorganized commits). 
If you want to use the repository instead, simply pass the ``-R`` option. +:meth:`~artiq.master.scheduler.Scheduler.check_pause` can be called (via RPC) from a kernel, but ``pause()`` must not be. Scheduler API reference -*********************** +----------------------- The scheduler is exposed to the experiments via a virtual device called ``scheduler``. It can be requested like any regular device, and then the methods below can be called on the returned object. -The scheduler virtual device also contains the attributes ``rid``, ``pipeline_name``, ``priority`` and ``expid`` that contain the corresponding information about the current run. +The scheduler virtual device also contains the attributes ``rid``, ``pipeline_name``, ``priority`` and ``expid``, which contain the corresponding information about the current run. .. autoclass:: artiq.master.scheduler.Scheduler :members: Client control broadcasts (CCBs) -******************************** +-------------------------------- Client control broadcasts are requests made by experiments for clients to perform some action. Experiments do so by requesting the ``ccb`` virtual device and calling its ``issue`` method. The first argument of the issue method is the name of the broadcast, and any further positional and keyword arguments are passed to the broadcast. @@ -143,8 +125,10 @@ CCBs are used by experiments to configure applets in the dashboard, for example .. autoclass:: artiq.dashboard.applets_ccb.AppletsCCBDock :members: +.. _applet-references: + Applet request interfaces -************************* +------------------------- Applet request interfaces allow applets to perform actions on the master database and set arguments in the dashboard. Applets may inherit from the ``artiq.applets.simple.SimpleApplet`` and call the methods defined below through the `req` attribute. 
@@ -155,7 +139,7 @@ Embedded applets should use `AppletRequestIPC` while standalone applets use `App Applet entry area -***************** +----------------- Argument widgets can be used in applets through the `EntryArea` class. @@ -215,23 +199,21 @@ The `EntryArea` object can then be added to a layout and integrated with the app Front-end tool reference -************************ - +------------------------ .. _frontend-artiq-master: artiq_master ------------- +^^^^^^^^^^^^ .. argparse:: :ref: artiq.frontend.artiq_master.get_argparser :prog: artiq_master - - + .. _frontend-artiq-client: artiq_client ------------- +^^^^^^^^^^^^ .. argparse:: :ref: artiq.frontend.artiq_client.get_argparser @@ -241,7 +223,7 @@ artiq_client .. _frontend-artiq-dashboard: artiq_dashboard ---------------- +^^^^^^^^^^^^^^^ .. argparse:: :ref: artiq.frontend.artiq_dashboard.get_argparser @@ -249,7 +231,7 @@ artiq_dashboard artiq_session -------------- +^^^^^^^^^^^^^ .. argparse:: :ref: artiq.frontend.artiq_session.get_argparser