1
0
forked from M-Labs/artiq

doc: Elaboration on Git workflows

This commit is contained in:
architeuthidae 2024-10-22 19:57:25 +02:00 committed by Sébastien Bourdeauducq
parent 02235b2d80
commit 4bf2331d6a
2 changed files with 55 additions and 17 deletions

View File

@ -171,7 +171,6 @@ Now, in the same dock as 'Explorer', navigate to the tab 'Interactive Args'. You
In order to request and supply multiple interactive arguments at once, simply place them in the same ``with`` block; see also the example ``interactive.py`` in ``examples/no_hardware``. In order to request and supply multiple interactive arguments at once, simply place them in the same ``with`` block; see also the example ``interactive.py`` in ``examples/no_hardware``.
.. _master-setting-up-git: .. _master-setting-up-git:
Setting up Git integration Setting up Git integration
@ -180,7 +179,7 @@ Setting up Git integration
So far, we have used the bare filesystem for the experiment repository, without any version control. Using Git to host the experiment repository helps with tracking modifications to experiments and with the traceability to a particular version of an experiment. So far, we have used the bare filesystem for the experiment repository, without any version control. Using Git to host the experiment repository helps with tracking modifications to experiments and with the traceability to a particular version of an experiment.
.. note:: .. note::
The workflow we will describe in this tutorial corresponds to a situation where the computer running the ARTIQ master is also used as a Git server to which multiple users may contribute code. The Git setup can be customized according to your needs; the main point to remember is that when scanning or submitting, the ARTIQ master uses the internal Git data (*not* any working directory that may be present) to fetch the latest *fully completed commit* at the repository's head. See the :ref:`Management system <mgmt-git-integration>` page for notes on alternate workflows. The workflow we will describe in this tutorial is designed for a situation where the computer running the ARTIQ master is also used as a Git server to which multiple users may contribute code. This is not the only way Git integration can be useful, and the setup can be customized according to your needs. The main point to remember is that when scanning or submitting, the ARTIQ master uses the internal Git data, *not* any checked-out files that may be present, to fetch the latest *fully completed commit* at the repository's head. See also :ref:`mgmt-git-integration`, especially if you are unfamiliar with Git.
We will use our current ``repository`` folder as the working directory for making local modifications to the experiments, move it away from the master's data directory, and replace it with a new ``repository`` folder, which will hold only the Git data used by the master. Stop the master with Ctrl+C and enter the following commands: :: We will use our current ``repository`` folder as the working directory for making local modifications to the experiments, move it away from the master's data directory, and replace it with a new ``repository`` folder, which will hold only the Git data used by the master. Stop the master with Ctrl+C and enter the following commands: ::
@ -207,7 +206,7 @@ and finally, connect the two repositories and push the commit upstream to the ma
$ git push -u origin master $ git push -u origin master
.. tip:: .. tip::
If you are not familiar with command-line Git and would like to understand these commands in more detail, search for some tutorials in basic use of Git; there are many available online. If you are not familiar with command-line Git and would like to understand these commands in more detail, look for tutorials on the basic use of Git; there are many available online.
Start the master again with the ``-g`` flag, which tells it to treat its ``repository`` folder as a bare Git repository: :: Start the master again with the ``-g`` flag, which tells it to treat its ``repository`` folder as a bare Git repository: ::
@ -233,10 +232,10 @@ Then set its execution permissions: ::
Let's now make a modification to the experiments. In the working directory ``artiq-work``, open ``mgmt_tutorial.py`` again and add an exclamation mark to the end of "Hello World". Before committing it, check that the experiment can still be executed correctly by submitting it directly from the working directory, using the command-line client: :: Let's now make a modification to the experiments. In the working directory ``artiq-work``, open ``mgmt_tutorial.py`` again and add an exclamation mark to the end of "Hello World". Before committing it, check that the experiment can still be executed correctly by submitting it directly from the working directory, using the command-line client: ::
$ artiq_client submit ~/artiq-work/mgmt_tutorial.py $ artiq_client submit --content ~/artiq-work/mgmt_tutorial.py
.. note:: .. note::
Alternatively, right-click in the Explorer dock and select the 'Open file outside repository' option for the same effect. The ``--content`` flag submits by content, that is, by sending a raw file rather than selecting an experiment from the master's local environment. Since you are likely running the client and the master on the same machine, it is probably not strictly necessary here. In a distributed setup across multiple machines, the master will not have access to the client's filesystem, and the ``--content`` flag is the only way to run experiments directly from a client file. See also :ref:`submission-details`.
Verify the log in the GUI. If you are happy with the result, commit the new version and push it into the master's repository: :: Verify the log in the GUI. If you are happy with the result, commit the new version and push it into the master's repository: ::
@ -257,4 +256,4 @@ Arguments to the individual tools (including ``-s`` and ``--bind``) can still be
$ artiq_session -m=-g $ artiq_session -m=-g
to start the session with the master in Git mode. See also :mod:`~artiq.frontend.artiq_session`. to start the session with Git integration. See also :mod:`~artiq.frontend.artiq_session`.

View File

@ -7,19 +7,23 @@ Management system
Components Components
---------- ----------
See also :doc:`overview` for a visual idea of the management system.
Master Master
^^^^^^ ^^^^^^
The :ref:`ARTIQ master <frontend-artiq-master>` is responsible for managing the dataset and device databases, the experiment repository, scheduling and running experiments, archiving results, and distributing real-time results. It is a headless component, and one or several clients (command-line or GUI) use the network to interact with it. The :ref:`ARTIQ master <frontend-artiq-master>` is responsible for managing the dataset and device databases, the experiment repository, scheduling and running experiments, archiving results, and distributing real-time results. It is a headless component, and one or several clients (command-line or GUI) use the network to interact with it.
The master expects to be given a directory on startup, the experiment repository, containing these experiments which are automatically tracked and communicated to clients. By default, it simply looks for a directory called ``repository``. The ``-r`` flag can be used to substitute an alternate location. The master expects to be given a directory on startup, the experiment repository, containing these experiments which are automatically tracked and communicated to clients. By default, it simply looks for a directory called ``repository``. The ``-r`` flag can be used to substitute an alternate location. Subdirectories in ``repository`` are also read, and experiments stored in them are known to the master. They will be displayed as folders in the dashboard's explorer.
It also expects access to a ``device_db.py``, with a corresponding flag ``--device-db`` to substitute a different file name. Additionally, it will reference or create certain files in the directory it is run in, among them ``dataset_db.mdb``, the LMDB database containing persistent datasets, ``last_rid.pyon``, which simply contains the last used RID, and the ``results`` directory. It also expects access to a ``device_db.py``, with a corresponding flag ``--device-db`` to substitute a different file name. Additionally, it will reference or create certain files in the directory it is run in, among them ``dataset_db.mdb``, the LMDB database containing persistent datasets, ``last_rid.pyon``, which simply holds the last used RID, and the ``results`` directory. For more on the device and dataset databases, see also :doc:`environment`.
.. note:: .. note::
Because the other parts of the management system all seem to be able to access the information stored in these files, confusion can sometimes result about where it is really stored and how it is distributed. Device databases, datasets, results, and experiments are all solely kept and administered by the master, which communicates information to dashboards, browsers, and clients over the network whenever necessary. Because the other parts of the management system often display knowledge of the information stored in these files, confusion can sometimes result about where it is really stored and how it is distributed. Device databases, datasets, results, and experiments are all solely kept and administered by the master, which communicates with dashboards, clients, and controller managers over the network whenever necessary.
Notably, clients and dashboards do not send in experiments to the master; they request them from the array of experiments the master knows about, primarily those in ``repository``, but also in the master's local file system, if 'Open file outside repository' is selected. This is true even if ``repository`` is configured as a Git repository and cloned on other machines. Notably, clients and dashboards normally do not *send in* experiments to the master. Rather, they make requests from the list of experiments the master already knows about, primarily those in ``repository``, but also in the master's local file system. This is true even if ``repository`` is configured as a Git repository and cloned onto other machines.
The only exception is the command line client's ``--content`` flag, which allows submission by content, i.e. sending in experiment files which may be otherwise unknown to the master. This feature however has some important limitations; see below in :ref:`submission-details`.
The ARTIQ master should not be confused with the 'master' device in a DRTIO system, which is only a designation for the particular core device acting as central node in a distributed configuration of ARTIQ. The two concepts are otherwise unrelated. The ARTIQ master should not be confused with the 'master' device in a DRTIO system, which is only a designation for the particular core device acting as central node in a distributed configuration of ARTIQ. The two concepts are otherwise unrelated.
@ -32,31 +36,66 @@ The :ref:`dashboard <frontend-artiq-dashboard>` connects to the master and is th
The dashboard remembers and restores GUI state (window/dock positions, last values entered by the user, etc.) in between instances. This information is stored in a file called ``artiq_dashboard_{server}_{port}.pyon`` in the configuration directory (e.g. generally ``~/.config/artiq`` for Unix, same as data directory for Windows), distinguished in subfolders by ARTIQ version. The dashboard remembers and restores GUI state (window/dock positions, last values entered by the user, etc.) in between instances. This information is stored in a file called ``artiq_dashboard_{server}_{port}.pyon`` in the configuration directory (e.g. generally ``~/.config/artiq`` for Unix, same as data directory for Windows), distinguished in subfolders by ARTIQ version.
.. note::
To find where the configuration files are stored on your machine, try the command: ::
python -c "from artiq.tools import get_user_config_dir; print(get_user_config_dir())"
Browser Browser
^^^^^^^ ^^^^^^^
The :ref:`browser <frontend-artiq-browser>` is used to read ARTIQ ``results`` HDF5 files and run experiment :meth:`~artiq.language.environment.Experiment.analyze` functions, in particular to retrieve previous result databases, process them, and display them in ARTIQ applets. The browser also remembers and restores its GUI state; this is stored in a file called simply ``artiq_browser``, kept in the same configuration directory as the dashboard. The :ref:`browser <frontend-artiq-browser>` is used to read ARTIQ ``results`` HDF5 files and run experiment :meth:`~artiq.language.environment.Experiment.analyze` functions, in particular to retrieve previous result databases, process them, and display them in ARTIQ applets. The browser also remembers and restores its GUI state; this is stored in a file called simply ``artiq_browser.pyon``, kept in the same configuration directory as the dashboard.
The browser *can* connect to the master, specifically in order to be able to access the master's store of datasets and to upload new datasets to it, but it does not require such a connection and can also be run completely standalone. However, it requires filesystem access to the ``results`` files to be of much use.
Controller manager Controller manager
^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^
The controller manager is provided in the ``artiq-comtools`` package (which is also made available separately from mainline ARTIQ, to allow independent use with minimal dependencies) and started with the :mod:`~artiq_comtools.artiq_ctlmgr` command. It is responsible for running and stopping controllers on a machine. One controller manager must be run by each network node that runs controllers. The controller manager is provided in the ``artiq-comtools`` package (which is also made available separately from mainline ARTIQ, to allow independent use with minimal dependencies) and started with the :mod:`~artiq_comtools.artiq_ctlmgr` command. It is responsible for running and stopping controllers on a machine. One controller manager must be run by each network node that runs controllers.
A controller manager connects to the master and accesses the device database through it to determine what controllers need to be run. The local network address of the connection is used to filter for only those controllers allocated to the current node. Hostname resolution is supported. Changes to the device database are tracked and controllers will be stopped and started accordingly. A controller manager connects to the master and accesses the device database through it to determine what controllers need to be run. The local network address of the connection is used to filter for only those controllers allocated to the current node. Hostname resolution is supported. Changes to the device database are tracked upon rescan and controllers will be stopped and started accordingly.
.. _mgmt-git-integration: .. _mgmt-git-integration:
Git integration Git integration
--------------- ---------------
The master may use a Git repository to store experiment source code. Using Git has many advantages. For example, each result file (HDF5) contains the commit ID corresponding to the exact source code it was produced by, which helps reproducibility. Although the master also supports non-bare repositories, it is recommended to use a bare repository (e.g. ``git init --bare``) to easily support push transactions from clients. The master may use a Git repository to store experiment source code. Using Git rather than the bare filesystem has many advantages. For example, each HDF5 result file contains the commit ID corresponding to the exact source code it was produced by, making results more reproduceable. See also :ref:`master-setting-up-git`. Generally, it is recommended to use a bare repository (i.e. ``git init --bare``), to easily support push transactions from clients, but both bare and non-bare repositories are supported.
You will want Git to notify the master every time the repository is pushed to (e.g. updated), so that the master knows to rescan the repository for new or changed experiments. This is easiest done with the ``post-receive`` hook, as described in :ref:`master-setting-up-git`. .. tip::
If you are not familiar with Git, you may find the idea of the master reading experiment files from a bare repository confusing. A bare repository does not normally contain copies of the objects it stores; that is to say, you won't be able to find your experiment files listed in it. What it *does* contain is Git's internal data structures, i.e., ``hooks``, ``objects``, ``config``, and so forth. Among other things, this structure also stores, in compressed form, the full contents of every commit made to the repository. It is this compressed data which the master has access to and can read the experiments from. It is not meant to be directly edited, but it is updated every time new commits are received.
.. note:: It may be useful to note that a normal Git repository, created with ``git init``, contains all the same internal data, kept in a hidden directory called ``.git`` to protect it from accidental modifications. Unlike a bare repository, it *also* normally contains working copies of all the files tracked by Git. When working with a non-bare repository, it is important to understand that the master still takes its image of the available experiments from the internal data, and *not* from the working copies. This is why, even in a non-bare repository, changes are only reflected once they are committed. The working copies are simply ignored.
If you plan to run the ARTIQ system entirely on a single machine, you may also consider using a non-bare repository and the ``post-commit`` hook to trigger repository scans every time you commit changes (locally). In this case, note that the ARTIQ master never uses the repository's working directory, but only what is committed. More precisely, when scanning the repository, it fetches the last (atomically) completed commit at that time of repository scan and checks it out in a temporary folder. This commit ID is used by default when subsequently submitting experiments. There is one temporary folder by commit ID currently referenced in the system, so concurrently running experiments from different repository revisions is fully supported by the master.
By default, the dashboard runs experiments from the repository, whereas the command-line client (``artiq_client submit``) runs experiments from the raw filesystem (which is useful for iterating rapidly without creating many disorganized commits). In order to run from the raw filesystem when using the dashboard, right-click in the Explorer window and select the option 'Open file outside repository'. In order to run from the repository when using the command-line client, simply pass the ``-R`` flag. Other important files -- the device database, the dataset database, the ``results`` directory, and so on -- are normally kept outside of the experiment repository, and in this case, they are not stored or handled by Git at all. The master accesses them through the regular filesystem, not through Git, and other ARTIQ components access them through the master. This can be seen visualized in the :doc:`overview`.
With a bare repository, a Git ``post-receive`` hook can be used to trigger a repository scan every time the repository is pushed to (i.e. updated), as described in the tutorial. This removes the need to trigger repository rescans manually. If you plan to run your ARTIQ system from a single PC, without distributed clients, you may also consider using a non-bare repository and the ``post-commit`` hook instead. In this workflow, changes can be drafted directly in the master's repository, but the master continues to submit from the last completed commit until a new commit is made (and the repository is rescanned).
Behind the scenes, when scanning the repository, the master fetches the last (atomically) completed commit at that time of repository scan and checks it out in a temporary folder. This commit ID is used by default when subsequently submitting experiments. There is one temporary folder by commit ID currently referenced in the system, so concurrently running experiments from different repository revisions is fully supported by the master.
The use of the Git backend is triggered when the master is started with the ``-g`` flag. Otherwise the raw filesystem is read and Git-based features will not be available.
.. _submission-details:
Submission from the raw filesystem
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By default, the dashboard runs experiments from the repository, that is, the master's temporary checkout folder, whereas the command-line client (``artiq_client submit``) runs experiments from the raw filesystem. This is convenient in order to be able to run working drafts without first committing them.
Be careful with this behavior, however, as it is rather particular. *The raw filesystem* means the immediate local filesystem of the running master. If the client is being run remotely, and you want to submit an experiment from the *client's* local filesystem, e.g. an uncommitted draft in a clone of the experiment repository, use the ``--content`` flag. If you would like to submit an experiment from the repository, in the same way the dashboard does, use the flag ``--repository`` / ``-R``.
To be precise:
- ``artiq_client submit`` should be given a file path that is relative to the location of the master, that is, if the master is run in the directory above its ``repository``, an experiment can be submitted as ``repository/experiment_file.py``. Keep in mind that when working with a bare repository, there may be no copies of experiment files in the raw local filesystem. In this case, files can still be made accessible to the master by network filesystem share or some other method for testing.
- ``artiq_client submit --repository`` should be given a file path relative to the root of the repository, that is, if the experiment is directly within ``repository``, it should be submitted as ``experiment_file.py``. Just as in the dashboard, this file is taken from the last completed commit.
- ``artiq_client submit --content`` should be given a file path that is relative to the location of the client, whether that is local or remote to the master; the contents of the file will be submitted directly to the master to be run. This essentially transfers a raw string, and will not work if the experiment imports or otherwise accesses other files.
Other flags can also be used, such as ``--class-name`` / ``-c`` to select a class name in an experiment which contains several, or ``--revision`` / ``-r`` to use a particular revision. See the reference of :mod:`~artiq.frontend.artiq_client` in :doc:`main_frontend_tools`.
In order to run from the raw filesystem when using the dashboard, right-click in the Explorer window and select the option 'Open file outside repository'. This will open a file explorer window displaying the master's local filesystem, which can be used to select and submit experiments outside of the chosen repository directory. There is no GUI support for submission by content. It is recommended to simply use the command-line client for this purpose.
.. _experiment-scheduling: .. _experiment-scheduling: