Observing Environment Package Management¶
Introduction¶
This page explains how package management is controlled in the Observing Environment, specifically at the summit. The Observing Environment is mainly composed of the ScriptQueue and users’ Nublado instances, which are the primary high-level tools responsible for driving observatory operations.
Even though the Observing Environment is initially built and deployed alongside the rest of the Observatory Control System components (e.g. CSCs), it needs to follow a slightly different management approach to support on-the-fly updates. These updates will primarily consist of bug fixes and/or workarounds for unforeseen circumstances. Nominally, the CSC versions and supporting software packages deployed on the summit are managed by the cycle build. Nevertheless, the procedure to deploy hot-fixes for CSCs (e.g. create an alpha tag on the package, update the cycle build and redeploy) is not suitable for the Observing Environment, which requires a much larger set of packages and therefore longer build times.
Patches to the Observing Environment need to be rolled out rapidly to the summit and must be simultaneously available to the ScriptQueue CSC as well as the observers’ notebook environments. Most importantly, when testing new software on the summit, which may not be entirely stable, it is critical to have a mechanism to immediately roll back all packages to a designated stable version. We call this suite of stable versions the base observing environment.
The obs_user account¶
The obs_user account is a special account which does not have a login.
Any command issued as that user must be run via sudo, and that user must have the proper privileges to execute the command.
obs_user is the owner of the summit environment packages that are used by observers and the ScriptQueue.
The number of actual commands performed by obs_user is very limited and depends on the end user.
A few sanctioned individuals will maintain the observing environment(s). They will have extensive sudo privileges so that they can manage the packages accordingly, although they should only need to perform a few commands.
Observers will also have access to the obs_user account, but only to execute a script that will roll back to the base environment.
All commands using the obs_user account must be executed as follows (the command itself is only an example):
sudo -u obs_user git checkout main
This method is used because all sudo commands are logged, which gives a history of who ran each command. On CentOS 7 machines, this history is stored in /var/log/secure.
Because obs_user does not have a login capability, all commands executed under the obs_user identity are logged.
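As a rough illustration of how that history could be inspected, the sketch below extracts who ran which command as obs_user from sudo log entries. It assumes the common sudo syslog layout (`invoker : TTY=... ; PWD=... ; USER=... ; COMMAND=...`); the sample lines, host name, and user names are invented for the example, and the function name `obs_user_commands` is hypothetical.

```python
import re

# Assumed layout of a sudo entry in the auth log. Real entries may differ
# slightly between sudo versions and syslog configurations.
SUDO_LINE = re.compile(
    r"sudo:\s+(?P<invoker>\S+)\s+:.*?USER=(?P<target>\S+)\s+;\s+COMMAND=(?P<command>.+)$"
)


def obs_user_commands(lines):
    """Return (invoker, command) pairs for commands run as obs_user."""
    found = []
    for line in lines:
        match = SUDO_LINE.search(line.rstrip("\n"))
        if match and match.group("target") == "obs_user":
            found.append((match.group("invoker"), match.group("command")))
    return found


# Invented sample entries, not taken from a real machine.
sample = [
    "Oct 25 21:14:03 summit-host sudo:   alice : TTY=pts/0 ; PWD=/home/alice ; "
    "USER=obs_user ; COMMAND=/usr/bin/git checkout main",
    "Oct 25 21:15:10 summit-host sudo:   bob : TTY=pts/1 ; PWD=/home/bob ; "
    "USER=root ; COMMAND=/usr/bin/ls",
]

entries = obs_user_commands(sample)
for invoker, command in entries:
    print(f"{invoker} ran as obs_user: {command}")
```

On a real machine the lines would come from /var/log/secure, which normally requires elevated privileges to read.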
The command that observers use to roll back to the base environment is:
sudo -u obs_user FIXME
Observers may also need to overwrite their current Nublado instance and restore it to use only the base environment. Note that this is a destructive action that takes time to revert.
sudo -u obs_user FIXME
This command will move your current ~/notebooks/.user_setups file to a new file with a timestamp (e.g. ~/notebooks/.user_setups.<date>.bkp), then create a new file which points to the base environment.
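The backup-then-recreate behavior described above can be sketched as follows. This is only an illustration and not the actual script: the function name `reset_user_setups`, the timestamp format, and the contents written to the new file are all assumptions, and the demonstration runs in a temporary directory rather than a real ~/notebooks.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def reset_user_setups(setups_file, base_env_contents, now=None):
    """Move setups_file to a timestamped .bkp file, then recreate it.

    base_env_contents is a placeholder for whatever the real script
    writes to point the instance at the base environment.
    """
    setups_file = Path(setups_file)
    stamp = (now or datetime.now()).strftime("%Y%m%dT%H%M%S")  # assumed format
    backup = setups_file.with_name(f"{setups_file.name}.{stamp}.bkp")
    if setups_file.exists():
        setups_file.rename(backup)  # keep the previous contents
    setups_file.write_text(base_env_contents)
    return backup


# Demonstration in a throwaway directory with invented file contents.
with tempfile.TemporaryDirectory() as tmp:
    setups = Path(tmp) / ".user_setups"
    setups.write_text("previous user setup line (invented)\n")
    backup = reset_user_setups(
        setups,
        "# hypothetical: base environment only\n",
        now=datetime(2022, 10, 25, 21, 0, 0),
    )
    backup_name = backup.name
    backed_up_text = backup.read_text()
    new_text = setups.read_text()

print(backup_name)  # .user_setups.20221025T210000.bkp
```

Because the old file is renamed rather than deleted, reverting means copying the backup back into place by hand.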
Managing the base environment¶
The base environment is defined by a list of packages and associated tags, or commit hashes, representing the packages which are deemed to be stable (to the best of everyone’s collective knowledge).
Note that the base environment needs to be maintained daily, and can only be updated by the production environment maintainers.
The list itself is stored in /opt/obs_user/base_env_repo/base_environment.yaml (TBR).
The default for each package should be a tagged version that corresponds to the current cycle build.
However, in certain cases it may contain a specific commit or tag of a certain package that employs a bug fix that was identified the previous night but not yet incorporated into the main branch and pulled back into the cycle build.
Another possibility is that the main branch will have diverged from what is currently deployed and the changes will be brought into the base environment when a cycle upgrade is completed.
The base environment is a GitHub repository managed by sanctioned individuals and led by the software architect. It is their responsibility to ensure that the summit environment stays stable and that appropriate hot fixes make it into each package’s repository. They are also responsible for keeping the run environment (described below) identical to the base environment to the maximum extent possible. The base environment repository also contains the scripts used to modify the environment that are executed by observers and/or maintainers.
Modifying the run environment¶
For all packages that are subject to potentially rapid changes during nighttime operations, a run environment is used to manage which commits are currently in use by the observing environment. In standard operations, where the situation is stable, the run environment is identical to the base environment. However, during commissioning and early operations, we are expecting the run environment to be more dynamic.
Like the base environment, the run environment is defined by a list of packages and associated tags, or commit hashes, stored in /opt/obs_user/run_environment.yaml.
However, unlike the base environment configuration, this file only contains the packages that are to be overridden from the base environment, analogous to how CSC configurations are managed.
This file is not required to be managed by a GitHub repository, as it can be edited by a user, and therefore must be writeable by project personnel.
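The override relationship between the two files can be sketched as a simple mapping merge. The package names, tags, and commit hash below are invented, the function name `resolve_environment` is hypothetical, and plain dictionaries stand in for the parsed YAML files, whose actual schema is not defined on this page.

```python
def resolve_environment(base, overrides):
    """Apply run-environment overrides on top of the base environment.

    Returns a mapping of package name to (ref, source), where source is
    "base" or "override" so the origin of each ref stays visible.
    """
    resolved = {name: (ref, "base") for name, ref in base.items()}
    for name, ref in overrides.items():
        resolved[name] = (ref, "override")
    return resolved


# Invented stand-ins for base_environment.yaml and run_environment.yaml.
base = {"package_a": "v1.0.0", "package_b": "v2.3.0", "package_c": "v0.9.1"}
overrides = {"package_b": "a1b2c3d"}  # e.g. a hot-fix commit under test

resolved = resolve_environment(base, overrides)
for name, (ref, source) in sorted(resolved.items()):
    print(f"{name}: {ref} ({source})")
```

With an empty override mapping the result equals the base environment, which matches the stable-operations case where the run environment and the base environment are identical.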
Note
In the future, we expect to write scripts to edit this file. Doing this ensures traceability regarding how the file was edited and by whom.
The script that sets up the run environment is executed using a sudo command such as:
sudo -u obs_user bash setup_run_environment
Not only does this script set up the environment, but it also writes a (read-only) log file to /var/logs/obs_user/run_env_<datetime>.log listing each package and tag used in the setup. This includes both overrides and the base environment packages.
Optionally, the script can set up the environment using a previously written log file:
sudo -u obs_user bash setup_run_environment /var/logs/obs_user/run_env_<datetime>.log
Again, because all commands are run via sudo using the obs_user id, the ability to trace who changed the environment is preserved.
Also, because the environment is NFS-mounted, the ScriptQueue gets the changes immediately, and observers need only restart their notebook kernels.
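To illustrate why a log that lists every package and tag is enough to reproduce an earlier run environment, the sketch below writes such a listing and reads it back. The one-line-per-package format (`name ref source`), the function names, and the package data are assumptions made for the example; the real log format is not specified here.

```python
def format_run_log(resolved):
    """Render one 'name ref source' line per package (assumed format)."""
    return "".join(
        f"{name} {ref} {source}\n"
        for name, (ref, source) in sorted(resolved.items())
    )


def parse_run_log(text):
    """Recover the package -> ref mapping from a previously written log."""
    environment = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, ref, _source = line.split()
        environment[name] = ref
    return environment


# Invented environment: two base packages and one override.
resolved_env = {
    "package_a": ("v1.0.0", "base"),
    "package_b": ("a1b2c3d", "override"),
    "package_c": ("v0.9.1", "base"),
}

log_text = format_run_log(resolved_env)
restored = parse_run_log(log_text)
print(log_text, end="")
```

Because the log records base packages as well as overrides, it can be replayed even after the base environment itself has moved on.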
Using the run environment¶
During the day, it is expected that developers and other personnel will modify the run environment to perform tests.
It is quite possible that people will share the environment, especially if the ScriptQueue is required. Users who are running notebooks should change their environment from within their own Nublado instance.
At the beginning of the night, observers should run the script that sets up the base environment. In the special case where a previous run environment needs to be loaded, this should be communicated to the observers by the run manager.
On-sky testing then rolling back a CSC¶
Note
This section is here temporarily. It is more of a use case for how we should handle CSCs that need to be tested and then rolled back than a description of package management.
In the event that a new CSC is rolled out for on-sky testing, but is not considered to be stable, this is to be performed by … manually deploying a detached head inside the container? Then the container just has to be sent to offline and re-synced to pull the sanctioned version?
Prerequisites¶
You must have sudo privileges to run the appropriate scripts.
Post-Condition¶
ScriptQueue and the Nublado instances will have access to the updated packages. However, Nublado users must restart their kernel to grab the changes.
The ScriptQueue instantiates the script from disk each time it is launched, and therefore nothing needs to be performed to grab the new changes.
Updating the “base” environment¶
If the changes should be included in the base environment, there are two options:
Update the cycle build and create a new tag. Then change the base-environment definition file.
Procedure Steps¶
Troubleshooting¶
No troubleshooting information is applicable to this procedure.
Contact Personnel¶
This procedure was last modified Oct 25, 2022.
This procedure was written by Patrick Ingraham. The following are contributors: Tiago Ribeiro.