Log your observations on the cause and add comments to a Jira ticket (either existing or one you create). Include any unique activities occurring when the fault happened (cf. OBS-416), as MTM2 is not expected to fault.
This step must not be skipped.
Restarting the control system acts as a catch-all for resetting issues. Failing to do so may also cause issues with telemetry.
Connect as the admin user to the M2 cRIO controller via ssh, using the username and password found in the 1Password MainTel vault.
Note
There are two cRIO controllers at the summit:
m2-crio-controller01.cp.lsst.org
m2-crio-controller02.cp.lsst.org
Depending on the location of M2, run the command:
If M2 is at the TMA:
ssh admin@m2-crio-controller01.cp.lsst.org
If M2 is on level 3:
ssh admin@m2-crio-controller02.cp.lsst.org
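The host selection above can be sketched as a small shell helper. This is only an illustration: the function name m2_crio_host and the keywords tma and level3 are assumptions made for this example, and only the two hostnames come from the note above.

```shell
#!/bin/sh
# Hypothetical helper: map the M2 location to its cRIO controller hostname.
# The hostnames are the two listed in the note above; the function name and
# the "tma" / "level3" keywords are assumptions made for this example.
m2_crio_host() {
    case "$1" in
        tma)    echo "m2-crio-controller01.cp.lsst.org" ;;
        level3) echo "m2-crio-controller02.cp.lsst.org" ;;
        *)      echo "usage: m2_crio_host tma|level3" >&2; return 1 ;;
    esac
}
```

With this helper defined, ssh "admin@$(m2_crio_host tma)" would connect to the controller used when M2 is at the TMA.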
Stop the control system and wait 3 minutes using the command:
/etc/init.d/nilvrt stop
Start the control system and wait 3 minutes using the command:
/etc/init.d/nilvrt start
You may press Enter to regain your shell prompt once you see the message “Welcome to LabVIEW Real-Time 18.0”.
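The stop, wait, start, wait sequence can be combined into one function, sketched below for use on the cRIO controller. The function name and the NILVRT and WAIT_SECONDS variables are assumptions made for this example. The 180-second default reflects the 3-minute waits in the steps above.

```shell
#!/bin/sh
# Sketch: restart the M2 control system with the waits described above.
# NILVRT and WAIT_SECONDS are hypothetical variables introduced here so the
# script path and the wait time can be overridden.
NILVRT="${NILVRT:-/etc/init.d/nilvrt}"
WAIT_SECONDS="${WAIT_SECONDS:-180}"   # 3 minutes, as in the procedure

restart_control_system() {
    "$NILVRT" stop                    # stop the control system
    sleep "$WAIT_SECONDS"             # wait before starting again
    "$NILVRT" start                   # start the control system
    sleep "$WAIT_SECONDS"             # wait for startup to complete
}
```

Calling restart_control_system performs the whole sequence. Running the two commands by hand, as described above, remains the documented procedure.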
Note that all status boxes for the M2 actuator will appear green. This indicates the status of the relay that enables power to the systems, not the status of M2 itself. Therefore, after an interlock or power cycling, it is necessary to press the RESET button.
Use the Python EUI/GUI to change MTM2 to the closed-loop state:
Open the MTM2 EUI. Follow instructions to access the MTM2 EUI.
Establish local control by pressing connect, then local.
Note that local may be greyed out after connecting; this is normal.
If Reset All Items does not work, you may have to power cycle the M2 cabinet.
Only do this if there are no other options!
Switch to Diagnostic mode. Be patient; this may take some time.
Switch to Enabled mode. This may take up to 2 minutes. If this step fails, you may have to repeat the Reset the M2 interlock signal instructions.
Enter closed-loop control.
Return to Standby mode in the EUI to close the GUI by pressing the following buttons:
Enter open-loop control.
Diagnostic mode, this usually takes ~30s.
Standby mode, this usually takes ~30s.
Remote mode, to allow CSC control of M2.
Disconnect EUI on the top tool bar, this usually takes ~30s.
Exit on the top tool bar.
Change the status of MTM2 CSC from DISABLED to ENABLED.
If the attempt fails, try again, but first set it to STANDBY. Each transition is expected to take approximately 2 minutes.
Check that M2 is under closed-loop control (4) in the Chronograf M2 state.
If needed, set closed-loop control by running the script standardscripts/maintel/m2/enable_closed_loop.py, without configuration. This can be done even if you are already under closed-loop control.
If you are unable to find the fault, check the cRIO controller log, which contains detailed fault reports.
These logs are found in the /u/log/ directory.
Use the command ls -lrt to list logs, with the most recently modified logs displayed at the bottom. Logs are named according to their creation date and time.
Grab error messages from the log with a command like grep -nr "error" name_of_log_here
ls -lrt  # list items in the directory, in long format, sorted by time, in reverse order (newest at the bottom)
grep -nr "error" {logname}  # list lines from file {logname} containing "error"
cat {logname}  # print the log file to the terminal; these are sometimes short and, in the event of a fault, the interesting lines are at the bottom
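The log-inspection commands above can be combined into a small sketch that finds the most recently modified log and prints its error lines. The function names and the LOG_DIR variable are assumptions made for this example. Unlike the grep command above, it adds -i to match any capitalization of "error". It also assumes the log directory contains only log files.

```shell
#!/bin/sh
# Sketch: show "error" lines from the most recently modified log.
# LOG_DIR defaults to the /u/log directory named above; the variable and
# the function names are hypothetical.
LOG_DIR="${LOG_DIR:-/u/log}"

newest_log() {
    # ls -t sorts newest first, so the first entry is the latest log.
    ls -t "$LOG_DIR" | head -n 1
}

errors_in_newest_log() {
    # -n prints line numbers; -i also matches "Error" and "ERROR".
    grep -n -i "error" "$LOG_DIR/$(newest_log)"
}
```

Running errors_in_newest_log on the controller prints each matching line prefixed by its line number, which can then be copied into the Jira ticket.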