M2 Recovery#
Overview#
Follow this procedure when MTM2 goes to FAULT and exits the closed-loop state.
This step-by-step guide combines the “Recovery system at night” and “Restart control system” procedures.
Information
The possible MTM2 states (control modes) are Idle, TelemetryOnly (Disabled), Open Loop, and Closed Loop. Each state is indicated by an integer from 1 to 4.
Once you are signed in, you can check the MTM2 state in the Chronograf dashboards (M2 state, MTM2 dashboard) or in the LOVE view.
A GIS interlock prevents TMA movement if MTM2 is not in the closed-loop state.
Important
MTM2 should always be in the closed-loop state (state 4) for the safety of the glass.
Note
Any fault in MTM2 is worth ticketing; this is not a system that should reach the FAULT state easily.
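The control modes listed above can be captured in a small lookup for use in notebooks or log triage. This is a minimal sketch: only the value 4 for closed loop is stated in this procedure, and the values 1 to 3 are assumed to follow the order in which the modes are listed. The names `M2ControlMode` and `is_closed_loop` are hypothetical and are not part of any MTM2 software.

```python
from enum import IntEnum


class M2ControlMode(IntEnum):
    """MTM2 control modes as integers (hypothetical helper).

    Only CLOSED_LOOP = 4 is stated explicitly in the procedure.
    Values 1-3 are ASSUMED to follow the listing order.
    """

    IDLE = 1            # assumed value
    TELEMETRY_ONLY = 2  # assumed value; shown as "Disabled" in some views
    OPEN_LOOP = 3       # assumed value
    CLOSED_LOOP = 4     # stated in the procedure


def is_closed_loop(mode_value: int) -> bool:
    """Return True only when the reported mode is closed loop (4).

    Raises ValueError for integers outside 1-4.
    """
    return M2ControlMode(mode_value) is M2ControlMode.CLOSED_LOOP


if __name__ == "__main__":
    for value in (1, 2, 3, 4):
        print(value, M2ControlMode(value).name, is_closed_loop(value))
```

Confirm the mapping against the M2 state dashboard before relying on the assumed values.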
Error diagnosis#
The MTM2 CSC goes to FAULT and exits closed-loop control (check M2 state).
Follow these instructions to investigate the state of MTM2 and identify the cause in the Chronograf MTM2 dashboard.
Check the following items:
- Power Current:
This will indicate when MTM2 exited the closed-loop state (as shown in the image below, the magenta line sharply drops at approximately 19:38).
- TMA Elevation Position and Elevation Angle measured by M2:
Verify that both report the same position after MTM2 exits the closed-loop state (note the time displacement in the plots).
- Tangent Fault:
It may indicate excessive forces were the cause of the fault.
- Log Message:
It may display useful information about the cause.
Note
Log your observations on the cause and add comments to a Jira ticket (either an existing one or one you create). Include any unique activities occurring when the fault happened (cf. OBS-416), as MTM2 is not expected to fault.
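When the Power Current series has been exported from the dashboard, the time of the sharp drop described in the checklist can be located programmatically. The sketch below is illustrative only: the function name `find_sharp_drop`, the threshold, and the sample readings are all made up for the example and are not real telemetry.

```python
from typing import Optional, Sequence, Tuple


def find_sharp_drop(
    samples: Sequence[Tuple[str, float]], min_drop: float
) -> Optional[str]:
    """Return the timestamp of the first sample whose value fell by at
    least `min_drop` relative to the previous sample, or None."""
    for (_, previous), (timestamp, current) in zip(samples, samples[1:]):
        if previous - current >= min_drop:
            return timestamp
    return None


if __name__ == "__main__":
    # Made-up readings for illustration; not real MTM2 telemetry.
    readings = [("19:36", 5.0), ("19:37", 5.1), ("19:38", 0.2), ("19:39", 0.2)]
    print(find_sharp_drop(readings, min_drop=2.0))
```

A suitable `min_drop` depends on the normal noise level of the signal, so choose it by inspecting the plot first.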
Procedure Steps#
- Restart control system.
Warning
This step must not be skipped. Restarting the control system acts as a catch-all for resetting issues. Failing to do so may also cause issues with telemetry.
- Connect as the admin user to the M2 cRIO controller via ssh, using the username and password found in the 1Password MainTel vault.
Note
There are two cRIO controllers at the summit:
m2-crio-controller01.cp.lsst.org
m2-crio-controller02.cp.lsst.org
Depending on the location of M2, run the command:
If M2 is at the TMA:
ssh admin@m2-crio-controller01.cp.lsst.org
If M2 is on level 3:
ssh admin@m2-crio-controller02.cp.lsst.org
- Stop the control system using the command below, then wait 3 minutes:
/etc/init.d/nilvrt stop
- Start the control system using the command below, then wait 3 minutes:
/etc/init.d/nilvrt start
When you see the message “Welcome to LabVIEW Real-Time 18.0”, you may press Enter to regain your shell prompt.
- Reset the M2 interlock signal in the GIS main cabinet on level 2, even if the state is “OK”.
Important
Note that all status boxes for the M2 actuator will appear green. This indicates the status of the relay that enables power to the systems, not the status of M2 itself. Therefore, after an interlock or power cycling, it is necessary to press the RESET button.
- Use the Python EUI/GUI to put MTM2 in the closed-loop state:
Open the MTM2 EUI, following the instructions to access the MTM2 EUI.
- Establish local control by pressing connect, then local.
Note that local may be greyed out after connecting; this is normal.
- Pull up the Overview widget by double-clicking on Overview in the list at the bottom of the EUI.
- Check the Enabled Faults Mask.
It should not be 0. If it is, repeat the “Reset the M2 interlock signal” step.
Note
It is OK if the isInterlockEngaged indicator is red.
- Look at the Alarms/Warnings widget to see active alarms (red) or warnings (yellow).
If any are active, reset them with Reset All Items.
Make sure you have removed the fault condition.
If Reset All Items does not work, you may have to power cycle the M2 cabinet. Only do this if there are no other options!
Switch to Diagnostic mode. Be patient; this may take some time.
Switch to Enabled mode. This may take up to 2 minutes. If this step fails, you may have to repeat the “Reset the M2 interlock signal” instructions.
Enter closed-loop control.
- Return to Standby mode in the EUI to close the GUI by pressing the following buttons:
Enter open-loop control.
Diagnostic mode, this usually takes ~30s.
Standby mode, this usually takes ~30s.
Remote mode, to allow CSC control of M2.
Disconnect EUI on the top tool bar, this usually takes ~30s.
Exit on the top tool bar.
- Change the status of the MTM2 CSC from DISABLED to ENABLED. If the attempt fails, try again, but first set it to STANDBY. Each transition is expected to take approximately 2 minutes.
- Check that M2 is under closed-loop control (state 4) in the Chronograf M2 state dashboard.
If needed, set closed-loop control by running the script standardscripts/maintel/m2/enable_closed_loop.py without configuration. This can be done even if you are already under closed-loop control.
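For the “Restart control system” step, the choice of controller and the order of commands can be written down as a small plan generator. This is a sketch for reference, not an automation tool: it only builds and prints the command list and does not connect to anything. The names `CRIO_HOSTS`, `WAIT_SECONDS`, and `restart_plan` are hypothetical, the location keys `"tma"` and `"level3"` are labels chosen for this example, and `sleep 180` stands in for the manual 3-minute wait.

```python
from typing import Dict, List

# Hostnames as given in the procedure; the dictionary keys are
# illustrative labels chosen for this sketch.
CRIO_HOSTS: Dict[str, str] = {
    "tma": "m2-crio-controller01.cp.lsst.org",
    "level3": "m2-crio-controller02.cp.lsst.org",
}

WAIT_SECONDS = 180  # the procedure asks for a 3-minute wait after each command


def restart_plan(location: str) -> List[str]:
    """Return the ordered commands for restarting the control system.

    `location` must be "tma" or "level3"; a KeyError is raised otherwise.
    """
    host = CRIO_HOSTS[location]
    return [
        f"ssh admin@{host}",
        "/etc/init.d/nilvrt stop",
        f"sleep {WAIT_SECONDS}",  # stand-in for the manual wait
        "/etc/init.d/nilvrt start",
        f"sleep {WAIT_SECONDS}",  # stand-in for the manual wait
    ]


if __name__ == "__main__":
    for command in restart_plan("tma"):
        print(command)
```

Printing the plan before connecting makes it harder to pick the wrong controller when M2 has been moved between the TMA and level 3.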
Post-Condition#
MTM2 is in the ENABLED state.
MTM2 is in the closed-loop state (4).
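The retry rule for the CSC transition and the two post-conditions can be sketched as plain logic. This is a minimal illustration under stated assumptions: `set_state` is a hypothetical callable supplied by the caller that requests a summary state and returns True on success. It does not represent the real CSC interface, and the function names are made up for this example.

```python
from typing import Callable


def enable_with_retry(set_state: Callable[[str], bool]) -> bool:
    """Try to reach ENABLED; on failure go to STANDBY first, then retry once.

    `set_state` is a hypothetical callable, not the real CSC API.
    """
    if set_state("ENABLED"):
        return True
    # Per the procedure: if the attempt fails, set STANDBY before retrying.
    set_state("STANDBY")
    return set_state("ENABLED")


def post_conditions_met(summary_state: str, control_mode: int) -> bool:
    """True when MTM2 is ENABLED and in the closed-loop state (4)."""
    return summary_state == "ENABLED" and control_mode == 4


if __name__ == "__main__":
    attempts = []

    def fake_set_state(state: str) -> bool:
        # Simulated CSC: the first ENABLED request fails, later ones succeed.
        attempts.append(state)
        return not (state == "ENABLED" and attempts.count("ENABLED") == 1)

    print(enable_with_retry(fake_set_state), attempts)
    print(post_conditions_met("ENABLED", 4))
```

Each real transition is expected to take about 2 minutes, so a real implementation would need a matching timeout.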
Contingency#
If you are unable to find the fault, check the cRIO controller logs, which contain detailed fault reports. These logs are found in the /u/log/ directory.
Use the command ls -lrt to list logs, with the most recently modified logs displayed at the bottom. Logs are named according to their creation date and time.
Grab error messages from the log with a command like grep -nr "error" name_of_log_here
Get information from log
ls -lrt # list items in the directory, in long format, sorted by time, in reverse order (newest at the bottom)
grep -nr "error" {logname} # list lines, with line numbers, from file {logname} containing "error"
cat {logname} # print the log file to the terminal; logs are sometimes short and, in the event of a fault, the interesting lines are at the bottom
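If the logs have been copied off the controller, the same triage can be done in Python. The sketch below mirrors the commands above: it picks the most recently modified file in a directory and returns the numbered lines containing a search string. The match is case-sensitive, like `grep "error"`. The helper names `newest_log` and `error_lines` are hypothetical, and the directory you pass in is your own local copy.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def newest_log(log_dir: Path) -> Optional[Path]:
    """Return the most recently modified regular file in `log_dir`, or None."""
    files = [path for path in log_dir.iterdir() if path.is_file()]
    return max(files, key=lambda path: path.stat().st_mtime, default=None)


def error_lines(log_path: Path, needle: str = "error") -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line containing `needle`.

    Case-sensitive, like `grep -n "error" file`.
    """
    hits: List[Tuple[int, str]] = []
    with log_path.open(errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line:
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python triage.py /path/to/local/copy/of/logs
    if len(sys.argv) > 1:
        latest = newest_log(Path(sys.argv[1]))
        if latest is not None:
            for number, text in error_lines(latest):
                print(f"{latest.name}:{number}: {text}")
```

Faults may also be logged with different capitalization or wording, so an empty result does not prove the log is clean.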
If the procedure was not successful, report the issue in #summit-simonyi and/or activate the Out of hours support.