TMA Restart MTMount Operation Manager (OM)#

Overview#

This document outlines the procedure to restart the MTMount Operation Manager (OM). Restarting the OM may be necessary for troubleshooting or maintenance purposes. The OM is responsible for managing various operations related to the telescope mount.

Error diagnosis#

The following symptoms appear when trying to change the MTMount CSC from STANDBY to DISABLED state:

  • MTMount goes to FAULT.

  • MTMount cannot retrieve telemetry from the CCW.

  • The CSC raises the following error message:

MTMount Error Message when transitioning from STANDBY to DISABLED#
RuntimeError: Telemetry client timed out waiting for telemetry; giving up.

Procedure Steps#

The procedure involves closing the EUI, assessing the status of the OM, stopping it, restarting it, and then reopening the EUI. It is not necessary to turn off OSS or PS.

  1. Close the EUI Application: Close the EUI application (HOME -> Close Application). Closing can take a while; wait until the application has fully exited before proceeding to the next step.

../../../../_images/EUI-Home.png

Screenshot of Simonyi EUI Home interface.#

If the EUI has become unresponsive, run top and monitor the %CPU usage of the tma_eui process. If the CPU is actively processing, be patient and continue to wait. If it takes too long (e.g. 10 min or more), kill the process. To do this, find the process ID (PID) and pass it to kill in the TMA computer’s terminal:

ps aux | grep 'tma_eui'
kill <PID>   # replace <PID> with the process ID reported by ps

  2. SSH to the TMA Server: Connect to the TMA server computer using SSH. Retrieve the credentials from the MainTel 1Password TMA Server vault.

[lsst@lsst ~]$ ssh lsst@tma-controller01.cp.lsst.org

  3. Check MTMount Operation Manager Status: Use the following command to check the status of the MTMount Operation Manager:

[lsst@lsst ~]$ sudo systemctl status mtmount-operation-manager.service

Enter the sudo password when prompted (the same password entered for the SSH login). The output should resemble this:

mtmount-operation-manager.service - LLST MTMount Operation Manager service
  Loaded: loaded (/usr/lib/systemd/system/mtmount-operation-manager.service; enabled; vendor preset: disabled)
  Active: active (running) since Fri 2024-03-15 08:30:17 UTC; 5 days ago
Main PID: 1036 (mtmount-operati)
   Tasks: 20
  Memory: 65.8M
  CGroup: /system.slice/mtmount-operation-manager.service
          └─1036 /usr/bin/mtmount-operation-manager

Mar 15 08:30:17 lsst systemd[1]: Started LLST MTMount Operation Manager service.
Hint: Some lines were ellipsized, use -l to show in full.

Note that in this case, the status is “active (running) since Fri 2024-03-15 08:30:17 UTC; 5 days ago”.

  4. Stop MTMount Operation Manager: To stop the service, replace status with stop in the previous command:

[lsst@lsst ~]$ sudo systemctl stop mtmount-operation-manager.service

  5. Start MTMount Operation Manager: Wait for one to two minutes before starting the operation manager again:

[lsst@lsst ~]$ sudo systemctl start mtmount-operation-manager.service

Wait up to 5 minutes, then rerun the status command above and check that the OM status reads as follows before proceeding to the next step:

Active: active (running) since …

  6. Open the EUI Application: Open the EUI application to resume operations. If you don’t know how to open it, refer to the “Opening TMA EUI” section in this document.
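
For operators who prefer to run the stop, pause, start, and wait steps above as one sequence, the following is a minimal sketch and not an official tool. It assumes it is run on the TMA server by a user with sudo rights. The function names restart_om and wait_for_active and the SYSTEMCTL_CMD override are hypothetical names introduced here for illustration. The default pause (120 s) and timeout (300 s) mirror the "one to two minutes" and "up to 5 minutes" waits given in the steps.

```shell
#!/usr/bin/env bash
# Sketch only: stop the OM, pause, start it, then poll until it is active.
# Assumption: run on the TMA server by a user with sudo rights.

SERVICE="mtmount-operation-manager.service"

# Hypothetical override hook (not part of the procedure): lets the sequence
# be dry-run with a stand-in command. Defaults to the real command via sudo.
SYSTEMCTL_CMD="${SYSTEMCTL_CMD:-sudo systemctl}"

# Poll "systemctl is-active" until the unit is active or the timeout expires.
# Arguments: timeout in seconds, polling interval in seconds.
wait_for_active() {
    local timeout="$1" interval="$2" elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if $SYSTEMCTL_CMD is-active --quiet "$SERVICE"; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}

# Stop, pause, start, then wait for the unit to become active.
# Arguments (all optional): pause, timeout, polling interval, in seconds.
restart_om() {
    local pause="${1:-120}" timeout="${2:-300}" interval="${3:-10}"
    $SYSTEMCTL_CMD stop "$SERVICE" || return 1
    sleep "$pause"
    $SYSTEMCTL_CMD start "$SERVICE" || return 1
    wait_for_active "$timeout" "$interval"
}
```

After sourcing the file, calling restart_om with no arguments uses the defaults, and a non-zero return status means the service did not report active within the timeout. The block only defines functions, so nothing is stopped or started until restart_om is called explicitly.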

Post-Condition#

Upon completion of the procedure outlined above, the following post-conditions are expected:

  1. The MTMount Operation Manager service is running again.

  2. The TMA EUI application is operational and MTMount can be enabled without errors.

Meeting these post-conditions confirms that the OM restart succeeded and that telescope operations can continue.
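
The first post-condition can be checked from the TMA server terminal without reading the full status output. The sketch below is one possible way to do it, assuming the same service name as in the procedure. The function name om_summary and the SYSTEMCTL_CMD override are hypothetical. It relies on the standard systemctl is-active command and the ActiveEnterTimestamp unit property, which records when the unit last became active.

```shell
#!/usr/bin/env bash
# Sketch only: print the OM state and the time it last became active.

SERVICE="mtmount-operation-manager.service"

# Hypothetical override hook for dry runs; defaults to the real command.
SYSTEMCTL_CMD="${SYSTEMCTL_CMD:-systemctl}"

om_summary() {
    local state since
    # is-active exits non-zero when the unit is not active, so ignore the
    # exit status here and compare the printed state instead.
    state="$($SYSTEMCTL_CMD is-active "$SERVICE" || true)"
    # "show -p" prints "ActiveEnterTimestamp=<time>"; strip the key below.
    since="$($SYSTEMCTL_CMD show -p ActiveEnterTimestamp "$SERVICE")"
    echo "state=${state} since=${since#ActiveEnterTimestamp=}"
    [ "$state" = "active" ]
}
```

A timestamp that matches the moment the service was started indicates that the OM was restarted and not left running from before. The second post-condition still has to be confirmed by opening the EUI and enabling MTMount.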

Contingency#

If the procedure was not successful, report the issue in #summit-simonyi and/or activate the Out of hours support.