Fault Reporting

Reporting telescope and observatory faults, whether they are mechanical errors, software bugs, or facilities issues, is a crucial aspect of observatory operations. Understanding the observatory and its efficiency begins with robust fault reporting, documented recovery, and knowledge-sharing. This section describes how to file a fault report in the Observing Operations (OBS) JIRA project for any incident that happens during nighttime operations.

Guidelines For Productive Reporting

The most important goal of a fault report is that the team can clearly understand the problem. Some guidelines to keep in mind are:

  • Facts first. The author of the fault report should provide as many details as possible, including screenshots, telescope telemetry, and timestamps for future investigation.

  • If the reporter is unsure of whom to assign the ticket to, leave it unassigned and alert the daytime staff for further triage.

  • Ideas are welcome, but let the facts speak first.

  • Report a problem, not a person. Identifying the problem and reporting it effectively ensures that the Rubin team will move forward with a solution. Identifying a person as being “at-fault” for a problem reported in the night is not productive. Learn and grow, not blame and shame.

Filing Fault Reports

After navigating to the OBS project in JIRA, click the “create” option on the right-hand side of the top toolbar.

When creating a ticket, make sure to fill in the following fields:

../_images/Fault_report_example_page_1.png

Screenshot of an example fault report.

  • Project: The reporter should ensure that the OBS project is selected to include all things affecting nighttime operations.

  • Issue type: If unsure, select “problem.”
    • Problem: usually refers to a hardware issue.

    • Bug: typically refers to a software issue.

    • Improvement: refers to a suggested improvement to a procedure, software, or anything else.

    • Information: alerts the team to a new behavior. This does not immediately impact operations, but informs the team of a change that was noticed.

  • Summary: Describe the problem in one phrase. Be as clear and succinct as possible.

  • Urgent: IMPORTANT. This field is crucial for allocating time to solve a problem. If the fault obstructs nighttime observing or data collection, or endangers equipment, toggle this flag and alert the team as soon as possible.

  • Time lost (hr): Time lost is reported in decimal hours, to the nearest 0.1 hr. More details about calculating time lost due to a fault are in the Guidelines For Calculating Time Loss section.

  • Components: Select the correct component as accurately as possible, e.g. software, hardware: M2, etc. If the component does not exist, contact Alysha Shugart and they will add it to the list.

  • Description: Provide details and a timeline as accurately as possible to help people more efficiently search telemetry logs for diagnosis. Facts first.

../_images/Fault_report_example_page_2.png

Continuing fields of an example fault report.

  • Assignee: The reporter should leave the ticket unassigned unless they are absolutely sure of the correct person to follow up on the fault report. The team will review the fault reports after the night is over and determine the best person or group for follow-up.

  • Labels: This is not a required field, but it may provide more information about the components involved.

  • Attachment: Upload any screenshots, images, or files to support the facts reported or to help the problem-solving effort.
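The field list above can be sketched as a small pre-submission checklist. This is an illustration only, not an official Rubin tool: the `FaultReport` class, its method names, and the example values are hypothetical. The request body follows the general shape of the Jira REST API create-issue payload, and the Urgent and Time lost fields are left out of it because they are custom fields whose IDs are not given in this document.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Issue types named in the field list above.
ISSUE_TYPES = ("Problem", "Bug", "Improvement", "Information")


@dataclass
class FaultReport:
    """Hypothetical container for the fields of an OBS fault report."""

    summary: str
    description: str
    components: list[str]
    issue_type: str = "Problem"  # the suggested choice when unsure
    urgent: bool = False
    time_lost_hr: float = 0.0
    labels: list[str] = field(default_factory=list)  # optional field
    # Assignee is intentionally not modeled: leave the ticket unassigned.

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the report is complete."""
        problems = []
        if not self.summary.strip():
            problems.append("Summary is empty.")
        if not self.description.strip():
            problems.append("Description is empty.")
        if not self.components:
            problems.append("No component selected.")
        if self.issue_type not in ISSUE_TYPES:
            problems.append(f"Unknown issue type: {self.issue_type!r}.")
        if self.time_lost_hr < 0 or round(self.time_lost_hr, 1) != self.time_lost_hr:
            problems.append("Time lost must be a non-negative multiple of 0.1 hr.")
        return problems

    def to_jira_fields(self) -> dict:
        """Build the standard (non-custom) fields of a create-issue request body."""
        return {
            "fields": {
                "project": {"key": "OBS"},
                "issuetype": {"name": self.issue_type},
                "summary": self.summary,
                "description": self.description,
                "components": [{"name": name} for name in self.components],
                "labels": list(self.labels),
            }
        }


# Example values are made up for illustration.
report = FaultReport(
    summary="Example: M2 stopped responding during slew",
    description="Timeline with timestamps, telemetry, and screenshots. Facts first.",
    components=["hardware: M2"],
    urgent=True,
    time_lost_hr=0.7,
)
print(report.validate())
```

Running `validate()` before submitting catches an empty summary or a time-lost value that is not in 0.1 hr steps; the actual ticket is still created through the JIRA interface described above.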

Guidelines For Calculating Time Loss

  • If the problem can be troubleshot while taking images on sky, or while proceeding with another task, that time does not count toward time lost to the fault.

  • If the problem happens before evening 12-degree twilight or after morning 12-degree twilight, there is no need to account for time lost.
    • As soon as science time begins, however, the clock starts ticking.

  • If the problem happens during bad weather, or no observing is taking place, there is no need to deduct for time loss.

  • It is better to overestimate than to underestimate. Sky time is very valuable, and reporting the full time lost emphasizes the importance of addressing the problem.
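The guidelines above can be sketched as a short calculation. This is a minimal sketch under stated assumptions, not an official formula: the function name `time_lost_hr`, its parameters, and the twilight times in the example are hypothetical, and rounding up to the next 0.1 hr is one reading of "better to overestimate than to underestimate".

```python
from datetime import datetime, timedelta

TENTH_OF_HOUR = timedelta(minutes=6)  # 0.1 hr


def time_lost_hr(fault_start, fault_end, evening_twilight, morning_twilight,
                 observing_blocked=True, bad_weather=False):
    """Estimate time lost to a fault in decimal hours, in 0.1 hr steps.

    observing_blocked: False if images or other tasks continued during
        troubleshooting, in which case no time is counted.
    bad_weather: True if no observing was possible anyway.
    """
    if not observing_blocked or bad_weather:
        return 0.0
    # Only count the part of the fault inside the 12-degree twilight window.
    start = max(fault_start, evening_twilight)
    end = min(fault_end, morning_twilight)
    lost = end - start
    if lost <= timedelta(0):
        return 0.0
    # Ceiling division on timedeltas: round up to the next 0.1 hr (assumption).
    tenths = -(-lost // TENTH_OF_HOUR)
    return tenths / 10


# Hypothetical twilight times for one night.
evening = datetime(2022, 9, 23, 20, 0)
morning = datetime(2022, 9, 24, 5, 0)

# A 40-minute fault during science time.
print(time_lost_hr(datetime(2022, 9, 23, 23, 0),
                   datetime(2022, 9, 23, 23, 40), evening, morning))  # 0.7

# A fault that starts 30 minutes before twilight: only 15 minutes count.
print(time_lost_hr(datetime(2022, 9, 23, 19, 30),
                   datetime(2022, 9, 23, 20, 15), evening, morning))  # 0.3
```

Working in whole `timedelta` units and dividing by ten only at the end avoids floating-point surprises when rounding to 0.1 hr.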

Filling Out Night Logs

More details about writing night logs are provided on the Nighttime Logging page. For fault reports filed during the night, it is important that the observer lists every problem that occurred in the fault report section of the night log. This gives the faults higher visibility and makes it possible to calculate the total time lost to faults at the end of the observing night.

../_images/Night_log_fault_reports_list.png

List of all the fault reports that happened during the night for the night log.

Contact Personnel

This procedure was last modified Sep 23, 2022.

This procedure was written by Alysha Shugart. The following are contributors: Patrick Ingraham, Tiago Ribeiro.