ECMonitor

ECMonitor.py is a tool that monitors the status of selected control system components (MacroServers, Pool, Doors, ActiveMntGrp). Devices have to be online and must be neither in FAULT nor in ALARM state. ECMonitor notifies users via email or SMS if a problem occurs.
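
The health criterion described above can be illustrated with a minimal sketch. This is not the ECMonitor source: the function `check_device`, the `read_state` callable and the `fake_reader` stand-in are hypothetical names chosen for illustration. In a real setup the reader would presumably query the Tango device (for example through PyTango's `DeviceProxy.state()`), which is an assumption about how the tool works.

```python
from typing import Callable, Optional

# States that count as a problem, per the description above.
BAD_STATES = {"ALARM", "FAULT"}


def check_device(name: str, read_state: Callable[[str], str]) -> Optional[str]:
    """Return an error message for `name`, or None if the device is OK."""
    try:
        state = read_state(name)
    except Exception as exc:  # device not exported or not reachable
        return f"{name}: not online ({exc})"
    if state in BAD_STATES:
        return f"{name}: state {state}"
    return None


# Hypothetical stand-in for a real state reader, used only to make the
# sketch runnable without a control system.
def fake_reader(name: str) -> str:
    states = {"p99/pool/haso107tk": "ON", "p99/door/haso107tk.01": "ALARM"}
    if name not in states:
        raise RuntimeError("device not exported")
    return states[name]


for device in ("p99/pool/haso107tk", "p99/door/haso107tk.01", "p99/door/haso107tk.99"):
    print(device, "->", check_device(device, fake_reader))
```

Passing the reader as a parameter keeps the rule (online, not ALARM, not FAULT) separate from the way the state is obtained.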

$ ECMonitor.py 
Usage: ECMonitor.py -x [-v] [-l [-t <timeSleep> ]] 
  Checks the status of the MacroServers, the Pool, the Doors, the ActiveMntGrp
  and the elements of the ActiveMntGrp. The user is notified, if a device is
  not exported or in ALARM or FAULT state.
  A message is sent for each transition from OK to NOT OK.
  -x    the check is executed once, the errors are displayed, no notifications
  -x -v the check is executed once, verbose output
  -x -l the check is repeatedly executed 
  -x -l -n firstname.name@desy.de
  -x -l -n firstname.name@desy.de,sms/0049123123123@sms.desy.de

Options:
  -h, --help    show this help message and exit
  -x            execute
  -l            execute repeatedly
  -n NOTIFY     comma separated notify list, no blanks
  -t TIMESLEEP  sleep time when looping, def. 60s
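
The format expected by -n (comma separated, no blanks) can be sketched as follows. The function `parse_notify_list` is a hypothetical helper, not part of ECMonitor.py, and how the tool itself validates the list or handles the `sms/` prefix is not documented here.

```python
from typing import List


def parse_notify_list(value: str) -> List[str]:
    """Split a comma-separated notify list; any whitespace is rejected."""
    if any(ch.isspace() for ch in value):
        raise ValueError("notify list must not contain blanks")
    # Drop empty items that result from stray commas.
    addresses = [item for item in value.split(",") if item]
    if not addresses:
        raise ValueError("notify list is empty")
    return addresses


print(parse_notify_list("firstname.name@desy.de,sms/0049123123123@sms.desy.de"))
```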

Before you run the application repeatedly, you may want to execute it once in single-shot mode:

$ ECMonitor.py -x -v
01 Dec 2021 11:34:53
p99/pool/haso107tk, state tango._tango.DevState.ON
p99/macroserver/haso107tk.01, state tango._tango.DevState.ON
p99/door/haso107tk.01, state tango._tango.DevState.ON
p99/door/haso107tk.02, state tango._tango.DevState.ON
p99/door/haso107tk.03, state tango._tango.DevState.ON
ActiveMntGrp: mg_ivp
  eh_t01, state tango._tango.DevState.ON
  eh_mca01, state tango._tango.DevState.ON
  eh_c01, state tango._tango.DevState.ON
  sig_gen, state tango._tango.DevState.ON
  eh_c02, state tango._tango.DevState.ON
  pilatus, state tango._tango.DevState.ON
MacroServers, Pool, Doors, ActiveMntGrp are ok

This command displays all errors. After they have been fixed, the tool can be executed repeatedly:

$ ECMonitor.py -x -l -n your.name@desy.de,sms/0049123123123@sms.desy.de

This way the check is executed in a loop. The default wait time between checks is 60 s and can be changed with -t. If errors occur, the listed addresses are notified. The notification is sent only once, when the status changes from 'good' to 'error'.
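
The "notify only on the transition from good to error" behaviour can be sketched as a small state holder. This is an illustration under assumptions, not the tool's implementation: the class `TransitionNotifier` and its `update` method are hypothetical, and the sketch assumes the status counts as 'good' before the first check.

```python
from typing import Callable, List


class TransitionNotifier:
    """Calls `notify` only when the status changes from good to error."""

    def __init__(self, notify: Callable[[List[str]], None]) -> None:
        self._notify = notify
        self._was_ok = True  # assumption: status is 'good' before the first check

    def update(self, errors: List[str]) -> bool:
        """Record one check result; return True if a notification was sent."""
        is_ok = not errors
        fired = self._was_ok and not is_ok
        if fired:
            self._notify(errors)
        self._was_ok = is_ok
        return fired


# Simulated sequence of check results, one entry per loop iteration.
sent = []
notifier = TransitionNotifier(sent.append)
for errors in ([], ["door in FAULT"], ["door in FAULT"], [], ["pool in ALARM"]):
    notifier.update(errors)

print(len(sent))  # prints 2: one message per good-to-error transition
```

A persisting error produces no further messages; a new message is only sent after the status has returned to 'good' and then fails again.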