ECMonitor.py is a tool that monitors the status of selected control system components (MacroServers, Pool, Doors, ActiveMntGrp). Devices have to be online and must be neither in FAULT nor in ALARM state. ECMonitor notifies users via email or SMS if a problem occurs.
$ ECMonitor.py
Usage: ECMonitor.py -x [-v] [-l [-t <timeSleep> ]]

  Checks the status of the MacroServers, the Pool, the Doors, the ActiveMntGrp
  and the elements of the ActiveMntGrp. The user is notified, if a device
  is not exported or in ALARM or FAULT state.
  A message is sent for each transition from OK to NOT OK.

    -x        the check is executed once, the errors are displayed, no notifications
    -x -v     the check is executed once, verbose output
    -x -l     the check is repeatedly executed
    -x -l -n firstname.name@desy.de
    -x -l -n firstname.name@desy.de,sms/0049123123123@sms.desy.de

Options:
  -h, --help    show this help message and exit
  -x            execute
  -l            execute repeatedly
  -n NOTIFY     comma separated notify list, no blanks
  -t TIMESLEEP  sleep time when looping, def. 60s
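The check rule stated in the help text (a device must be exported and must not be in ALARM or FAULT state) can be sketched in Python as follows. This is only an illustration, not the ECMonitor.py source: the names `device_problem` and `check_all` are hypothetical, states are plain strings rather than tango `DevState` values, and `None` is assumed to stand for a device that is not exported.

```python
# Illustrative sketch only; not taken from ECMonitor.py.
BAD_STATES = {"ALARM", "FAULT"}


def device_problem(name, state):
    """Return a problem description, or None if the device is OK.

    `state` is the state name as a string; None stands for a device
    that is not exported (an assumption made for this sketch).
    """
    if state is None:
        return "%s is not exported" % name
    if state.upper() in BAD_STATES:
        return "%s is in %s state" % (name, state.upper())
    return None


def check_all(states):
    """Collect the problems for a dict mapping device name to state."""
    problems = []
    for name in sorted(states):
        problem = device_problem(name, states[name])
        if problem is not None:
            problems.append(problem)
    return problems


if __name__ == "__main__":
    # Hypothetical sample data, loosely modelled on the device names
    # used in this section.
    sample = {
        "p99/pool/haso107tk": "ON",
        "p99/door/haso107tk.01": "FAULT",
        "p99/door/haso107tk.02": None,
    }
    for line in check_all(sample):
        print(line)
```

An empty list returned by `check_all` corresponds to the "are ok" case; any entry corresponds to an error that would be displayed or, in loop mode, reported.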
Before you run the application repeatedly, you may want to execute it in the single-shot mode:
$ ECMonitor.py -x -v
01 Dec 2021 11:34:53
p99/pool/haso107tk, state tango._tango.DevState.ON
p99/macroserver/haso107tk.01, state tango._tango.DevState.ON
p99/door/haso107tk.01, state tango._tango.DevState.ON
p99/door/haso107tk.02, state tango._tango.DevState.ON
p99/door/haso107tk.03, state tango._tango.DevState.ON
ActiveMntGrp: mg_ivp
eh_t01, state tango._tango.DevState.ON
eh_mca01, state tango._tango.DevState.ON
eh_c01, state tango._tango.DevState.ON
sig_gen, state tango._tango.DevState.ON
eh_c02, state tango._tango.DevState.ON
pilatus, state tango._tango.DevState.ON
MacroServers, Pool, Doors, ActiveMntGrp are ok

This command displays all errors. After they have been fixed the tool can be executed repeatedly:
$ ECMonitor.py -x -l -n your.name@desy.de,sms/0049123123123@sms.desy.de
This way the procedure is executed in a loop. The default wait time is 60 s; it can be changed with the -t option. If errors occur, the listed addresses are notified. The notification is sent only once, when the status changes from 'good' to 'error'.
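The "notify only on the transition from good to error" behaviour can be illustrated with a small Python sketch. It is not the ECMonitor.py implementation: the function `monitor` and its parameters are hypothetical, the overall status (rather than a per-device status) is assumed to be tracked, and the status before the first check is assumed to be 'good'.

```python
# Illustrative sketch only; not taken from ECMonitor.py.
import time


def monitor(check, notify, time_sleep=60, max_cycles=None, sleep=time.sleep):
    """Run `check` repeatedly; call `notify` on each good -> error transition.

    check:      callable returning a list of error strings (empty means good)
    notify:     callable receiving the error list
    time_sleep: seconds to wait between two checks
    max_cycles: stop after this many checks; None loops forever
    sleep:      sleep function, replaceable for testing
    """
    was_ok = True  # assumption: the status is 'good' before the first check
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        errors = check()
        is_ok = not errors
        if was_ok and not is_ok:
            # Status changed from good to error: notify exactly once.
            notify(errors)
        was_ok = is_ok
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            sleep(time_sleep)


if __name__ == "__main__":
    # Simulated check results: good, error, error, good, error.
    results = iter([[], ["door in FAULT"], ["door in FAULT"], [], ["pool in ALARM"]])
    sent = []
    monitor(lambda: next(results), sent.append, max_cycles=5, sleep=lambda s: None)
    print(sent)
```

In the simulated run the persisting error in the third cycle does not trigger a second message; a new message is sent only after the status has returned to 'good' and failed again.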