A status report about tracing CORBA timeouts.
Example from haspp09:
An error occurred while running Macro 'lup(th, -4.0, 4.0, 40, 2.0) -> 96d5ab3a-9f44-11e6-8795-54bef70cdbee':
     TRANSIENT CORBA system exception: TRANSIENT_CallTimedout
     p09/door/haspp09.01 [1346]: www
     Traceback (most recent call last):
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/msmacromanager.py", line 1224, in runMacro
         for step in macro_obj.exec_():
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/macro.py", line 2272, in exec_
         res = self.run(*self._in_pars)
       File "/usr/lib/python2.7/dist-packages/sardana/sardana-macros/DESY_general/scan_macros.py", line 29, in run
         scan=self.dscan( motor, rel_start_pos, rel_final_pos, nr_interv, integ_time)
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/macro.py", line 2423, in f
         return self.execMacro(*eargs, **kwargs)
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/macro.py", line 305, in new_fn
         ret = fn(*args, **kwargs)
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/macro.py", line 1299, in execMacro
         self.runMacro(macro_obj)
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/macro.py", line 305, in new_fn
         ret = fn(*args, **kwargs)
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/macro.py", line 1215, in runMacro
         return self.executor.runMacro(macro_obj)
       File "/usr/lib/python2.7/dist-packages/sardana/macroserver/msmacromanager.py", line 1278, in runMacro
         raise macro_exp
     MacroServerException: TRANSIENT CORBA system exception: TRANSIENT_CallTimedout
Actions:
Timeouts at haspp09:
! 2016-10-30 06:00:00 hklscan(-0.1, -0.9, 0.1, 0.9, 11.0, 11.0, 30, 5.0) -> 85c2bc4e-9e5d-11e6-8795-54bef70cdbee 304.25 1.0 ! 2016-10-30 06:10:06 hklscan(-0.1, -0.9, 0.1, 0.9, 11.0, 11.0, 30, 5.0) -> e2b13b6e-9e5e-11e6-8795-54bef70cdbee 304.257013889 1.0 ! 2016-10-30 16:29:19 hklscan(-1.0, 1.0, 1.0, -1.0, 10.5, 10.5, 150, 40.0) -> 745bdce2-9eb2-11e6-8795-54bef70cdbee 304.687025463 1.0 ! 2016-10-30 17:50:09 hklscan(-1.0, 1.0, 1.0, -1.0, 10.5, 10.5, 150, 40.0) -> ad9aecfe-9eb8-11e6-8795-54bef70cdbee 304.743159722 1.0 ! 2016-10-30 23:10:58 hklscan(-0.65, -0.35, 0.65, 0.35, 6.0, 6.0, 60, 2.0) -> 95629cba-9eed-11e6-8795-54bef70cdbee 304.965949074 1.0 ! 2016-10-31 09:33:04 lup(th, -4.0, 4.0, 40, 2.0) -> 96d5ab3a-9f44-11e6-8795-54bef70cdbee 305.397962963 1.0 ! 2016-10-31 10:14:35 hklscan(-0.65, -0.35, 0.65, 0.35, 6.0, 6.0, 60, 2.0) -> 54858a56-9f4a-11e6-8795-54bef70cdbee 305.426793981 1.0 ! 2016-10-31 12:21:01 hklscan(-1.0, 0.0, 1.0, 0.0, 10.5, 10.5, 75, 4.0) -> 1d8b4412-9f5b-11e6-8795-54bef70cdbee 305.514594907 1.0 ! 2016-10-31 13:02:37 hklscan(-1.0, 0.0, 1.0, 0.0, 10.5, 10.5, 75, 40.0) -> 2d471236-9f5c-11e6-8795-54bef70cdbee 305.543483796 1.0 ! 2016-10-31 13:14:03 hklscan(-1.0, 0.0, 1.0, 0.0, 10.5, 10.5, 75, 40.0) -> 9288ea92-9f62-11e6-8795-54bef70cdbee 305.551423611 1.0 ! 2016-10-31 17:00:57 hklscan(-1.0, 0.0, 1.0, 0.0, 10.5, 10.5, 75, 40.0) -> 28a29b8a-9f82-11e6-8795-54bef70cdbee 305.708993056 1.0 ! 2016-11-01 09:16:34 hklscan(-1.0, 0.0, 1.0, 0.0, 10.5, 10.5, 75, 13.0) -> de63ab7c-a008-11e6-8795-54bef70cdbee 306.38650463 1.0 ! 2016-11-01 19:57:51 shclose() -> 149c58c2-a065-11e6-8f57-54bef70cdbee 306.831840278 1.0 ! 2016-11-02 17:58:09 ]]) -> 865c99b4-a11d-11e6-8795-54bef70cdbee 307.748715278 1.0 ! 2016-11-03 01:12:11 shopen() -> 1f13095e-a15a-11e6-8795-54bef70cdbee 308.050127315 1.0 ! 2016-11-03 09:22:12 shopen() -> 9b73137e-a19e-11e6-8795-54bef70cdbee 308.390416667 1.0 ! 2016-11-03 10:46:17 shopen() -> 552ea0ac-a1aa-11e6-831e-54bef70cdbee 308.44880787 1.0 ! 2016-11-03 13:43:46 shclose() -> 272f0d90-a1c3-11e6-831e-54bef70cdbee 308.572060185 1.0 ! 2016-11-03 13:44:06 shclose() -> 33ac43c6-a1c3-11e6-831e-54bef70cdbee 308.572291667 1.0 ! 2016-11-03 13:44:11 shclose() -> 368fa2ea-a1c3-11e6-831e-54bef70cdbee 308.572349537 1.0
Timeouts at haspp08:
! 2016-10-31 18:17:16 ascan(sl1v, 0.0, 0.25, 25, 0.1) -> d197f4e6-9f8d-11e6-8e2e-e069950e8aea 305.761990741 1.0 ! 2016-10-31 18:18:57 ascan(sl1v, 0.0, 0.25, 25, 0.1) -> 0e09286e-9f8e-11e6-8e2e-e069950e8aea 305.763159722 1.0 ! 2016-10-31 18:23:12 ascan(sl1v, 0.0, 0.25, 25, 0.1) -> a5f250a6-9f8e-11e6-8e2e-e069950e8aea 305.766111111 1.0 ! 2016-10-31 18:24:10 ascan(sl1v, 0.0, 0.25, 25, 0.1) -> cfc71bfa-9f8e-11e6-8e2e-e069950e8aea 305.766782407 1.0 ! 2016-10-31 18:25:51 ascan(sl1v, 0.0, 0.25, 25, 0.1) -> 0df7c302-9f8f-11e6-8e2e-e069950e8aea 305.767951389 1.0 ! 2016-10-31 20:24:04 ascan(sl2v, 0.3, 0.6, 30, 0.1) -> 909d3c64-9f9f-11e6-8e2e-e069950e8aea 305.850046296 1.0 ! 2016-10-31 23:12:11 dscan(xs, -2.0, 2.0, 20, 1.0) -> 05509fc6-9fb7-11e6-8e2e-e069950e8aea 305.966793981 1.0 ! 2016-11-01 02:05:14 stabi_macro() -> 3beed8aa-9fcf-11e6-8e2e-e069950e8aea 306.086967593 1.0 ! 2016-11-01 08:57:03 ]]) -> c51ae248-a008-11e6-805a-e069950e8aea 306.372951389 1.0
Affected Macros: hklscan, shclose, shopen, lup, ascan, dscan, stabi_macro, not just one.
Overall performance indicator: response time 'ssh <node> hostname'
Conclusion: 'ssh' response time is not correlated with the CORBA timeouts.
More specific performance indicators: idle time, memory usage, swap file usage:
Conclusion: there is some hint that the PC utilization causes timeout.
Further actions:
Possible outcome: