Bug 6142 - inefficient call of ps/netstat in verify_sessions handler
Summary: inefficient call of ps/netstat in verify_sessions handler
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: VSM Agent (show other bugs)
Version: trunk
Hardware: PC Unknown
: P2 Normal
Target Milestone: 4.8.0
Assignee: Henrik Andersson
URL:
Keywords: ossman_tester, relnotes, samuel_tester
Depends on:
Blocks:
 
Reported: 2017-01-19 11:55 CET by Samuel Mannehed
Modified: 2017-05-19 14:42 CEST (History)
3 users (show)

See Also:
Acceptance Criteria:


Attachments

Description Samuel Mannehed cendio 2017-01-19 11:55:39 CET
We call netstat once for each session, even though the information is relevant for all sessions. IOW we could optimise things to one invocation per XMLRPC call.
Comment 1 Bojan Memetovic cendio 2017-01-24 10:39:53 CET
After the work done on bug 5489 all netstat calls are run within the same timeout. We have decided that we will take a look at this as a continuation of that bug.
Comment 2 Karl Mikaelsson cendio 2017-01-30 11:01:33 CET
See also: Bug 6125.
Comment 3 Henrik Andersson cendio 2017-01-30 13:31:26 CET
For each session to verify, we first makes a call to ps for validating that the process is an tl-session / xinit process. Then we make a call to netstat to verify if a client is connected to the session.

We could do the checks in one call for each subprocess (ps, netstat). First a call to ps wtih a set of sessions to verify, failed sessions are added to result and removed from the set of sessions. Next pass being running netstat with the remaining sessions in the the set, adding them to result as succesfully verified sessions.
Comment 4 Henrik Andersson cendio 2017-01-30 14:15:59 CET
Another approach for optimizing the ps part is to read the link /proc/%d/exe to verify which binary is running as pid %d.
Comment 8 Pierre Ossman cendio 2017-02-06 14:20:08 CET
Looks good. Verified using ptrace that only a single invocation of 'ss' is made, no matter the number of sessions. It also updates the connection status properly.
Comment 9 Karl Mikaelsson cendio 2017-02-06 15:08:56 CET
The use of any() is limited to Python 2.5 and newer. 

Tracebacks on RHEL5 (with Python 2.4):

> 2017-02-06 11:20:48 ERROR vsmagent: Unhandled exception with async process: exceptions.NameError global name 'any' is not defined Traceback (most recent call last):
>   File "/opt/thinlinc/modules/thinlinc/vsm/async.py", line 103, in i1Iii
>     obj . handle_read_event ( fd )
>   File "/opt/thinlinc/modules/thinlinc/vsm/extproc.py", line 311, in handle_read_event
>     self . handle_close ( fd )
>   File "/opt/thinlinc/modules/thinlinc/vsm/extproc.py", line 284, in handle_close
>     self . handle_exit ( )
>   File "/opt/thinlinc/modules/thinlinc/vsm/extproc.py", line 457, in handle_exit
>     self . callback ( O00oO000O0O , self . stdout_data , self . stderr_data )
>   File "/opt/thinlinc/modules/thinlinc/vsm/handler_verifysessions.py", line 28, in <lambda>
>     o00 = lambda Oo0oO0ooo , o0oOoO00o , i1 : self . _parse_ss_output ( Oo0oO0ooo , o0oOoO00o , i1 , IiiIII111iI )
>   File "/opt/thinlinc/modules/thinlinc/vsm/handler_verifysessions.py", line 125, in _parse_ss_output
>     self . _process_sessions ( sessions , Ii1IOo0o0 )
>   File "/opt/thinlinc/modules/thinlinc/vsm/handler_verifysessions.py", line 88, in _process_sessions
>     if any ( [ oo . endswith ( ":%d" % IiiiI1II1I1 ) for oo in listeners ] ) :
> NameError: global name 'any' is not defined
Comment 11 Pierre Ossman cendio 2017-02-08 14:54:46 CET
Works now. Retested on RHEL 5 and RHEL 6.
Comment 12 Pierre Ossman cendio 2017-05-18 10:53:16 CEST
Upgrading from 4.7.0 on RHEL 5 loses sessions, and it seems to be caused by the code changed in this bug:

> 2017-05-18 10:49:15 WARNING vsmagent.session: Broken session for user cendio, tl-session pid 4869 is not tl-session

> [root@dhcp-253-137 tl-4.8.0-server]# ps fax| grep 4869
>  6401 pts/1    S+     0:00  |       \_ grep 4869
>  4869 ?        S      0:00 tl-session: cendio
Comment 13 Pierre Ossman cendio 2017-05-18 11:16:58 CEST
It seems like older kernels behave slightly different for deleted files. This is the contents of the 'exe' symlink:

  /opt/thinlinc/libexec/tl-session\x00591d636f (deleted)

The (illegal) null in there causes most tools to not display the trailing stuff. Unfortunately strings aren't null terminated in Python so it gets confused by this.
Comment 17 Pierre Ossman cendio 2017-05-19 10:17:32 CEST
Tested on RHEL 5 and reconnect works well both for upgraded and new session.
Comment 18 Pierre Ossman cendio 2017-05-19 10:58:39 CEST
Tested on Ubuntu 16.04 and reconnect works well both for upgraded and new session.
Comment 19 Samuel Mannehed cendio 2017-05-19 14:42:07 CEST
Tested rc2 on RHEL7 and reconnect works well both for upgraded and new session. Closing.

Note You need to log in before you can comment on or make changes to this bug.