Bugzilla – Bug 5489
Verify all sessions on an agent in one call
Last modified: 2017-03-20 17:09:11
We've got quite a few problems with the number of session verification calls
that are needed to work with the default 10-minute session_update_delay setting
in clusters with lots of users.
Instead of verifying sessions with one call to an agent per session, we could
do better by adding an XMLRPC call/handler that checks all sessions on an agent
at the same time.
This approach could cut the number of calls required during each
session_update_delay interval from the number of sessions to the number of
agents.
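
As a rough illustration of what such a handler could look like on the agent
side, here is a minimal sketch using Python's standard xmlrpc.server module.
The names verify_sessions, LIVE_SESSIONS and build_agent_server are
hypothetical and chosen for this example only; they are not taken from the
actual code base, and the real liveness check would of course be more involved.

```python
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical stand-in for the agent's knowledge of which sessions are alive.
LIVE_SESSIONS = {"alice:10", "bob:11"}


def verify_sessions(session_ids):
    """Check every given session id in one call.

    Returns a dict mapping each session id to True (alive) or False (not
    found). Keys are strings because XML-RPC structs require string keys.
    """
    return {sid: sid in LIVE_SESSIONS for sid in session_ids}


def build_agent_server(host="127.0.0.1", port=0):
    """Create an XML-RPC server exposing verify_sessions.

    The server is created but not started; the caller decides when to run
    serve_forever(). Port 0 lets the OS pick a free port.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(verify_sessions, "verify_sessions")
    return server
```

With a handler like this, the master sends one list of session ids per agent
and gets all answers back in a single response.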
There is consensus on the design in principle: one job on the server iterates
over the sessions in the session database, groups the sessions per agent, and
then asks each agent to verify its sessions.
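
The grouping step described above can be sketched as follows. This is only an
illustration under assumptions: the session database is modelled as a list of
(session_id, agent) pairs, verify_on_agent is a hypothetical callable standing
in for the per-agent call, and recording None for sessions on an unreachable
agent is a choice made for this example, not necessarily what the real code
does.

```python
from collections import defaultdict


def group_sessions_by_agent(sessions):
    """Group (session_id, agent) pairs into {agent: [session_id, ...]}."""
    grouped = defaultdict(list)
    for session_id, agent in sessions:
        grouped[agent].append(session_id)
    return dict(grouped)


def verify_all(sessions, verify_on_agent):
    """Verify all sessions with one call per agent.

    verify_on_agent(agent, session_ids) is a hypothetical callable that
    returns {session_id: bool}. Returns (results, number_of_calls).
    """
    results = {}
    calls = 0
    for agent, session_ids in group_sessions_by_agent(sessions).items():
        calls += 1
        try:
            results.update(verify_on_agent(agent, session_ids))
        except OSError:
            # Assumption for this sketch: if the agent cannot be reached,
            # record its sessions as None (could not be verified).
            results.update({sid: None for sid in session_ids})
    return results, calls
```

For three sessions spread over two agents this makes two calls instead of
three; the saving grows with the number of sessions per agent.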
The test TestSessionOnRemovedAgent caught a change in behaviour in the new
code; we now verify existing sessions right away after the server starts.
Previously we would do so after a delay.
We need to decide if we want to keep this behaviour or not. One possible
problem could be that we are racing with the agent when starting up and might
mark those sessions as unverified.
We've also failed to implement VerifySessionsCall.handle_known_errors(), which
the test TestSessionOnDeadAgent has detected.
Works well. We've tested:
- Periodic check: alive, dead, timeout
- Reconnect: alive, dead
- Shadow: alive, dead
- HA: no scenario found (see bug 6146), we have unit tests though
- tlwebadm: alive, dead, connect/disconnect
Also checked socket usage, and it seems to be doing fine with a single port per
agent (or fewer). I had a master/agent pair and 100 sessions on the agent. Only
port 1023 was used on the master.