www.cendio.com
Bug 4429 - Fix the load balancer
Status: NEW
Product: ThinLinc
Component: VSM Server
Version: 3.4.0
Platform: PC Unknown
Importance: P2 Normal
Target Milestone: MediumPrio
Assigned To:
Related bugs: 1174 4771 5268

Reported: 2012-10-15 15:46 by
Modified: 2017-06-12 11:14


Description From cendio 2012-10-15 15:46:25
I'm sure there was already at least one bug for this, but I can't find anything
relating to the general problem (although see bugs #2196 and #1174).

Our load balancing algorithm is not great. There are a number of problems:

1) Bogomips is a strange way to measure CPU performance. Throw in things like
hyperthreading and it becomes even more problematic.

2) The general algorithm needs to be reviewed. That one server can support
4000 more sessions while another can support only 1000 more doesn't mean that
we should never start sessions on the weaker server. This also assumes that
our rating figure is meaningful in this regard.

3) The existing_users_weight parameter is backwards, i.e. the higher the value
the less each user matters.

4) Load is also affected by I/O, which isn't necessarily relevant to what we're
checking
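
To illustrate point 2, here is a minimal sketch of the lopsidedness. It is a
simplified model, not the actual VSM Server code: the capacity figures, the
agent names and both selection functions are assumptions made up for the
example. It compares always picking the highest-rated agent with picking at
random in proportion to remaining capacity.

```python
import random
from collections import Counter

# Hypothetical figures: how many more sessions each agent is rated for.
capacity = {"strong": 4000, "weak": 1000}

def pick_greedy(remaining):
    """Always choose the agent with the most remaining capacity."""
    return max(remaining, key=remaining.get)

def pick_proportional(remaining, rng):
    """Choose an agent at random, weighted by its remaining capacity."""
    agents = list(remaining)
    weights = [remaining[a] for a in agents]
    return rng.choices(agents, weights=weights, k=1)[0]

def simulate(pick, n_sessions, capacity):
    """Start n_sessions one at a time; each session uses one unit of capacity."""
    remaining = dict(capacity)
    placed = Counter()
    for _ in range(n_sessions):
        agent = pick(remaining)
        remaining[agent] -= 1
        placed[agent] += 1
    return placed

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy = simulate(pick_greedy, 500, capacity)
spread = simulate(lambda r: pick_proportional(r, rng), 500, capacity)

print("greedy:", dict(greedy))
print("proportional:", dict(spread))
```

In this model the greedy rule places all 500 sessions on the strong agent,
because its remaining capacity never drops below that of the weak one. The
proportional rule still favours the strong agent but also uses the weak one.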
------- Comment #2 From cendio 2012-10-23 16:47:01 -------
Moving to NearFuture, so that we remember to revisit this after 4.0.0. See
issue 13747.
------- Comment #3 From cendio 2015-03-18 15:27:50 -------
(In reply to comment #0)
> 1) Bogomips is a strange way to measure CPU performance. Throw in things like
> hyperthreading and it becomes even more problematic.
> 

Bug 4771.

> 4) Load is also affected by I/O, which isn't necessarily relevant to what we're
> checking

As mentioned, bug 1174.
------- Comment #4 From cendio 2015-03-18 15:31:36 -------
See also bug 5268. It has a rough prototype for changing the load balancer to
simply pick the agent with the fewest users (not sessions, nor ThinLinc users)
on it. Note that it needs work, as it doesn't consider varying machine
capabilities, many logins in a short time, or putting all sessions for a
single user on the same agent.

(There is also still the fundamental question of what the basic principle of
the load balancer should be.)
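
The "fewest users" rule could be sketched roughly as below. This is not the
prototype from bug 5268. The function name, the (agent, username) session
list and the agent names are assumptions for illustration, and the sketch
shares the limitations listed above.

```python
def pick_agent_fewest_users(sessions, agents):
    """Return the agent with the fewest distinct users.

    sessions: iterable of (agent, username) pairs, one per running session.
    agents:   names of all agents that may be selected.
    Ties are broken by agent name so that the result is deterministic.
    """
    users = {agent: set() for agent in agents}
    for agent, username in sessions:
        if agent in users:
            users[agent].add(username)
    return min(sorted(users), key=lambda a: len(users[a]))

# agent1 runs three sessions for one user, agent2 two sessions for two users.
sessions = [
    ("agent1", "alice"),
    ("agent1", "alice"),
    ("agent1", "alice"),
    ("agent2", "bob"),
    ("agent2", "carol"),
]
chosen = pick_agent_fewest_users(sessions, ["agent1", "agent2"])
print(chosen)
```

Here agent1 is chosen even though it runs more sessions, because only one
distinct user is logged in there. Counting sessions instead would pick agent2.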
------- Comment #7 From cendio 2017-06-12 11:14:13 -------
We've had some more internal discussion about this, and we've tried to
summarise the issues and feedback we've gotten:

 * It's difficult to understand (and configure)
 * It can be overly lopsided if servers differ in (perceived) capacity
 * It doesn't spread risk
 * Some would like their own, arbitrary conditions for selecting agents

Our current system is based on the principle of giving every user as many
resources as possible, but it assumes a) that the system measures everything
relevant, and b) that the admin knows the resource usage and configures it
accordingly.

Changes that could be made:

 * The system tunes itself (addresses b)
 * Equal number of sessions (or users) per agent (addresses a, or balances risk
instead of load)
 * Weighted number of sessions per agent (compromise between current model and
simpler one)
 * Allow a user script to select the agent (let the customer solve the problem)
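
As a sketch of the "weighted number of sessions per agent" idea: each agent
gets an admin-assigned weight, and a new session goes to the agent whose
sessions-per-weight ratio would be lowest afterwards. The function name, the
weights and the agent names are hypothetical and not an existing ThinLinc
setting. With equal weights this reduces to the "equal number of sessions"
option.

```python
def pick_agent_weighted(session_counts, weights):
    """Return the agent with the lowest (sessions + 1) / weight.

    session_counts: current number of sessions per agent.
    weights:        relative capacity per agent (hypothetical admin setting).
    Ties are broken by agent name.
    """
    def load_after(agent):
        return (session_counts.get(agent, 0) + 1) / weights[agent]
    return min(sorted(weights), key=load_after)

weights = {"big": 2.0, "small": 1.0}
counts = {"big": 0, "small": 0}
order = []
for _ in range(6):
    agent = pick_agent_weighted(counts, weights)
    counts[agent] += 1
    order.append(agent)

print(order)
print(counts)
```

After six sessions the counts are 4 on "big" and 2 on "small", matching the
2:1 weights, and the smaller agent is used from the third session onwards.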