High availability overview
Background: reasons for an HA setup
In a standard ThinLinc setup, the master server is a single point of failure. If the master is down, no new ThinLinc connections can be made and users cannot reconnect to existing sessions. Sessions that are still connected to an agent will, however, continue to work. A ThinLinc cluster with one master and three agent servers is illustrated in Fig. 3.
Fig. 3 A non-HA ThinLinc cluster setup
Here, incoming connections are handled by the master, which distributes them to the three agents. If the master goes down, no new connections can be made. The master is the single point of failure in a non-HA ThinLinc cluster.
Solution: elimination of the single point of failure
To eliminate the single point of failure, we configure the cluster for HA with two redundant master servers. Note that ThinLinc’s HA functionality only handles the part of your HA setup that keeps the ThinLinc session database synchronized between the two master servers. Supplementary software is required for the rest; read more about this in Theory of operation.
When both ThinLinc and your systems are configured this way, the two master servers are in constant contact, each checking that the other is up and running. If one of the masters goes down, for example because of a hardware failure, the other machine detects the failure and automatically takes over the service with only a short interruption for the users. No action is needed from the system administrator.
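The mutual checking described above can be pictured as a heartbeat with a timeout. The following Python sketch is only an illustration of that general idea, not ThinLinc's implementation or that of any particular supplementary software; the class name `PeerMonitor`, its fields, and the timeout value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Hypothetical helper: tracks heartbeats received from the other master."""

    timeout: float          # seconds of silence before the peer counts as down
    last_seen: float = 0.0  # timestamp of the most recent heartbeat

    def heartbeat(self, now: float) -> None:
        # Record that the peer was heard from at time `now`.
        self.last_seen = now

    def peer_is_up(self, now: float) -> bool:
        # The peer is considered up while its last heartbeat is recent enough.
        return (now - self.last_seen) <= self.timeout


# Example: a peer that stays silent for longer than the timeout is treated as failed.
monitor = PeerMonitor(timeout=3.0)
monitor.heartbeat(now=10.0)
print(monitor.peer_is_up(now=12.0))  # True
print(monitor.peer_is_up(now=13.5))  # False
```

Timestamps are passed in explicitly so the behaviour is easy to follow; a real monitor would read a clock and exchange messages over the network.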
Theory of operation
Fig. 4 A ThinLinc HA cluster setup
In an HA setup, as illustrated in Fig. 4, two redundant machines act as master servers. One of the machines is primary and the other is secondary. The primary machine normally handles master server requests, but if it fails, the secondary machine takes over. When the primary machine comes back online, it takes over again. In normal operation, then, it is always the primary machine that does the work. The secondary is in standby, receiving information from the primary about new and deleted sessions and maintaining its own copy of the session database.
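The primary/secondary behaviour described above can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions, not ThinLinc code: the names `Master`, `active_master`, and `add_session` are hypothetical, the session database is reduced to a set of session IDs, and resynchronization after an outage is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Hypothetical model of one master server."""

    name: str
    up: bool = True
    sessions: set = field(default_factory=set)  # stand-in for the session database


def active_master(primary, secondary):
    # The primary handles requests whenever it is up; the secondary only
    # takes over while the primary is down.
    if primary.up:
        return primary
    if secondary.up:
        return secondary
    return None


def add_session(primary, secondary, session_id):
    # Register the session on the active master and copy it to the standby,
    # so that both keep their own copy of the session database.
    active = active_master(primary, secondary)
    if active is None:
        raise RuntimeError("no master available")
    active.sessions.add(session_id)
    standby = secondary if active is primary else primary
    if standby.up:
        standby.sessions.add(session_id)


primary = Master("primary")
secondary = Master("secondary")
add_session(primary, secondary, "alice")

primary.up = False                             # primary fails
print(active_master(primary, secondary).name)  # secondary
print("alice" in secondary.sessions)           # True

primary.up = True                              # primary comes back online
print(active_master(primary, secondary).name)  # primary
```

The example shows why the standby copy matters: when the secondary takes over, it already knows about the sessions that were created while the primary was active.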
Both servers have a unique hostname and a unique IP address, but there is also a third IP address that is active only on the node currently responsible for the master service. This address, which the clients connect to, is usually referred to as a resource IP address. ThinLinc does not move the resource IP address between servers; supplementary software is required for this purpose.
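To make the addressing concrete, the following Python sketch lists which addresses would be active on each node. All values are assumptions for illustration: the node names `master1` and `master2` are hypothetical, and the addresses come from the 192.0.2.0/24 range reserved for documentation. The function only describes the desired outcome; actually moving the resource IP address is the job of the supplementary software.

```python
import ipaddress

# Hypothetical addresses from the documentation-only range 192.0.2.0/24.
RESOURCE_IP = ipaddress.ip_address("192.0.2.10")

node_addresses = {
    "master1": ipaddress.ip_address("192.0.2.11"),
    "master2": ipaddress.ip_address("192.0.2.12"),
}


def addresses_on(node, active_node):
    # Every node always has its own unique address; only the node currently
    # responsible for the master service also carries the resource IP.
    addrs = [node_addresses[node]]
    if node == active_node:
        addrs.append(RESOURCE_IP)
    return addrs


# While master1 is responsible for the master service:
print(addresses_on("master1", active_node="master1"))
print(addresses_on("master2", active_node="master1"))
```

Because clients only ever use the resource IP address, they do not need to know which of the two machines is currently answering.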