Chapter 6.  High Availability (HA)

6.1.  Overview

This chapter describes how to set up ThinLinc with High Availability (from now on referred to as "HA") for the VSM server, protecting against the single point of failure that the hardware running the VSM server would otherwise be.

This chapter describes the HA implementation available in ThinLinc version 1.4 and later. The HA implementation in ThinLinc 1.2 and 1.3 is deprecated.

The basic principle behind this setup is to have two equal machines, both capable of running the VSM server. If one of the machines goes down for some reason, the other machine takes over and serves VSM server requests with little or no interruption of service.

6.1.1.  Background - Reasons For a HA Setup

In a standard ThinLinc setup, there is a single point of failure: the machine running the VSM server. If the VSM server is down, no new ThinLinc connections can be made, and reconnections to existing sessions can't be established. Existing connections to VSM agent machines that are still running will, however, continue to work. A ThinLinc cluster of medium size, with one machine running as VSM server and three VSM agent machines, is illustrated in Figure 6.1.

Figure 6.1.  A non-HA ThinLinc cluster setup

Here the incoming connections are handled by the VSM server, which distributes them to the three VSM agent machines. If the VSM server goes down, no new connections can be made. The VSM server is a single point of failure.

6.1.2.  Solution - Elimination of Single Point of Failure

In order to eliminate the single point of failure, we configure the VSM server in a HA configuration where two machines share the responsibility for keeping the service running.

The two machines are in constant contact with each other, each checking if the other one is up and running. If one of the machines goes down for some reason, for example hardware failure, the other machine detects the failure and automatically takes over the service with only a short interruption for the users. No action is needed from the system administrator.
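The mutual check described above can be pictured as a timeout on the last message received from the peer. The following is a minimal sketch of that idea only; it is not how Heartbeat is implemented, and the class name `PeerMonitor` and the 10-second timeout are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Tracks when the peer node was last heard from (illustrative only)."""

    timeout_seconds: float   # hypothetical value, not a ThinLinc or Heartbeat default
    last_seen: float = 0.0   # timestamp of the most recent message from the peer

    def record_heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat message arrives from the peer.
        self.last_seen = now

    def peer_is_alive(self, now: float) -> bool:
        # The peer counts as down once it has been silent longer than the timeout.
        return (now - self.last_seen) <= self.timeout_seconds


monitor = PeerMonitor(timeout_seconds=10.0)
monitor.record_heartbeat(now=100.0)
print(monitor.peer_is_alive(now=105.0))  # True: last heard from 5 s ago
print(monitor.peer_is_alive(now=111.0))  # False: silent for 11 s, time to take over
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic and easy to reason about.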

6.1.3.  Theory of Operation

Figure 6.2.  A ThinLinc HA cluster setup

In a HA setup, as illustrated in Figure 6.2, two equal machines are used to keep the VSM server running. One of the machines is primary, the other one is secondary. The primary machine normally handles VSM server requests, but if it fails, the secondary machine takes over. When the primary machine comes online again, it resumes handling requests. That is, in normal operation it is always the primary machine that is working; the secondary is on standby, receiving information from the primary about new and deleted sessions and maintaining its own copy of the session database.
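The roles described above can be modelled in a few lines. This is a hedged sketch, not ThinLinc code: `active_node`, `SessionStore` and `ReplicatedStore` are hypothetical names, and the session database is reduced to a simple mapping from user to agent host purely for illustration.

```python
from typing import Dict, Optional


def active_node(primary_up: bool, secondary_up: bool) -> Optional[str]:
    """Return which node should serve VSM server requests, or None if both are down."""
    if primary_up:
        return "primary"      # the primary always serves when it is available
    if secondary_up:
        return "secondary"    # the secondary serves only while the primary is down
    return None


class SessionStore:
    """Hypothetical stand-in for a node's copy of the session database."""

    def __init__(self) -> None:
        self.sessions: Dict[str, str] = {}   # user name -> agent host (assumed layout)

    def add(self, user: str, agent: str) -> None:
        self.sessions[user] = agent

    def delete(self, user: str) -> None:
        self.sessions.pop(user, None)


class ReplicatedStore(SessionStore):
    """Primary-side store that forwards every change to the standby's copy."""

    def __init__(self, standby: SessionStore) -> None:
        super().__init__()
        self.standby = standby

    def add(self, user: str, agent: str) -> None:
        super().add(user, agent)
        self.standby.add(user, agent)

    def delete(self, user: str) -> None:
        super().delete(user)
        self.standby.delete(user)


standby = SessionStore()
primary = ReplicatedStore(standby)
primary.add("alice", "agent1")
primary.add("bob", "agent2")
primary.delete("bob")

print(active_node(primary_up=False, secondary_up=True))  # secondary
print(standby.sessions)                                   # {'alice': 'agent1'}
```

Because the standby sees every addition and deletion, it already holds the current session list at the moment it has to take over.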

Both machines have a unique hostname and a unique IP address, but there is also a third IP address that is active only on the node currently responsible for the VSM server service. This is the so-called HA address, the address that clients connect to.
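To make the addressing concrete, the sketch below lists which addresses each node answers on depending on which node is active. The addresses are assumptions taken from the 192.0.2.0/24 documentation range, and the function name `addresses_by_node` is hypothetical; none of these values come from ThinLinc itself.

```python
from typing import Dict, List

# Example addresses from the documentation range 192.0.2.0/24 (assumed values).
NODE_ADDRESSES: Dict[str, str] = {
    "primary": "192.0.2.11",
    "secondary": "192.0.2.12",
}
HA_ADDRESS = "192.0.2.10"   # the address clients connect to


def addresses_by_node(active: str) -> Dict[str, List[str]]:
    """List the addresses each node answers on, given which node is active."""
    if active not in NODE_ADDRESSES:
        raise ValueError(f"unknown node: {active}")
    table = {name: [addr] for name, addr in NODE_ADDRESSES.items()}
    table[active].append(HA_ADDRESS)   # the HA address lives only on the active node
    return table


print(addresses_by_node("primary"))
print(addresses_by_node("secondary"))
```

Clients only ever need to know `HA_ADDRESS`; which physical machine answers on it changes on failover without any client reconfiguration.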

The software the machines use to keep track of each other's status is Heartbeat, a widely used HA package for Linux that is also available for Solaris.