6.4. Testing Correct Failover Behavior
After installing ThinLinc in an HA configuration, test that failover works without problems. Run these tests while the hardware is known to be healthy, so that you know failover will work as it should when a real hardware failure occurs.
As a first test, stop Heartbeat on the primary node. If everything works as expected, the secondary node will take over operations within a few seconds. The takeover is this fast because Heartbeat, when shut down, signals the other node that it is really shutting down, so the Heartbeat instance on that node can react very quickly.
[root@tlha-primary root] /etc/init.d/heartbeat stop
Verify that the secondary node is now active by checking that the HA interface is up on it. Make sure that connections using tlclient succeed.
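The interface check can be scripted. The following is a minimal sketch, assuming the iproute2 ip tool is installed on the nodes. HA_ADDR, has_addr and report_ha_state are names introduced here for illustration, not ThinLinc or Heartbeat settings, and 192.0.2.10 is a placeholder for the address of your HA interface.

```shell
#!/bin/sh
# HA_ADDR is a placeholder (assumption); set it to the address of your HA interface.
HA_ADDR="${HA_ADDR:-192.0.2.10}"

# Succeeds if the given IPv4 address appears in "ip -o -4 addr show"
# output supplied on stdin. The trailing "/" keeps 192.0.2.10 from
# matching 192.0.2.100.
has_addr() {
    grep -qF " inet $1/"
}

# Reports whether this node currently holds the HA address.
report_ha_state() {
    if ip -o -4 addr show | has_addr "$HA_ADDR"; then
        echo "HA address $HA_ADDR is up on $(hostname)"
    else
        echo "HA address $HA_ADDR is NOT up on $(hostname)"
    fi
}
```

Calling report_ha_state on the secondary node after stopping Heartbeat on the primary should report the address as up. This only shows where the address is configured; a login with tlclient is still needed to confirm that the service works.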
Starting Heartbeat again on the primary node should make the primary node the active node within a minute.
[root@tlha-primary root] /etc/init.d/heartbeat start
Verify operations again.
A harder test of the failover capability is to simply turn off the active node, for example by pulling the power plug. When this happens, the secondary node should detect the failure and take over operations. Verify this, then turn the primary node on again. It should take over operations again almost instantly.
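To get a rough figure for how long the takeover takes, the HA address can be polled from a third machine while the test runs. The sketch below assumes a Linux ping that accepts -c and -W and a date that supports +%s. HA_ADDR and the function names are hypothetical, and 192.0.2.10 is a placeholder for your HA address.

```shell
#!/bin/sh
# HA_ADDR is a placeholder (assumption); set it to the address of your HA interface.
HA_ADDR="${HA_ADDR:-192.0.2.10}"

# Succeeds if the HA address answers a single ping within one second.
ha_reachable() {
    ping -c 1 -W 1 "$HA_ADDR" >/dev/null 2>&1
}

# Succeeds if the HA address does not answer.
ha_unreachable() {
    if ha_reachable; then return 1; fi
    return 0
}

# wait_for LIMIT CMD...: retry CMD about once per second until it succeeds,
# then print the number of seconds waited. Returns 1 without printing
# anything if LIMIT seconds pass first.
wait_for() {
    limit=$1
    shift
    start=$(date +%s)
    until "$@"; do
        if [ $(( $(date +%s) - start )) -ge "$limit" ]; then
            return 1
        fi
        sleep 1
    done
    echo $(( $(date +%s) - start ))
}
```

Started just before the active node is powered off, a command such as "wait_for 300 ha_unreachable >/dev/null && wait_for 120 ha_reachable" prints approximately how many seconds passed before the address answered again. The limits of 300 and 120 seconds are arbitrary. A ping reply shows only that the address has moved, so it complements the tlclient check and does not replace it.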
Please note that if the VSM server machines are also used as VSM agents, you must make sure that no sessions are running on a machine before it is turned off for testing purposes.
A less drastic test is to unplug the network cable of the primary node. The secondary node will detect that it can't reach the primary, and take over service. When the primary node is reconnected to the network, both hosts will have the HA interface up and running for a very short moment, but they will quickly detect that they are in contact again, and the secondary will take down its interface.