6.3.  Recovering from Hardware Failures

If the secondary node has been forced to take over service because the primary node failed, it is important to know how to recover.

6.3.1.  Recovering from Minor Failures

If the primary went down because of a minor failure (overheating, a faulty processor, faulty memory, etc.) and the contents of the files in /var/lib/vsm are untouched, recovery is simple and fully automatic. Start the server and let the two VSM servers resynchronize with each other.

6.3.2.  Recovering from Catastrophic Failure

If a catastrophic failure has occurred and no data on the disks of the primary can be recovered, ThinLinc needs to be reinstalled and HA must be reinitialized.

Install ThinLinc as described in Section 6.2.1, “Installation of a New HA Cluster”. After enabling HA in the configuration file, but before starting the VSM server, copy the file /var/lib/vsm/sessions from the secondary to the primary. This preloads the database of active sessions on the primary with more current values.
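
The copy step above could be scripted along the following lines. This is only a minimal sketch, not part of ThinLinc: it assumes that it is run on the reinstalled primary, that scp is available, and that the given user can log in to the secondary over SSH. The function names, the root user default and any hostname passed in are hypothetical; only the path /var/lib/vsm/sessions comes from this section.

```python
import subprocess

SESSIONS_FILE = "/var/lib/vsm/sessions"  # path named in this section


def build_copy_command(secondary_host, user="root", path=SESSIONS_FILE):
    """Return the scp argument list that pulls the session database
    from the secondary node to the same path on this (primary) node.

    secondary_host and user are assumptions supplied by the caller.
    """
    # -p asks scp to preserve modification times and file modes.
    return ["scp", "-p", f"{user}@{secondary_host}:{path}", path]


def preload_sessions(secondary_host, user="root", run=subprocess.run):
    """Copy the session database from the secondary.

    Intended to be run on the primary after HA has been enabled in the
    configuration file and before the VSM server is started.
    """
    # check=True raises CalledProcessError if scp exits with an error,
    # so a failed copy is not silently ignored.
    return run(build_copy_command(secondary_host, user), check=True)
```

For example, calling preload_sessions("secondary.example.com") on the primary would run the scp command built by build_copy_command; the hostname here is a placeholder. Building the command separately makes it possible to print and inspect it before anything is copied.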