Understanding the Operations for Managing Failover
The following operations are available for managing failover:
Normal Operation: In the normal mode of operation, the machine with the highest priority acts as the master machine and the other machine acts as the slave machine. The master discovers the nodes and uploads its database. The slave machine periodically gets a database snapshot from the master machine. The master and slave machines exchange a heartbeat signal every n seconds (configurable).
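The priority rule above can be sketched as follows. This is only an illustration: the Machine class, the assign_roles function, the machine names, and the convention that a larger number means a higher priority are all assumptions, not part of the product.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str      # hypothetical machine name
    priority: int  # assumption: a larger number means a higher priority


def assign_roles(machines):
    """Return {name: role}; the highest-priority machine becomes master."""
    master = max(machines, key=lambda m: m.priority)
    return {m.name: ("master" if m is master else "slave") for m in machines}


roles = assign_roles([Machine("nms-a", 10), Machine("nms-b", 5)])
```

Here nms-a has the higher priority, so it is assigned the master role and nms-b the slave role.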
Machine Failure or Application Failure: If the master machine or its application fails, the slave machine stops receiving heartbeats. After missing consecutive heartbeats, the slave machine examines the network conditions. If the master machine has failed, the slave machines exchange general takeover messages that carry their priorities, and the one with the highest priority becomes the master. When the previous master is restored, it behaves as a slave: it takes a snapshot from the current master and starts exchanging heartbeats with it.
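A minimal sketch of the detection and takeover steps is shown below. The HeartbeatMonitor class, the elect_master function, the 5-second interval, and the threshold of 3 missed heartbeats are hypothetical; the text above does not state how many consecutive heartbeats must be missed. The network-condition check is left out of the sketch.

```python
class HeartbeatMonitor:
    """Tracks heartbeats received from the master (illustrative only)."""

    def __init__(self, interval_s, max_missed):
        self.interval_s = interval_s  # the configurable "n seconds"
        self.max_missed = max_missed  # assumed threshold, not from the product
        self.last_seen = 0.0

    def record_heartbeat(self, now):
        self.last_seen = now

    def missed(self, now):
        # Number of whole heartbeat intervals elapsed since the last heartbeat.
        return int((now - self.last_seen) // self.interval_s)

    def master_suspected(self, now):
        return self.missed(now) >= self.max_missed


def elect_master(priorities):
    """priorities: {machine name: priority} gathered from takeover messages.

    Assumption: a larger number means a higher priority.
    """
    return max(priorities, key=priorities.get)


monitor = HeartbeatMonitor(interval_s=5.0, max_missed=3)
monitor.record_heartbeat(now=100.0)
suspected_early = monitor.master_suspected(now=104.0)  # no interval elapsed yet
suspected_late = monitor.master_suspected(now=116.0)   # three intervals elapsed
new_master = elect_master({"nms-b": 5, "nms-c": 7})
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic and easy to test.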
Link Failure: If the link between the master and the slave machine fails, the slave machine stops receiving heartbeats. The node traps are still sent to the master machine because the machine itself has not failed. No switchover from master to slave occurs; the master continues to be the master machine. If the link between the master and the nodes fails, the node traps are sent to the slave machine. A switchover from master to slave occurs, and the slave machine becomes the master machine.
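The two link-failure cases can be summarized in a small decision function. The LinkFailure enum and the after_link_failure function are hypothetical names used only for illustration.

```python
from enum import Enum


class LinkFailure(Enum):
    MASTER_TO_SLAVE = "link between master and slave"
    MASTER_TO_NODES = "link between master and nodes"


def after_link_failure(failure):
    """Return (machine that receives node traps, whether a switchover occurs)."""
    if failure is LinkFailure.MASTER_TO_SLAVE:
        # The master itself is healthy, so it keeps receiving traps.
        return ("master", False)
    # The master cannot reach the nodes, so the slave takes over.
    return ("slave", True)


result_a = after_link_failure(LinkFailure.MASTER_TO_SLAVE)
result_b = after_link_failure(LinkFailure.MASTER_TO_NODES)
```

In the first case the traps stay with the master and no switchover occurs; in the second case the traps go to the slave, which then becomes the master.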
Forced Switch: You can manage the failover condition by performing a Hotstandby switchover.