DR Failover and Recovery in a 2 Node Cluster
Important
A 2-node cluster will not fail over automatically.
With only two Unified nodes, with or without Web proxies, there is no High Availability; only redundancy is available. The database on the primary node is read/write, while the database on the secondary node is read-only.
- If the primary node fails, the lost node must be deleted from the cluster manually on the secondary node, followed by a cluster provision.
- If the secondary node fails, it needs to be replaced.
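In practice, the manual failover comes down to two commands run on the surviving secondary node, both detailed in the Recovery Steps below; the IP address shown is that of the lost primary node in the example cluster used in this section:

cluster del 172.29.42.100
cluster provision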
Scenario: Loss of Primary Node
The administrator deployed the 2-node cluster:
$ cluster status

Data Centre: jhb
    application : AS01[172.29.42.100] AS02[172.29.42.101]
    webproxy : AS01[172.29.42.100] AS02[172.29.42.101]
    database : AS01[172.29.42.100] AS02[172.29.42.101]
Example database weights:
$ database weight list

172.29.42.100:
    weight: 20
172.29.42.101:
    weight: 10
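These weights correspond to the priority values that database config reports for the database members (the full output appears near the end of this section), and the node with the higher weight is preferred as the replica set primary. An abbreviated view:

$ database config
    members:
        172.29.42.100:27020:
            priority: 20.0
            stateStr: PRIMARY
        172.29.42.101:27020:
            priority: 10.0
            stateStr: SECONDARY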
Node failure: the primary node on the primary site is lost and stops responding:
$ cluster status

Data Centre: unknown
    application : unknown_172.29.42.100[172.29.42.100] (not responding)
    webproxy : unknown_172.29.42.100[172.29.42.100] (not responding)
    database : unknown_172.29.42.100[172.29.42.100] (not responding)

Data Centre: jhb
    application : AS02[172.29.42.101]
    webproxy : AS02[172.29.42.101]
    database : AS02[172.29.42.101]
Recovery Steps
The primary node server is lost.
It is decided to fail over to the secondary node:
On the secondary node, remove the lost server from the cluster:
cluster del 172.29.42.100
On the secondary node, run cluster provision (it is recommended that this step be run in a terminal opened with the screen command).
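A minimal sketch of that workflow, assuming the platform's screen command behaves like standard GNU screen:

$ screen
$ cluster provision

If the connection drops during provisioning, log back in to the secondary node and reattach to the session with screen -r.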
On the secondary node, check:
$ cluster status

Data Centre: jhb
    application : AS02[172.29.42.101]
    webproxy : AS02[172.29.42.101]
    database : AS02[172.29.42.101]
It is decided to recover the primary node:
Install a new Unified node server to replace the lost primary node, then switch on the newly installed server.
On the secondary node, add the server. Run cluster add 172.29.42.100.
On either node, check:
$ cluster status

Data Centre: jhb
    application : AS01[172.29.42.100] AS02[172.29.42.101]
    webproxy : AS01[172.29.42.100] AS02[172.29.42.101]
    database : AS01[172.29.42.100] AS02[172.29.42.101]
Configure the primary database. On the newly installed server, run cluster provision primary 172.29.42.100 (it is recommended that this step be run in a terminal opened with the screen command).
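As with the earlier provision step, this can be run inside a screen session so that it survives a dropped connection; a minimal sketch under the same assumptions:

$ screen
$ cluster provision primary 172.29.42.100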
Check the database configuration on both nodes and verify that the recovered node (172.29.42.100) reports stateStr: PRIMARY while the other node reports SECONDARY, for example:
$ database config

date:
    $date: 1549450382862
heartbeatIntervalMillis: 2000
members:
    172.29.42.100:27020:
        priority: 20.0
        stateStr: PRIMARY
        storageEngine: WiredTiger
    172.29.42.100:27030:
        priority: 1.0
        stateStr: ARBITER
        storageEngine: Unknown
    172.29.42.101:27020:
        priority: 10.0
        stateStr: SECONDARY
        storageEngine: WiredTiger
myState: 1
ok: 1.0
set: DEVICEAPI
term: 8
If an OVA file was not available for your current release and the new Unified node was created from the most recent release OVA that has an upgrade path to your release, re-apply the Delta Bundle upgrade to the cluster.
The version mismatch between the new node and the rest of the cluster can be ignored, since this upgrade step aligns the versions.
See: Upgrade