Election of a New Primary and Failover
If unified nodes fail, the system follows a failover procedure. For details on the failover and DR process, refer to the topics in the Platform Guide.
If the primary database is lost, the remaining database nodes elect a new primary database. Each node in a cluster is allocated a number of votes that are used in this failover election, and the running node with the highest database weight is elected as the new primary.
The database weight of a node is shown as its priority value in the output of the database config command. Note that the database weight of a node does not necessarily match its number of votes.
```
$ database config
date: 2016-04-25T09:50:34Z
members:
    172.29.21.101:27020:
        priority: 16
        stateStr: PRIMARY
    172.29.21.101:27030:
        stateStr: ARBITER
    172.29.21.102:27020:
        priority: 8
        stateStr: SECONDARY
    172.29.21.102:27030:
        stateStr: ARBITER
    172.29.21.103:27020:
        priority: 4
        stateStr: SECONDARY
    172.29.21.103:27030:
        stateStr: ARBITER
    172.29.21.104:27020:
        priority: 2
        stateStr: SECONDARY
myState: 1
ok: 1
set: DEVICEAPI
```
The maximum number of votes in a cluster should not exceed 7; arbiter votes are added to nodes to bring the total to 7 votes.
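The election rule can be pictured with a short sketch, assuming the member layout from the example output above: every listed member (database node or arbiter) carries one vote, a new primary can only be elected while the reachable members hold a strict majority of the 7 votes, and the running database node with the highest weight wins. The function and data names below are illustrative only and are not part of the platform CLI or API.

```python
# Minimal sketch of the election rule, assuming the member layout shown
# in the example "database config" output above (4 database nodes and
# 3 arbiters, 7 votes in total). Names are illustrative, not a platform API.

MEMBERS = {
    # address: (database weight / priority, is_arbiter)
    "172.29.21.101:27020": (16, False),
    "172.29.21.101:27030": (None, True),
    "172.29.21.102:27020": (8, False),
    "172.29.21.102:27030": (None, True),
    "172.29.21.103:27020": (4, False),
    "172.29.21.103:27030": (None, True),
    "172.29.21.104:27020": (2, False),
}

def elect_primary(reachable: set[str]) -> str | None:
    """Return the new primary's address, or None if a majority of votes is lost."""
    total_votes = len(MEMBERS)  # each member, node or arbiter, carries one vote
    if 2 * len(reachable & set(MEMBERS)) <= total_votes:
        return None  # no majority of votes remains: manual recovery is required
    # Among the reachable database nodes, the highest weight wins.
    candidates = {addr: weight for addr, (weight, is_arb) in MEMBERS.items()
                  if addr in reachable and not is_arb}
    return max(candidates, key=candidates.get)

# Example: the primary server 172.29.21.101 is lost (database node and arbiter).
up = set(MEMBERS) - {"172.29.21.101:27020", "172.29.21.101:27030"}
print(elect_primary(up))  # -> 172.29.21.102:27020 (highest remaining weight, 8)
```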
The tables below show the system status and failover behavior for a selection of scenarios in 6-node and 8-node clusters. Also refer to the topics on the specific DR scenarios. The abbreviations used are as follows:
- Pri: Primary site
- DR: DR site
- N: node. The primary node is N1, the secondary node is N2.
- w: database weight
- v: vote
- a: arbiter vote
Not all scenarios are listed for 8-node clusters, and example weights have been allocated.
For a 6-node cluster with 4 database nodes and 2 sites, the initial votes are allocated as follows:

- Primary database node and nodes 2-3: 2 votes each (1 vote + 1 arbiter vote)
- Secondary database node 4: 1 vote (no arbiter)
| Pri N1 w:40 v:1 a:1 | Pri N2 w:30 v:1 a:1 | DR N3 w:20 v:1 a:1 | DR N4 w:10 v:1 | Votes | System Status under scenario |
|---|---|---|---|---|---|
| Up | Up | Up | Up | 7 | System is functioning normally. |
| Up | Up | Up | Down | 6 | Scenario: Loss of a Non-primary Server in the DR Site. System continues functioning normally. |
| Up | Up | Down | Up | 6 | Scenario: Loss of a Non-primary Server in the DR Site. System continues functioning normally. |
| Up | Down | Up | Up | 6 | Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally. |
| Down | Up | Up | Up | 5 | Scenario: Loss of the Primary Database Server. Some downtime occurs. System automatically fails over to N2. |
| Down | Down | Up | Up | 3 | Scenario: Loss of a Primary Site. Manual recovery required. |
| Up | Up | Down | Down | 4 | System continues functioning normally. |
| Up | Down | Down | Up | 3 | Manual recovery required. |
| Up | Down | Up | Down | 4 | System continues functioning normally. |
For an 8-node cluster with 6 database nodes and 2 sites, the initial votes are allocated as follows (the sketch after this list illustrates the allocation):

- Primary database node: 2 votes (1 vote + 1 arbiter voting member)
- Secondary database nodes: 5 votes in total (no arbiter votes)
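As a rough illustration (not a platform API), the helper below tops the database-node votes up to the 7-vote total with arbiter votes. In both documented examples the arbiters sit on the highest-weight nodes first, which is the assumption made here.

```python
# Illustrative only: top the database-node votes up to the 7-vote total
# with arbiter votes. In both documented examples the arbiters are placed
# on the highest-weight nodes first, which is the assumption made here.

TOTAL_VOTES = 7

def allocate_arbiters(weights: dict[str, int]) -> dict[str, int]:
    """Return the number of arbiter votes per node (each node already has 1 vote)."""
    arbiters_needed = TOTAL_VOTES - len(weights)
    by_weight = sorted(weights, key=weights.get, reverse=True)
    return {node: (1 if i < arbiters_needed else 0)
            for i, node in enumerate(by_weight)}

# 8-node cluster (6 database nodes): only N1 carries an arbiter vote.
print(allocate_arbiters({"N1": 60, "N2": 50, "N3": 40, "N4": 30, "N5": 20, "N6": 10}))
# {'N1': 1, 'N2': 0, 'N3': 0, 'N4': 0, 'N5': 0, 'N6': 0}

# 6-node cluster (4 database nodes): N1, N2 and N3 each carry an arbiter vote.
print(allocate_arbiters({"N1": 40, "N2": 30, "N3": 20, "N4": 10}))
# {'N1': 1, 'N2': 1, 'N3': 1, 'N4': 0}
```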
The table here shows a representative selection of scenarios.
| Pri N1 w:60 v:1 a:1 | Pri N2 w:50 v:1 | Pri N3 w:40 v:1 | Pri N4 w:30 v:1 | DR N5 w:20 v:1 | DR N6 w:10 v:1 | Votes | System Status under scenario |
|---|---|---|---|---|---|---|---|
| Up | Up | Up | Up | Up | Up | 7 | System is functioning normally. |
| Up | Up | Up | Down | Down | Down | 4 | Scenarios: Loss of a Non-primary Node in the Primary and Secondary Site. System continues functioning normally. |
| Up | Up | Up | Up | Down | Up | 6 | Scenario: Loss of a Non-primary Server in the DR Site. System continues functioning normally. |
| Up | Down | Up | Up | Up | Up | 6 | Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally. |
| Up | Down | Down | Up | Up | Up | 6 | Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally. |
| Down | Up | Up | Up | Up | Up | 6 | Scenario: Loss of the Primary Database Server. Some downtime occurs. System automatically fails over to N2. |
| Down | Down | Up | Up | Up | Up | 4 | Some downtime occurs. System automatically fails over to N3. |
| Down | Down | Down | Up | Up | Up | 3 | Manual recovery required. |
| Down | Down | Down | Down | Up | Up | 2 | Scenario: Loss of a Primary Site. Manual recovery required. |
| Up | Up | Down | Up | Up | Up | 6 | Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally. |
| Up | Up | Down | Down | Up | Up | 5 | Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally. |
| Up | Up | Down | Down | Down | Up | 4 | Scenarios: Loss of a Non-primary Node in the Primary and Secondary Site. System continues functioning normally. |
| Up | Up | Down | Down | Down | Down | 3 | Manual recovery required. |
| Up | Down | Up | Down | Down | Down | 3 | Manual recovery required. |
As the representative table above shows, the 8-node status and scenarios are similar for a number of permutations of nodes. For example, the failure of a single node N2, N3, or N4 results in the same scenario:
Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.
Upon recovery, there is typically a delay of 10 to 20 minutes before transaction processing resumes.

The list below shows the individual nodes (N1 to N6) and the groups of nodes whose failure results in the same failover scenario:
- N2, N3, N4
- N5, N6
- N2+N3, N2+N4, N3+N4
- N1+N2+N3, N1+N2+N4, N1+N3+N4
- N1+N5, N1+N6
- N2+N5, N2+N6, N3+N5, N3+N6, N4+N5, N4+N6
- N2+N3+N4
- N2+N3+N5, N2+N3+N6, N2+N4+N5, N2+N4+N6, N3+N4+N5, N3+N4+N6
- N5+N6
A failure in other groupings requires manual recovery, for example in groups such as the following (see the sketch after this list):
- N1+N2+N3, N1+N2+N4, N1+N2+N5, N1+N2+N6, N1+N3+N4, N1+N3+N5, N1+N3+N6, N1+N4+N5, N1+N4+N6, N1+N5+N6
- N2+N3+N4+N5, N2+N3+N4+N6, N3+N4+N5+N6
- N1+N2+N3+N4, N1+N2+N3+N5, N1+N2+N3+N6, N1+N3+N4+N5, N1+N3+N4+N6, N1+N4+N5+N6
- N1+N2+N3+N4+N5, N1+N2+N3+N4+N6
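These groupings follow from the vote counts in the table: a combination of failures can be handled automatically only while the surviving members hold a strict majority of the 7 votes, and downtime occurs whenever the primary (N1) is among the failed nodes. The sketch below is illustrative only; it enumerates failure combinations for the 8-node example and groups them by outcome.

```python
# Illustrative sketch: classify failure combinations in the 8-node
# example by outcome, using the vote counts from the table above
# (N1 holds 2 votes including its arbiter, N2-N6 hold 1 vote each).

from itertools import combinations

VOTES = {"N1": 2, "N2": 1, "N3": 1, "N4": 1, "N5": 1, "N6": 1}
TOTAL = sum(VOTES.values())  # 7

def outcome(failed: tuple[str, ...]) -> str:
    surviving = TOTAL - sum(VOTES[n] for n in failed)
    if 2 * surviving <= TOTAL:
        return "manual recovery required"
    if "N1" in failed:
        return "automatic failover, some downtime"
    return "continues functioning normally"

# Group every combination of up to three failed nodes by its outcome.
groups: dict[str, list[str]] = {}
for size in (1, 2, 3):
    for combo in combinations(VOTES, size):
        groups.setdefault(outcome(combo), []).append("+".join(combo))

for result, combos in groups.items():
    print(f"{result}: {', '.join(combos)}")
```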