Election of a New Primary and Failover

If unified nodes fail, the system follows a failover procedure. For details on the failover and disaster recovery (DR) process, refer to the topics in the Platform Guide.

If the primary database is lost, the remaining database nodes elect a new primary database. Each node in a cluster is allocated a number of votes that are used in this failover election. The node elected is the running node with the highest database weight.

The database weight of a node is shown as the priority value in the output of the database config command. Note that the database weight of a node does not necessarily match its number of votes.

$ database config
    date: 2016-04-25T09:50:34Z
    members:
        172.29.21.101:27020:
            priority: 16
            stateStr: PRIMARY
        172.29.21.101:27030:
            stateStr: ARBITER
        172.29.21.102:27020:
            priority: 8
            stateStr: SECONDARY
        172.29.21.102:27030:
            stateStr: ARBITER
        172.29.21.103:27020:
            priority: 4
            stateStr: SECONDARY
        172.29.21.103:27030:
            stateStr: ARBITER
        172.29.21.104:27020:
            priority: 2
            stateStr: SECONDARY
    myState: 1
    ok: 1
    set: DEVICEAPI

The total number of votes in a cluster should not exceed 7. Arbiter votes are added to nodes to bring the total to 7 votes.

The tables below show the system status and failover for a selection of scenarios for 6 node and 8 node clusters. Also refer to the topics on the specific DR scenarios. The abbreviations used are as follows:

Pri : Primary site
DR : DR site
N : node. Primary node is N1, secondary node is N2.
w : database weight
v : vote
a : arbiter vote

Not all scenarios are listed for 8 node clusters, and the weights shown are examples.

For a 6 node cluster with 4 database nodes and 2 sites, initial votes are as follows:

Primary database node and nodes 2-3: 2 votes each (1 + 1 arbiter)
Secondary database node 4: 1 vote (no arbiter)

Pri N1 w:40 v:1 a:1	Pri N2 w:30 v:1 a:1	DR N3 w:20 v:1 a:1	DR N4 w:10 v:1	Votes	System Status under scenario
Up	Up	Up	Up	7	System is functioning normally.
Up	Up	Up	Down	6	Scenario: Loss of a Non-primary Server in the DR Site. System continues functioning normally.
Up	Up	Down	Up	5	Scenario: Loss of a Non-primary Server in the DR Site. System continues functioning normally.
Up	Down	Up	Up	5	Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.
Down	Up	Up	Up	5	Scenario: Loss of the Primary Database Server. Some downtime occurs. System automatically fails over to N2.
Down	Down	Up	Up	3	Scenario: Loss of a Primary Site. Manual recovery required
Up	Up	Down	Down	4	System continues functioning normally.
Up	Down	Down	Up	3	Manual recovery required
Up	Down	Up	Down	4	System continues functioning normally.
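
The vote arithmetic behind the 6 node table can be sketched in Python as follows. The votes and weights are taken from the table header. The rule that a strict majority of the 7 votes must remain is an assumption inferred from the rows above (4 votes: system continues; 3 votes: manual recovery), and the function name cluster_status is hypothetical.

```python
# Votes per node from the 6 node table header: own vote plus arbiter vote, if any.
VOTES = {"N1": 2, "N2": 2, "N3": 2, "N4": 1}
# Example database weights from the same header.
WEIGHTS = {"N1": 40, "N2": 30, "N3": 20, "N4": 10}


def cluster_status(down):
    """Return (remaining votes, status) for a set of failed nodes."""
    up = [node for node in VOTES if node not in down]
    votes = sum(VOTES[node] for node in up)
    total = sum(VOTES.values())
    # Assumption: a strict majority of all votes is needed to keep or elect a primary.
    if 2 * votes <= total:
        return votes, "manual recovery required"
    # The running node with the highest weight holds or takes the primary role.
    primary = max(up, key=WEIGHTS.get)
    if primary == "N1":
        return votes, "functioning normally"
    return votes, f"fails over to {primary}"


print(cluster_status({"N1"}))        # (5, 'fails over to N2')
print(cluster_status({"N1", "N2"}))  # (3, 'manual recovery required')
print(cluster_status({"N3", "N4"}))  # (4, 'functioning normally')
```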

For an 8 node cluster with 6 database nodes and 2 sites, initial votes are as follows:

Primary database node: 2 (1 + 1 arbiter voting member)
Secondary database nodes total: 5 (no arbiter votes)

The table here shows a representative selection of scenarios.

Pri N1 w:60 v:1 a:1	Pri N2 w:50 v:1	Pri N3 w:40 v:1	Pri N4 w:30 v:1	DR N5 w:20 v:1	DR N6 w:10 v:1	Votes	System Status under scenario
Up	Up	Up	Up	Up	Up	7	System is functioning normally.
Up	Up	Up	Down	Down	Down	4	Scenarios: Loss of a Non-primary Node in the Primary and Secondary Site. System continues functioning normally.
Up	Up	Up	Up	Down	Up	6	Scenario: Loss of a Non-primary Server in the DR Site. System continues functioning normally.
Up	Down	Up	Up	Up	Up	6	Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.
Up	Down	Down	Up	Up	Up	5	Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.
Down	Up	Up	Up	Up	Up	5	Scenario: Loss of the Primary Database Server. Some downtime occurs. System automatically fails over to N2.
Down	Down	Up	Up	Up	Up	4	Some downtime occurs. System automatically fails over to N3.
Down	Down	Down	Up	Up	Up	3	Manual recovery required
Down	Down	Down	Down	Up	Up	2	Scenario: Loss of a Primary Site. Manual recovery required
Up	Up	Down	Up	Up	Up	6	Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.
Up	Up	Down	Down	Up	Up	5	Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.
Up	Up	Down	Down	Down	Up	4	Scenarios: Loss of a Non-primary Node in the Primary and Secondary Site. System continues functioning normally.
Up	Up	Down	Down	Down	Down	3	Manual recovery required
Up	Down	Up	Down	Down	Down	3	Manual recovery required

As the representative table above shows, the status and scenarios for an 8 node cluster are the same for a number of permutations of failed nodes. For example, the failure of a single node N2, N3 or N4 results in the same scenario:

Scenario: Loss of a Non-primary Node in the Primary Site. System continues functioning normally.

Upon recovery, there is typically a delay of 10-20 minutes before transaction processing continues.

Each line in the list below shows individual nodes (N1 to N6) or groups of nodes whose failure results in the same failover scenario.

N2, N3, N4
N5, N6
N2+N3, N2+N4, N3+N4
N1+N2, N1+N3, N1+N4
N1+N5, N1+N6
N2+N5, N2+N6, N3+N5, N3+N6, N4+N5, N4+N6
N2+N3+N4
N2+N3+N5, N2+N3+N6, N2+N4+N5, N2+N4+N6, N3+N4+N5, N3+N4+N6
N5+N6

The failure of other groupings requires manual recovery, for example:

N1+N2+N3, N1+N2+N4, N1+N2+N5, N1+N2+N6, N1+N3+N4, N1+N3+N5, N1+N3+N6, N1+N4+N5, N1+N4+N6, N1+N5+N6
N2+N3+N4+N5, N2+N3+N4+N6, N3+N4+N5+N6
N1+N2+N3+N4, N1+N2+N3+N5, N1+N2+N3+N6, N1+N3+N4+N5, N1+N3+N4+N6, N1+N4+N5+N6
N1+N2+N3+N4+N5, N1+N2+N3+N4+N6
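
The groupings that require manual recovery can be enumerated with a short Python sketch. The votes come from the 8 node table header (N1 holds its own vote plus one arbiter vote). The strict-majority rule is an assumption inferred from the table rows, and the names needs_manual_recovery and groups are hypothetical.

```python
from itertools import combinations

# Votes per node from the 8 node table header.
VOTES = {"N1": 2, "N2": 1, "N3": 1, "N4": 1, "N5": 1, "N6": 1}
TOTAL_VOTES = sum(VOTES.values())  # 7


def needs_manual_recovery(down):
    """True if the surviving votes are not a strict majority (assumed rule)."""
    remaining = TOTAL_VOTES - sum(VOTES[node] for node in down)
    return 2 * remaining <= TOTAL_VOTES


def groups(size):
    """List every group of `size` failed nodes that needs manual recovery."""
    return [
        "+".join(combo)
        for combo in combinations(VOTES, size)
        if needs_manual_recovery(combo)
    ]


print(groups(2))  # []
print(", ".join(groups(3)))
# N1+N2+N3, N1+N2+N4, N1+N2+N5, N1+N2+N6, N1+N3+N4, N1+N3+N5, N1+N3+N6, N1+N4+N5, N1+N4+N6, N1+N5+N6
```

Under this assumption, no two-node failure requires manual recovery, and a three-node failure requires it only when N1 is one of the failed nodes.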
