DR Failover and Recovery in a 2-Node Cluster

Important

A 2-node cluster will not fail over automatically.

With only two Unified nodes, with or without Web proxies, there is no High Availability. The database on the primary node is read/write, while the database on the secondary node is read-only.

Only redundancy is available.

  • If the primary node fails, the primary node must be manually deleted from the cluster on the secondary node and a cluster provision must then be run (see the command outline after this list).
  • If the secondary node fails, it needs to be replaced.
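
In outline, the manual failover and recovery reduce to the following commands, which are described in detail in the scenario below (the IP address is that of the example primary node used there). First, on the secondary node:

    $ cluster del 172.29.42.100
    $ cluster provision

Then, once a replacement primary server has been installed, on the secondary node:

    $ cluster add 172.29.42.100

and finally, on the newly installed server:

    $ cluster provision primary 172.29.42.100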

Scenario: Loss of Primary Node

  • The administrator deployed the 2-node cluster.

    $ cluster status
    
    Data Centre: jhb
                 application : AS01[172.29.42.100]
                               AS02[172.29.42.101]
    
                 webproxy :    AS01[172.29.42.100]
                               AS02[172.29.42.101]
    
                 database :    AS01[172.29.42.100]
                               AS02[172.29.42.101]
    

    Example database weights:

    $ database weight list
         172.29.42.100:
             weight: 20
         172.29.42.101:
             weight: 10
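
    For reference, a weight of this kind is typically assigned per node with the database weight add command; the exact syntax below is shown as an assumption and should be verified against the platform CLI reference for your release:

    $ database weight add 172.29.42.100 20
    $ database weight add 172.29.42.101 10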
    
  • Node Failure: if the primary node on the primary site is lost, the cluster status shows:

    $ cluster status
    
    Data Centre: unknown
                 application : unknown_172.29.42.100[172.29.42.100] (not responding)

                 webproxy : unknown_172.29.42.100[172.29.42.100] (not responding)

                 database : unknown_172.29.42.100[172.29.42.100] (not responding)
    
    
    Data Centre: jhb
                 application : AS02[172.29.42.101]

                 webproxy : AS02[172.29.42.101]

                 database : AS02[172.29.42.101]
    

Recovery Steps

The primary node server is lost.

  1. It is decided to fail over to the secondary node:

    1. On the secondary node, remove the lost server from the cluster:

      cluster del 172.29.42.100

    2. On the secondary node, run cluster provision (it is recommended that this step is run in a terminal opened with the screen command). An example is shown after the status check below.

      On the secondary node, check:

      $ cluster status
      
      Data Centre: jhb
                   application : AS02[172.29.42.101]

                   webproxy : AS02[172.29.42.101]

                   database : AS02[172.29.42.101]
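
      For example, the provision step run inside a screen session might look as follows (a minimal sketch; screen is assumed to be available on the node, and its use allows the provision to continue if the SSH connection drops):

      $ screen
      $ cluster provision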
      
  2. It is decided to recover the primary node:

    1. On the secondary node, remove the lost server from the cluster:

      cluster del 172.29.42.100

    2. On the secondary node, run cluster provision (it is recommended that this step is run in a terminal opened with the screen command).

      On the secondary node, check:

      $ cluster status
      
      Data Centre: jhb
                   application : AS02[172.29.42.101]

                   webproxy : AS02[172.29.42.101]

                   database : AS02[172.29.42.101]
      
    3. Switch on the newly installed server.

      On the secondary node, add the server. Run cluster add 172.29.42.100.

      On either node, check:

      $ cluster status
      
      Data Centre: jhb
                   application : AS01[172.29.42.100]
                                 AS02[172.29.42.101]
      
                   webproxy :    AS01[172.29.42.100]
                                 AS02[172.29.42.101]
      
                   database :    AS01[172.29.42.100]
                                 AS02[172.29.42.101]
      
    4. Configure the primary database. On the newly installed server, run cluster provision primary 172.29.42.100 (it is recommended that this step is run in a terminal opened with the screen command).

      Check database configuration on both nodes, for example:

      $ database config
          date:
              $date: 1549450382862
          heartbeatIntervalMillis: 2000
          members:
              172.29.42.100:27020:
                  priority: 20.0
                  stateStr: PRIMARY
                  storageEngine: WiredTiger
              172.29.42.100:27030:
                  priority: 1.0
                  stateStr: ARBITER
                  storageEngine: Unknown
              172.29.42.101:27020:
                  priority: 10.0
                  stateStr: SECONDARY
                  storageEngine: WiredTiger
          myState: 1
          ok: 1.0
          set: DEVICEAPI
          term: 8
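
      The member priorities shown above (20 and 10) correspond to the database weights assigned to the nodes; they can be cross-checked on either node, for example:

      $ database weight list
           172.29.42.100:
               weight: 20
           172.29.42.101:
               weight: 10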
      
    5. If an OVA file was not available for your current release, and the new Unified node was therefore created from the most recent release OVA that has an upgrade path to your release, re-apply the Delta Bundle upgrade to the cluster.

      Note that the version mismatch between the new node and the rest of the cluster can be ignored, since this upgrade step aligns the versions.

      See: Upgrade