.. _DR_Failover_Recovery_2_node:

DR Failover and Recovery in a 2 Node Cluster
--------------------------------------------

.. _19.1.1|VOSS-475:



.. index:: database;database weight
.. index:: cluster;cluster status
.. index:: cluster;cluster provision
.. index:: database;database config
.. index:: cluster;cluster add
.. index:: cluster;cluster del


.. important::
   A 2-node cluster will *not* fail over automatically.

   With only two Unified nodes, with or without Web proxies, there is no High
   Availability; only redundancy is available. The database on the primary node
   is read/write, while the database on the secondary node is read-only.

   * If the primary node fails, the lost node must be deleted manually from the
     cluster on the secondary node, followed by a **cluster provision** (see the
     sketch after this note).
   * If the secondary node fails, it must be replaced with a new node.
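
For orientation, the failover path (recovery option A below) reduces to the
following minimal sketch, using the example addresses from the scenario that
follows:

::

   # On the surviving secondary node (AS02), after the primary node
   # (172.29.42.100) has been lost:
   cluster del 172.29.42.100    # remove the lost primary from the cluster
   cluster provision            # re-provision the remaining node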




Scenario: Loss of Primary Node
..............................

* The administrator deployed the 2-node cluster.

  ::

   $ cluster status

   Data Centre: jhb
                application : AS01[172.29.42.100]
                              AS02[172.29.42.101] 
                 
                webproxy :    AS01[172.29.42.100]
                              AS02[172.29.42.101]

                database :    AS01[172.29.42.100]
                              AS02[172.29.42.101]

  Example database weights:

  ::

   $ database weight list
        172.29.42.100:
            weight: 20
        172.29.42.101:
            weight: 10
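
  The node with the higher weight holds the primary (read/write) database role.
  As a sketch, weights like these would typically have been assigned at
  installation time with a **database weight add** command (the exact command
  form is assumed here; verify it against your platform CLI):

  ::

   # Assumed command form: database weight add <IP> <weight>
   # Higher weight = preferred primary database node.
   database weight add 172.29.42.100 20
   database weight add 172.29.42.101 10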


* Node failure: in the case where the primary node is lost, **cluster status**
  on the secondary node shows the lost node as not responding:

  ::

    $ cluster status

    Data Centre: unknown
                 application : unknown_172.29.42.100[172.29.42.100] (not responding)

                 webproxy : unknown_172.29.42.100[172.29.42.100] (not responding)

                 database : unknown_172.29.42.100[172.29.42.100] (not responding)


    Data Centre: jhb
                 application : AS02[172.29.42.101]

                 webproxy : AS02[172.29.42.101]

                 database : AS02[172.29.42.101]
    

Recovery Steps
..............

The server hosting the primary node is lost. There are two options:

A. It is decided to fail over to the secondary node:

   1. *On the secondary node*, remove the lost server from the cluster:
   
      **cluster del 172.29.42.100**
   
   #. *On the secondary node*, run **cluster provision** (it is recommended that
      this step is run in a terminal opened with the ``tmux`` command). See: :ref:`tmux-command`.
   
      On the secondary node, check:
   
      ::
      
        $ cluster status
   
        Data Centre: jhb
                     application : AS02[172.29.42.101]

                     webproxy : AS02[172.29.42.101]

                     database : AS02[172.29.42.101]

B. It is decided to recover the primary node:

   1. *On the secondary node*, remove the lost server from the cluster:
   
      **cluster del 172.29.42.100**
   
   #. *On the secondary node*, run **cluster provision** (it is recommended that
      this step is run in a terminal opened with the ``tmux`` command).
   
      On the secondary node, check:
   
      ::
      
        $ cluster status
   
        Data Centre: jhb
                     application : AS02[172.29.42.101]

                     webproxy : AS02[172.29.42.101]

                     database : AS02[172.29.42.101]
       
   #. Switch on the newly installed server.
   
      *On the secondary node*, add the server. 
      Run **cluster add 172.29.42.100**.
   
      On either node, check:
   
      ::
      
       $ cluster status
   
       Data Centre: jhb
                    application : AS01[172.29.42.100]
                                  AS02[172.29.42.101] 
                     
                    webproxy :    AS01[172.29.42.100]
                                  AS02[172.29.42.101]
   
                    database :    AS01[172.29.42.100]
                                  AS02[172.29.42.101]

   #. Configure the primary database. *On the newly installed server*, run
      **cluster provision primary 172.29.42.100** (it is recommended that
      this step is run in a terminal opened with the ``tmux`` command).
   
      Check database configuration on both nodes, for example:
   
      ::
   
       $ database config
           date:
               $date: 1549450382862
           heartbeatIntervalMillis: 2000
           members:
               172.29.42.100:27020:
                   priority: 20.0
                   stateStr: PRIMARY
                   storageEngine: WiredTiger
               172.29.42.100:27030:
                   priority: 1.0
                   stateStr: ARBITER
                   storageEngine: Unknown
               172.29.42.101:27020:
                   priority: 10.0
                   stateStr: SECONDARY
                   storageEngine: WiredTiger
           myState: 1
           ok: 1.0
           set: DEVICEAPI
           term: 8
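
In the output above, ``stateStr: PRIMARY`` together with ``myState: 1`` confirms
that the newly installed node holds the primary (read/write) database again, and
the member priorities (20.0 and 10.0) mirror the database weights. As a quick
reference, recovery option B reduces to the following sketch (example addresses
as above; long-running steps are best run inside ``tmux``):

::

   # Steps 1-3: on the surviving secondary node (AS02)
   cluster del 172.29.42.100                  # remove the lost primary from the cluster
   cluster provision                          # re-provision the remaining node
   cluster add 172.29.42.100                  # add the newly installed replacement server

   # Step 4: on the newly installed server
   cluster provision primary 172.29.42.100    # provision it as the primary database node

   # Verify on either node
   cluster status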




Scenario: Loss of Secondary Node - Replace
...........................................

1. On the primary node, remove the lost secondary node from the cluster:

   ::

      cluster del <secondary node IP>

2. On the primary node, re-provision the cluster without the removed node:

   ::

      cluster provision

3. Create a new secondary node: see :ref:`create-a-new-vm-using-the-platform-install-ova`

4. On the newly installed node, run:

   ::

      cluster prepnode

5. From the primary Unified node, run the command below with the IP address
   of the new Unified server to add it to the existing cluster.

   ::

      cluster add <secondary node IP>


6. Re-provision the cluster:

   ::

      cluster provision primary <IP of current primary>
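
As a final check (a sketch using only commands already shown in this section),
confirm on either node that the replacement secondary has joined the cluster and
that the database roles are as expected:

::

   cluster status     # both nodes should be listed under application, webproxy and database
   database config    # the new secondary node should report stateStr: SECONDARY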


