.. _dr-loss-prim-db-node-modular:

Scenario: Loss of the Primary Database Server in a Modular Cluster
------------------------------------------------------------------------

.. _21.1|VOSS-837:

.. index:: voss;voss finalize_transaction
.. index:: database;database weight
.. index:: database;database primary
.. index:: cluster;cluster run
.. index:: cluster;cluster list
.. index:: cluster;cluster provision
.. index:: database;database config
.. index:: cluster;cluster del
.. index:: web;web weight


* The administrator deployed the cluster into a Primary and DR site.
* The cluster is deployed following the Installation Guide.
* The example is a typical cluster deployment: 8 nodes,
  where 3 nodes are database servers, 3 nodes are application nodes
  and 2 nodes are proxy servers.

  The design is preferably split over 2 physical data centers.  


Database Node Failure
........................


* Normal operations continue where the cluster is processing requests and 
  transactions are committed successfully up to the point where a loss of a 
  primary database server is experienced. 
  In this scenario ``DB01[172.29.42.103]`` failed while transactions were running.
* Examine the cluster status by running **cluster status** to determine the failed state:

  ::
  
      Data Centre: unknown

                  database : unknown_172.29.42.103[172.29.42.103] (not responding)


      Data Centre: jhb
                    application : AS01[172.29.42.100]
                                  AS02[172.29.42.101] 

                    webproxy :    PS01[172.29.42.102]

                    database :    DB02[172.29.42.104]

      Data Centre: cpt
                    application : AS03[172.29.21.100]

                    webproxy :   PS02[172.29.21.102]

                    database :   DB03[172.29.21.101]
  

* Some downtime occurs. This can take up to 15 minutes. To speed up
  recovery, restart the services: **cluster run all app start**.
* The loss of the primary database server triggers an election,
  and the running database node with the highest weight becomes primary.
* Check the weights set in the cluster configuration: **database weight list**

::

    platform@AS01:~$ database weight list
        172.29.21.101: 
            weight: 20
        172.29.42.103: 
            weight: 50
        172.29.42.104: 
            weight: 40

* The primary database node ``172.29.42.103`` failed and therefore 
  node ``172.29.42.104`` will become the primary database node after election.
* To find the primary database, run **database primary**.

::

   platform@AS02:~$ database primary
   172.29.42.104

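The election outcome described above can be sketched in Python. This is a minimal illustration only, assuming the
**database weight list** output has the layout shown earlier; the helper names ``parse_weights`` and
``expected_primary`` are hypothetical and are not part of the platform CLI.

.. code-block:: python

   def parse_weights(text):
       """Parse `database weight list` output into {ip: weight}."""
       weights = {}
       current_ip = None
       for raw in text.splitlines():
           line = raw.strip()
           if not line or line.startswith("platform@"):
               continue  # skip blank lines and the shell prompt
           if line.startswith("weight:"):
               if current_ip is not None:
                   weights[current_ip] = int(line.split(":", 1)[1])
           elif line.endswith(":"):
               current_ip = line[:-1]  # an "<ip>:" header line
       return weights


   def expected_primary(weights, failed):
       """Return the surviving node with the highest weight, or None."""
       alive = {ip: w for ip, w in weights.items() if ip not in failed}
       if not alive:
           return None
       return max(alive, key=alive.get)


   SAMPLE = """\
   172.29.21.101:
       weight: 20
   172.29.42.103:
       weight: 50
   172.29.42.104:
       weight: 40
   """

   weights = parse_weights(SAMPLE)
   # With 172.29.42.103 down, the highest remaining weight is 40.
   print(expected_primary(weights, failed={"172.29.42.103"}))  # prints 172.29.42.104

The result only predicts the election. Always confirm the actual primary with **database primary**.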

* At this point *all* transactions that are currently in flight are lost and will not recover.
* The lost transactions have to be replayed or rerun. 

  Bulk load transactions cannot be replayed and have to be rerun.
  Before resubmitting a failed bulk load job, run the following command
  on the primary node CLI to manually clear each failed
  transaction that still has a Processing status *after a service restart*:
     
  **voss finalize_transaction <Trans ID>**
    
  The failed transaction status then changes from Processing to Fail.
  
* With the database server ``DB01[172.29.42.103]`` still down, replaying the failed transactions is successful.

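When the **cluster status** output is captured to a file or script, the failed node can be picked out automatically.
The following Python sketch assumes the ``(not responding)`` marker and the ``name[ip]`` layout shown in the example
output in this section; the function name ``not_responding_nodes`` is hypothetical.

.. code-block:: python

   import re

   # Matches e.g. "unknown_172.29.42.103[172.29.42.103] (not responding)"
   # and captures the IP address inside the square brackets.
   NOT_RESPONDING = re.compile(
       r"\[(\d{1,3}(?:\.\d{1,3}){3})\]\s*\(not responding\)"
   )


   def not_responding_nodes(status_text):
       """Return the IPs flagged '(not responding)' in `cluster status` output."""
       return NOT_RESPONDING.findall(status_text)


   STATUS = """\
   Data Centre: unknown
               database : unknown_172.29.42.103[172.29.42.103] (not responding)

   Data Centre: jhb
               application : AS01[172.29.42.100]
               database :    DB02[172.29.42.104]
   """

   print(not_responding_nodes(STATUS))  # prints ['172.29.42.103']
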

Recovery Steps
................

If the lost server is unrecoverable:

Generally, **cluster provision** must be run every time a node is deleted or added,
even if it is a replacement node. It is recommended that this step is run in a
terminal opened with the **screen** command.

1. Delete its database weight (**database weight del <ip>**),
   for example **database weight del 172.29.42.103**.
#. Run **cluster del 172.29.42.103**, because this server no longer exists. 
   Power off the deleted node, or disable its Network Interface Card.
#. *Only* run **cluster provision primary 172.29.42.104** from the current primary database node if **database config** shows *no* primary.
   Otherwise, run only **cluster provision** from the current primary node.
   It is recommended that this step is run in a terminal opened with the **screen** command.
  
   This server should already have the highest weight, and its database weight 
   can be checked with **database weight list**

   If all the database weights are deleted and provisioning is run again 
   with **cluster provision**, the CLI message is:
   
   'Please select which of the database should be used as the remaining primary 
   by running "database config", selecting a node to sync 
   from (any node that says primary or secondary and is in a good state,
   i.e. not in a 'RECOVERING' or 'STARTUP' state ) and rerun provisioning 
   with "cluster provision primary <db server ip from commmand above>"'

#. A new database node needs to be deployed. Ensure the server name,
   IP information and data centre name are the same as those of the server
   that was lost.
#. Run **cluster provision** on the cluster *without*
   the node to be added.
  
   Create the *new database node* at the *required data center*.
   See: :ref:`create_a_new_VM_using_the_platform-install_OVA`.
#. Run **cluster prepnode** on *all* servers.
#. Run **cluster add <ip>** from an existing node, with the IP address
   of the new database server to add it to the existing cluster.
#. Check the output of the commands: **cluster list** and **cluster status** 
   from the existing node. If the new node does not show up:
   
   a. Run **cluster del <new node>**
   #. Rerun **cluster add <ip>** from *another* existing node, until the new node
      shows up in **cluster list** and **cluster status**.
   #. Verify that the node shows up from all existing nodes. The recovery process 
      may be time consuming.

#. Delete all database weights in the cluster. On a selected database node, *for each database node IP*,
   run **database weight del <IP>**.
#. Re-add all database weights in the cluster. *On each database node*, for each database node IP,
   run **database weight add <IP> <weight>**, considering the following:

   * *For the new database node*, add a database weight lower than the weight of the
     current primary if this will be a secondary, or higher if this will be the new primary.

   When done, check the database weights - either
   individually for each node, or for the cluster by using the command:

   **cluster run application database weight list** 
   
   Make sure all database nodes show correct weights.

#. Make sure the new node is part of the cluster (run **cluster list**) 
   and run **cluster provision primary 172.29.42.104** *from the current primary* 
   (where ``172.29.42.104`` is an example).
   It is recommended that this step is run in a terminal opened with the **screen** command.

   During the provision process, the role of primary will then be transferred from
   the current primary to the node with the highest weight. The role transfer may take
   a significant amount of time, depending on the database size. 

   During the process, typing **app status** from
   the new primary node will still show the database as ``not provisioned``:

   ::

      mongodb v21.1.1 (2021-05-09 13:36)
         |-arbiter             running
         |-database            running (not provisioned)


   To check the progress of the transfer, the database log can be checked. Type
   **log follow mongodb/mongodb/mongodb.log**. When the transfer is complete,
   an entry will show ``sync done`` as in the example below:

   ::

     2021-05-10T14:09:48.639986+00:00 un1 mongod.27020[129593]: [initial sync-0] initial sync done; took 5821s.

   While the primary role transfer is in progress, the system can be used, but bulk database operations should not
   be carried out, because the sync may fall too far behind to complete.

#. If an OVA file was not available for your current release and you used the most recent release OVA
   for which there is an upgrade path to your release to create the new database node, *re-apply* the
   Delta Bundle upgrade to the cluster.

   Note that the new node version mismatch in the cluster can be ignored, since this upgrade step
   aligns the versions.


   .. raw:: html
   
      <p>See: <a class="reference internal" href="../install/multinode-upgrade-Delta.html#upgrade">Upgrade</a></p>

   .. raw:: latex
   
      See the "Upgrade" step in the "Upgrade a Multinode Environment with the Delta Bundle" topic of the 
      Upgrade Guide with Delta Bundle.

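The steps above that delete and re-add the database weights can be planned ahead of time. The Python sketch below only
builds the command strings named in those steps and checks the rule for the new node's weight. The weight ``30`` for the
replacement node and the helper names are assumptions for illustration, not values or tools from the platform.

.. code-block:: python

   def weight_commands(weights):
       """Build the delete and re-add commands for every database node IP."""
       # Run the deletes on one selected database node.
       deletes = [f"database weight del {ip}" for ip in weights]
       # Run the adds on each database node.
       adds = [f"database weight add {ip} {w}" for ip, w in weights.items()]
       return deletes, adds


   def check_new_node_weight(weights, primary_ip, new_ip, new_is_primary):
       """Check the new node's weight against the current primary's weight."""
       if new_is_primary:
           return weights[new_ip] > weights[primary_ip]
       return weights[new_ip] < weights[primary_ip]


   # Hypothetical plan: the replacement node 172.29.42.103 rejoins as a secondary.
   planned = {"172.29.42.104": 40, "172.29.21.101": 20, "172.29.42.103": 30}

   ok = check_new_node_weight(
       planned, "172.29.42.104", "172.29.42.103", new_is_primary=False
   )
   deletes, adds = weight_commands(planned)
   if ok:
       for cmd in deletes + adds:
           print(cmd)

After the commands have been run, verify the result with **database weight list** as described in the steps.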

.. note::

   If cluster provisioning fails at any of the proxy nodes, the following steps can be used
   to complete the provisioning:

   1. Run **database config** and check if nodes are either in STARTUP2 or SECONDARY or PRIMARY
      states with correct arbiter placement.
   2. Log in to the web proxy at both the primary and secondary sites and add a web weight using **web weight add <ip>:443 1**
      for each node that should have a web weight of 1 on the respective proxy.
   3. Run **cluster provision** to mitigate the failure.
   4. Run **cluster run all app status** to check if all the services are up and running after cluster provisioning
      completes.

.. note::

   If the existing nodes in the cluster do not see the new incoming node after **cluster add**,
   try the following steps:

   1. Run **cluster del <ip>** from the primary node, <ip> being the IP of the new incoming node.
   2. Delete all database weights. Run **database weight del <ip>** from the primary database node for the IP
      of each node, including the new incoming node.
   3. Log into a non primary database node and run **cluster add <ip>**, <ip> being the IP
      of the new incoming node.
   4. Re-add all database weights. Run **database weight add <ip> <weight>** from the same session for
      the IP of each node, including the new incoming node.
   5. Use **cluster run database cluster list** to check if all nodes see the new incoming node inside the
      cluster.