.. _dr-scenario-loss-of-full-cluster-modular:

Scenario: Loss of Full Cluster in a Modular Cluster
------------------------------------------------------------

.. _21.1|VOSS-837:

.. index:: voss;voss finalize_transaction
.. index:: voss;voss queues
.. index:: database;database weight
.. index:: database;database config
.. index:: cluster;cluster run
.. index:: cluster;cluster provision

Background
...........

* The administrator deployed the cluster into a primary and a DR site.
* The cluster is deployed following the *Installation Guide*.
* The example is a typical cluster deployment: 8 nodes, where 3 nodes are database
  servers, 3 nodes are application nodes and 2 nodes are proxy servers. The design
  is preferably split over 2 physical data centers.
* The cluster might also span two geographically dispersed areas. In that case, the
  cluster has to be installed using two different site names or data center names.

Full cluster failure
....................

* In this scenario, *all* nodes failed while transactions were running.
* At this point, *all* transactions that were in flight are lost and will not recover.
* The lost transactions have to be rerun.
* The cluster will not be operational and manual intervention is needed to recover it.
* To recover the cluster, carry out the Recovery Steps.

Recovery Steps
..............

.. important::

   * Prerequisite: a system backup exported to a remote backup location. The backup
     file on the remote location typically has a *.tar.gz* format. This recovery
     procedure will *only* succeed if you have a valid, recent backup to restore.
   * For details, considerations and the specific commands at each step below, refer
     to the "Modular Cluster Multinode Installation" topic in the *Installation
     Guide*. An illustrative command sequence covering steps 3 to 7 is also shown
     after this procedure.

1. Ensure all traces of the previous nodes have been removed from the VMware
   environment.

#. Deploy fresh nodes as per the original topology.

   * Check topologies and hardware requirements in the *Installation Guide*:

     * "Multinode Modular Cluster with Application and Database Nodes"
     * "Multinode Modular Cluster Hardware Specification"

   * For new *node type* deployment at the *required data center*, see:
     :ref:`create_a_new_VM_using_the_platform-install_OVA`.

   * For the steps below, follow the "Modular Cluster Multinode Installation" topics
     in the *Installation Guide*.

3. Add each node to the cluster by running **cluster prepnode** on the node.

#. From the primary database node, add each node to the cluster using the
   **cluster add <IP address>** command.

#. On the primary database node, set the database weights for each database node
   using the **database weight add <IP address> <weight>** command.

#. Restore a backup made from the highest weighted secondary database node in the
   original cluster. Follow the Import steps here:
   :ref:`backup-import-to-new-environment`.

   .. note::

      It is not necessary to run **cluster provision** again on the primary node.
      This action is included in the backup restore process.

#. On the new application nodes, check the number of queues using **voss queues**.
   If the number is *less than 2*, set the queues to 2 with **voss queues 2**.

   .. note::

      Applications are reconfigured and the ``voss-queue`` process is restarted.

#. Ensure all services are up and running: run **cluster run all app status** to
   check that all the services are up and running after the restore completes.
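The console sketch below summarizes steps 3 to 7 for the example 8-node topology.
It is illustrative only: the hostnames, prompts, IP addresses and weight values are
hypothetical placeholders, and the *Installation Guide* remains the authoritative
reference for each command.

.. code-block:: console

   # On every new node (database, application and proxy): prepare it for clustering
   platform@node:~$ cluster prepnode

   # On the primary database node: add each of the other nodes to the cluster
   # (192.168.10.12, 192.168.10.13, ... are example addresses only)
   platform@db01:~$ cluster add 192.168.10.12
   platform@db01:~$ cluster add 192.168.10.13
   platform@db01:~$ cluster add 192.168.10.21
   # ...repeat for the remaining application and proxy nodes

   # On the primary database node: set a weight for each database node
   # (the weight values below are examples only)
   platform@db01:~$ database weight add 192.168.10.11 40
   platform@db01:~$ database weight add 192.168.10.12 30
   platform@db01:~$ database weight add 192.168.10.13 20

   # Restore the backup from the remote location (see the Import steps referenced
   # above); cluster provision runs as part of the restore.

   # On each new application node: ensure at least 2 queues are configured
   platform@app01:~$ voss queues
   platform@app01:~$ voss queues 2

   # Verify that all services are running after the restore completes
   platform@db01:~$ cluster run all app status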
.. note::

   * If **cluster provision** fails at any of the proxy nodes during provisioning,
     the following steps complete the cluster provisioning:

     1. Run **database config** and check that the nodes are in the STARTUP2,
        SECONDARY or PRIMARY state, with correct arbiter placement.
     2. Log in to the web proxy on both the primary and the secondary site and run
        **web weight add <IP address>:443 1** for each node that should receive a
        web weight of 1 on the respective proxy.
     3. Run **cluster provision** to mitigate the failure.
     4. Run **cluster run all app status** to check that all the services are up
        and running after cluster provisioning completes.

   * If the existing nodes in the cluster do not see the new incoming node after
     **cluster add**, try the following steps (an example command sequence is shown
     after this note):

     1. Run **cluster del <IP address>** from the primary node, where <IP address>
        is the IP address of the new incoming node.
     2. Delete all database weights: run **database weight del <IP address>** from
        the primary node for each of the nodes, including the new incoming node.
     3. Log in to any secondary node (non-primary unified node) and run
        **cluster add <IP address>**, where <IP address> is the IP address of the
        new incoming node.
     4. Re-add all database weights: run **database weight add <IP address> <weight>**
        from the same session for each of the nodes, including the new incoming node.
     5. Run **cluster run database cluster list** to check that all nodes see the
        new incoming node inside the cluster.
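The sketch below illustrates the re-add procedure from the second bullet above,
assuming a hypothetical incoming database node at 192.168.10.13 joining existing
database nodes at 192.168.10.11 and 192.168.10.12. The addresses, prompts and
weight values are placeholders only; substitute the values for your deployment.

.. code-block:: console

   # On the primary node: remove the node that was not picked up, then delete all
   # database weights (example addresses only)
   platform@db01:~$ cluster del 192.168.10.13
   platform@db01:~$ database weight del 192.168.10.11
   platform@db01:~$ database weight del 192.168.10.12
   platform@db01:~$ database weight del 192.168.10.13

   # On a secondary (non-primary) node: re-add the incoming node, then re-add the
   # database weights from the same session (weights are examples only)
   platform@db02:~$ cluster add 192.168.10.13
   platform@db02:~$ database weight add 192.168.10.11 40
   platform@db02:~$ database weight add 192.168.10.12 30
   platform@db02:~$ database weight add 192.168.10.13 20

   # Verify that every node now sees the incoming node in the cluster
   platform@db02:~$ cluster run database cluster list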