.. _upgrade_multinode_ISO:

.. rst-class:: chapter-with-expand

Modular Cluster Topology: Upgrade a Multinode Environment with the ISO and Template
-------------------------------------------------------------------------------------

.. index:: cluster;cluster check

.. index:: cluster;cluster upgrade

.. index:: screen

.. index:: database;database convert_drive

.. index:: app; app cleanup

.. _21.4|VOSS-872:

.. _24.1|VOSS-1187:

.. important::

   * When upgrading from an existing Modular Cluster Topology that has been available
     since VOSS Automate 21.1, use the steps listed here.

   * Before upgrading to release 24.1, ensure that an additional disk is available for
     the Insights database.

     .. raw:: html

        See: Adding Hard Disk Space and VOSS Automate Hardware Specifications.

     .. raw:: latex

        See the Adding Hard Disk Space topic in the Platform Guide and VOSS Automate
        Hardware Specifications in the Architecture and Hardware Specification Guide.

     This disk is needed to assign to the ``insights-voss-sync:database`` mount point.
     See: :ref:`modular-Post-maintenance-mount-insights-disk`.

   * Before upgrading to release 24.1, ensure that sufficient time is allocated to the
     maintenance window. The time required varies with your topology and the number of
     devices and subscribers. The durations below serve as a guideline; VOSS support
     can be contacted if further guidance is required:

     * Cluster upgrade: 4h
     * Template install: 2.5h
     * For a 500K Data User system (13Mil RESOURCE documents), the expected
       ``upgrade_db`` step takes about 12h.
     * For a 160K Data User system (2.5Mil RESOURCE documents), the expected
       ``upgrade_db`` step takes about 2.5h.

     You can follow the progress on the Admin Portal transaction list.

   * Tasks that are marked **Prior to Maintenance Window** can be completed a few days
     before the scheduled maintenance window, so that VOSS support can be contacted if
     needed and downtime is reduced.

The standard **screen** command should be used where indicated. See:
:ref:`screen-command`.

.. rubric:: Primary database and application node in a Modular Cluster Topology

* Verify the *primary application node* (UN2) with the **cluster primary role
  application** command run on the node. The output should be `true`, for example:

  ::

     platform@UN2:~$ cluster primary role application
     is_primary: true

* Verify the *primary database node* (UN1) with the **cluster primary role database**
  command run on the node. The output should be `true`, for example:

  ::

     platform@UN1:~$ cluster primary role database
     is_primary: true

.. _modular-Prior-to-maintenance-window-Download-Files-and-Check-ISO:

Download Files and Check (Prior to Maintenance Window)
......................................................

.. note::

   Ensure that the ``.iso`` file is available on *all* nodes.
.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| VOSS files: **https://voss.portalshape.com > Downloads > VOSS Automate > XXX > Upgrade**   |                    |
|                                                                                            |                    |
| Download the ``.iso`` and ``.template`` files, where XXX is the release number.            |                    |
|                                                                                            |                    |
| * Transfer the ``.iso`` file to the ``media/`` folder of all nodes.                        |                    |
| * Transfer the ``.template`` file to the ``media/`` folder of the primary application      |                    |
|   node.                                                                                    |                    |
|                                                                                            |                    |
| Two transfer options:                                                                      |                    |
|                                                                                            |                    |
| Either using SFTP:                                                                         |                    |
|                                                                                            |                    |
| * **sftp platform@<node_ip>**                                                              |                    |
| * **cd media**                                                                             |                    |
| * **put <file>**                                                                           |                    |
|                                                                                            |                    |
| Or using SCP:                                                                              |                    |
|                                                                                            |                    |
| * **scp <iso_file> platform@<node_ip>:~/media**                                            |                    |
| * **scp <template_file> platform@<primary_application_node_ip>:~/media**                   |                    |
|                                                                                            |                    |
| Verify that the ``.iso`` image and ``.template`` file copied:                              |                    |
|                                                                                            |                    |
| * **ls -l media/**                                                                         |                    |
|                                                                                            |                    |
| Verify that the original ``.sha256`` checksums on the Download site match:                 |                    |
|                                                                                            |                    |
| * primary database node: **system checksum media/<iso_file>**                              |                    |
|                                                                                            |                    |
|   ``Checksum: <checksum>``                                                                 |                    |
|                                                                                            |                    |
| * primary application node: **system checksum media/<template_file>**                      |                    |
|                                                                                            |                    |
|   ``Checksum: <checksum>``                                                                 |                    |
+--------------------------------------------------------------------------------------------+--------------------+
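For illustration, a complete transfer-and-verify pass might look as follows. This is a
minimal sketch: the address ``192.0.2.10`` and the file names are hypothetical
placeholders, the ``.iso`` transfer must be repeated for every node, and the checksum
value must be compared against the ``.sha256`` file published on the Download site.

::

   $ scp VOSS-Automate-XXX.iso platform@192.0.2.10:~/media
   $ scp VOSS-Automate-XXX.template platform@192.0.2.10:~/media
   $ ssh platform@192.0.2.10
   platform@UN2:~$ ls -l media/
   platform@UN2:~$ system checksum media/VOSS-Automate-XXX.template
   Checksum: <compare against the value published on the Download site>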
.. _modular-Prior-to-maintenance-window-Security-Health-Steps:

Security and Health Check Steps (Prior to Maintenance Window)
.............................................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| Verify that the primary database node is the active primary node at the time of upgrade:  |                    |
|                                                                                            |                    |
| **database config**                                                                        |                    |
|                                                                                            |                    |
| Ensure that the node on which the installation will be initiated has the ``stateStr``     |                    |
| parameter set to ``PRIMARY``, for example:                                                 |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    stateStr: PRIMARY                                                                       |                    |
|    storageEngine: WiredTiger                                                               |                    |
|                                                                                            |                    |
| * **cluster check** - inspect the output of this command for warnings and errors. You     |                    |
|   can also use **cluster check verbose** to see more details. While warnings will not     |                    |
|   prevent an upgrade, it is advisable that these be resolved prior to upgrading where     |                    |
|   possible. Some warnings may be resolved by upgrading.                                    |                    |
|                                                                                            |                    |
|   For troubleshooting and resolutions, also refer to the *Health Checks for Cluster       |                    |
|   Installations Guide* and the *Platform Guide*.                                           |                    |
|                                                                                            |                    |
|   If any of the paths below are over 80% full, a clean-up is needed, for example to       |                    |
|   avoid the risk of logs filling up during the upgrade. Clean-up steps are indicated      |                    |
|   next to the paths:                                                                       |                    |
|                                                                                            |                    |
|   ::                                                                                       |                    |
|                                                                                            |                    |
|      /             (call support if over 80%)                                              |                    |
|      /var/log      (run: log purge)                                                        |                    |
|      /opt/platform (remove any unnecessary files from /media directory)                    |                    |
|      /tmp          (reboot)                                                                |                    |
|                                                                                            |                    |
| On the primary application node, verify there are no pending Security Updates on any      |                    |
| of the nodes.                                                                              |                    |
|                                                                                            |                    |
| .. note::                                                                                  |                    |
|                                                                                            |                    |
|    If you run **cluster status** after installing the new version of **cluster check**,   |                    |
|    any error message regarding a failed command can be ignored. This error message        |                    |
|    will not show after upgrade.                                                            |                    |
+--------------------------------------------------------------------------------------------+--------------------+
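Taken together, the pre-upgrade health sweep can be run as one short sequence on the
primary database node; a minimal sketch using only the commands referenced above:

::

   platform@UN1:~$ database config          # confirm stateStr: PRIMARY
   platform@UN1:~$ cluster check verbose    # resolve warnings where possible
   platform@UN1:~$ log purge                # only if /var/log is over 80% full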
.. _modular-maintenance-window-Schedules-Transactions-Version-Check:

Version Check (Maintenance Window)
..................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| **Version**                                                                                |                    |
|                                                                                            |                    |
| Record the current version information. This is required for upgrade troubleshooting.     |                    |
|                                                                                            |                    |
| * Log in on the Admin Portal and record the information contained in the menu:            |                    |
|   **About > Version**                                                                      |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. _modular-maintenance-window-Pre-Upgrade-Steps:

Pre-Upgrade Steps (Maintenance Window)
......................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| As part of the rollback procedure, ensure that a suitable restore point is obtained       |                    |
| prior to the start of the activity, as per the guidelines for the infrastructure on       |                    |
| which the VOSS Automate platform is deployed.                                              |                    |
|                                                                                            |                    |
| Optional: If a backup is also required - on the primary database node, use the            |                    |
| **backup add <location-name>** and **backup create <location-name>** commands.            |                    |
| For details, refer to the *Platform Guide*.                                                |                    |
+--------------------------------------------------------------------------------------------+--------------------+
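A minimal sketch of the optional backup step; the location name ``mainbackup`` is a
hypothetical example, and backup locations are defined as described in the *Platform
Guide*:

::

   platform@UN1:~$ backup add mainbackup
   platform@UN1:~$ backup create mainbackup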
.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| After restore point creation and before upgrading: validate system health and check all   |                    |
| services, nodes and weights for the cluster:                                               |                    |
|                                                                                            |                    |
| * **cluster run application cluster list**                                                 |                    |
|                                                                                            |                    |
|   Make sure all application nodes show.                                                    |                    |
|                                                                                            |                    |
| * Check that the database weights are set. An empty weight list is normal on the fresh    |                    |
|   database nodes; it is however critical that the weights are set before upgrading a      |                    |
|   cluster (see the sketch following this table):                                           |                    |
|                                                                                            |                    |
|   * **database weight list**                                                               |                    |
|                                                                                            |                    |
|   Example output:                                                                          |                    |
|                                                                                            |                    |
|   ::                                                                                       |                    |
|                                                                                            |                    |
|      172.29.21.240:                                                                        |                    |
|          weight: 80                                                                        |                    |
|      172.29.21.241:                                                                        |                    |
|          weight: 70                                                                        |                    |
|      172.29.21.243:                                                                        |                    |
|          weight: 60                                                                        |                    |
|      172.29.21.244:                                                                        |                    |
|          weight: 50                                                                        |                    |
|                                                                                            |                    |
| * Verify the primary node in the primary site and ensure no nodes are in the              |                    |
|   'recovering' state (``stateStr`` is not ``RECOVERING``). On the primary node:           |                    |
|                                                                                            |                    |
|   * **database config**                                                                    |                    |
|                                                                                            |                    |
| On the primary application node, verify there are no pending Security Updates on any      |                    |
| of the nodes:                                                                              |                    |
|                                                                                            |                    |
| * **cluster run all security check**                                                       |                    |
+--------------------------------------------------------------------------------------------+--------------------+
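If weights are missing, they can be set per database node before the upgrade. The sketch
below assumes the example IP addresses shown above, and assumes **database weight add**
as the companion command to **database weight list**; confirm the exact syntax in the
*Platform Guide*:

::

   platform@UN1:~$ database weight add 172.29.21.240 80
   platform@UN1:~$ database weight add 172.29.21.241 70
   platform@UN1:~$ database weight add 172.29.21.243 60
   platform@UN1:~$ database weight add 172.29.21.244 50
   platform@UN1:~$ database weight list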
.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| The following step is needed if your own private certificate and generated SAN            |                    |
| certificates are required and the ``web cert gen_csr`` command was run.                    |                    |
| For details, refer to the Web Certificate Setup Options topic in the Platform Guide.      |                    |
|                                                                                            |                    |
| The steps below are needed to check if a CSR private key exists but no associated         |                    |
| signed certificate is available.                                                           |                    |
|                                                                                            |                    |
| Request VOSS support to run on the CLI as ``root`` user, the following command:           |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    for LST in /opt/platform/apps/nginx/config/csr/*; \                                     |                    |
|    do openssl x509 -in $LST -text -noout >/dev/null \                                      |                    |
|    && SIGNED=$LST; done; echo $SIGNED                                                      |                    |
|                                                                                            |                    |
| If the ``echo $SIGNED`` command output is blank, back up the ``csr/`` directory with,     |                    |
| for example, the following command:                                                        |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    mv /opt/platform/apps/nginx/config/csr/ /opt/platform/apps/nginx/config/csrbackup      |                    |
+--------------------------------------------------------------------------------------------+--------------------+
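Any certificate file that the loop reports can be inspected further with standard
**openssl** options; a short sketch with a hypothetical file name:

::

   openssl x509 -in /opt/platform/apps/nginx/config/csr/example.crt \
       -noout -subject -issuer -enddate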
.. _modular-maintenance-window-upgrade-ISO:

Upgrade (Maintenance Window)
............................

.. note::

   * By default, the cluster upgrade is carried out in parallel on all nodes and
     without any backup in order to provide a fast upgrade.

   * For systems *upgrading to 24.1 from 21.4.0 - 21.4-PB5*:

     * The VOSS platform maintenance mode will be started automatically when the
       **cluster upgrade** command is run. This prevents any new occurrences of
       scheduled transactions, including the 24.1 database syncs associated with
       **insights sync**. For details on **insights sync**, see the *Insights
       Analytics* topic in the Platform Guide.

     * The **cluster maintenance-mode stop** command must however be run manually
       after the maintenance window of the upgrade:
       :ref:`modular-maintenance-window-manual-stop`.

     For details on the VOSS platform maintenance mode, see the *Maintenance Mode*
     topic in the Platform Guide.

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| It is recommended that the upgrade steps are run in a terminal opened with the            |                    |
| **screen** command.                                                                        |                    |
|                                                                                            |                    |
| Verify that the ISO has been uploaded to the ``media/`` directory on each node. This      |                    |
| will speed up the upgrade time.                                                            |                    |
|                                                                                            |                    |
| On the primary database node:                                                              |                    |
|                                                                                            |                    |
| * **screen**                                                                               |                    |
| * **cluster upgrade media/<iso_file>**                                                     |                    |
|                                                                                            |                    |
| Close **screen**: ``Ctrl-a \``                                                             |                    |
|                                                                                            |                    |
| If the connection drops, see the reattach sketch after this table.                        |                    |
+--------------------------------------------------------------------------------------------+--------------------+

All unused docker images except the ``selfservice`` and ``voss_ubuntu`` images will be
removed from the system at this stage.
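If the SSH connection drops during the upgrade, the **screen** session keeps the command
running and can be reattached. A minimal sketch using standard **screen** options:

::

   platform@UN1:~$ screen                              # open the session
   platform@UN1:~$ cluster upgrade media/<iso_file>    # run the upgrade inside it

   # after reconnecting over SSH:
   platform@UN1:~$ screen -ls                          # list available sessions
   platform@UN1:~$ screen -r                           # reattach to the running session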
.. _modular-maintenance-window-Post-Upgrade-Security-Health-Steps:

Post-Upgrade and Health Steps (Maintenance Window)
..................................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| On the primary database node, verify the cluster status:                                   |                    |
|                                                                                            |                    |
| * **cluster check**                                                                        |                    |
|                                                                                            |                    |
| If any of the above commands show errors, check for further details to assist             |                    |
| troubleshooting.                                                                           |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| To remove the ``media/`` mount directory that may have remained on nodes after,           |                    |
| for example, an upgrade, run:                                                              |                    |
|                                                                                            |                    |
| **cluster run all app cleanup**                                                            |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| Check for needed security updates. On the primary application node, run:                  |                    |
|                                                                                            |                    |
| * **cluster run all security check**                                                       |                    |
|                                                                                            |                    |
| If one or more updates are required for any node, run the following on the primary        |                    |
| application node:                                                                          |                    |
|                                                                                            |                    |
| * **cluster run all security update**                                                      |                    |
| * **cluster run notme system reboot**                                                      |                    |
|                                                                                            |                    |
|   If node messages: ``<node name> failed with timeout`` are displayed, these can          |                    |
|   be ignored.                                                                              |                    |
|                                                                                            |                    |
| * **system reboot**                                                                        |                    |
|                                                                                            |                    |
|   Since all services will be stopped, this takes some time.                                |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| If the upgrade is successful, the screen session can be closed by typing **exit**         |                    |
| in the screen terminal. If errors occurred, keep the screen terminal open for             |                    |
| troubleshooting purposes and contact VOSS support.                                         |                    |
+--------------------------------------------------------------------------------------------+--------------------+
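A short recap of the update-and-reboot order, using only the commands above - the other
nodes are rebooted first, and the node the commands are run from is rebooted last:

::

   platform@UN2:~$ cluster run all security check
   platform@UN2:~$ cluster run all security update
   platform@UN2:~$ cluster run notme system reboot   # reboot all other nodes
   platform@UN2:~$ system reboot                     # finally, reboot this node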
.. _modular-maintenance-window-Database-Schema-Upgrade:

Database Schema Upgrade (Maintenance Window)
............................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| It is recommended that the upgrade steps are run in a terminal opened with the            |                    |
| **screen** command.                                                                        |                    |
|                                                                                            |                    |
| On the primary application node:                                                           |                    |
|                                                                                            |                    |
| * **screen**                                                                               |                    |
| * **voss upgrade_db**                                                                      |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. _modular-maintenance-window-Template-Upgrade-ISO:

Template Upgrade (Maintenance Window)
.....................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| It is recommended that the upgrade steps are run in a terminal opened with the            |                    |
| **screen** command.                                                                        |                    |
|                                                                                            |                    |
| On the primary application node:                                                           |                    |
|                                                                                            |                    |
| * **screen**                                                                               |                    |
| * **app template media/<template_file>**                                                   |                    |
+--------------------------------------------------------------------------------------------+--------------------+

The following message appears:

::

   Running the DB-query to find the current environment's
   existing solution deployment config...

* Python functions are deployed
* System artifacts are imported.

.. note::

   In order to carry out fewer upgrade steps, the updates of instances of some models
   are skipped in the cases where:

   * a ``data/CallManager`` instance does not exist as an instance in
     ``data/NetworkDeviceList``
   * a ``data/CallManager`` instance exists, but ``data/NetworkDeviceList`` is empty
   * the Call Manager AXL Generic Driver and Call Manager Control Center Services
     match the ``data/CallManager`` IP

The template upgrade automatically detects the deployment mode: "Enterprise" or
"Provider". A message displays according to the selected deployment type. Check for
one of the messages below:

::

   Importing EnterpriseOverlay.json

   Importing ProviderOverlay.json ...

The template install automatically restarts necessary applications. If a cluster is
detected, the installation propagates changes throughout the cluster.
.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| Review the output from the **app template** command and confirm that the upgrade          |                    |
| message appears:                                                                           |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    Deployment summary of PREVIOUS template solution                                        |                    |
|    (i.e. BEFORE upgrade):                                                                  |                    |
|    -------------------------------------------------                                      |                    |
|                                                                                            |                    |
|    Product: [PRODUCT]                                                                      |                    |
|    Version: [PREVIOUS PRODUCT RELEASE]                                                     |                    |
|    Iteration-version: [PREVIOUS ITERATION]                                                 |                    |
|    Platform-version: [PREVIOUS PLATFORM VERSION]                                           |                    |
|                                                                                            |                    |
| This is followed by updated product and version details:                                   |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    Deployment summary of UPDATED template solution                                         |                    |
|    (i.e. current values after installation):                                               |                    |
|    -----------------------------------------------                                        |                    |
|                                                                                            |                    |
|    Product: [PRODUCT]                                                                      |                    |
|    Version: [UPDATED PRODUCT RELEASE]                                                      |                    |
|    Iteration-version: [UPDATED ITERATION]                                                  |                    |
|    Platform-version: [UPDATED PLATFORM VERSION]                                            |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| * If no errors are indicated, create a restore point.                                      |                    |
|                                                                                            |                    |
|   As part of the rollback procedure, ensure that a suitable restore point is              |                    |
|   obtained prior to the start of the activity, as per the guidelines for the              |                    |
|   infrastructure on which the VOSS Automate platform is deployed.                          |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| For an unsupported upgrade path, the install script stops with a message such as:         |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    Upgrade failed due to unsupported upgrade path.                                         |                    |
|    Please log in as sysadmin and see Transaction logs for more detail.                     |                    |
|                                                                                            |                    |
| You can roll back as per the guidelines for the infrastructure on which the VOSS          |                    |
| Automate platform is deployed.                                                             |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| If there are errors for another reason, the install script stops with a failure           |                    |
| message listing the problem. Contact VOSS support.                                         |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| On the primary application node, verify that the ``extra_functions`` have the             |                    |
| *same checksum* across the cluster.                                                        |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| Post upgrade migrations:                                                                   |                    |
|                                                                                            |                    |
| On a single application node of a cluster, run:                                            |                    |
|                                                                                            |                    |
| * **voss post-upgrade-migrations**                                                         |                    |
+--------------------------------------------------------------------------------------------+--------------------+

Data migrations that are not critical to system operation can have a significant
execution time at scale. These are therefore performed after the primary upgrade,
allowing the migration to proceed while the system is in use and thereby limiting
upgrade windows. A transaction is queued on VOSS Automate and its progress is
displayed as it executes.
.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| Check cluster status and health - on the primary database node:                            |                    |
|                                                                                            |                    |
| * **cluster status**                                                                       |                    |
+--------------------------------------------------------------------------------------------+--------------------+
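The same checks used before the upgrade apply at this point as well; a short recap
using commands already shown:

::

   platform@UN1:~$ cluster status
   platform@UN1:~$ cluster check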
.. _modular-maintenance-window-post-template-upgrade-tasks:

Post Template Upgrade Tasks (Maintenance Window)
................................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| **Verify the upgrade**                                                                     |                    |
|                                                                                            |                    |
| Log in on the Admin Portal and check the information contained in the                     |                    |
| **About > Version** menu. Confirm that versions have upgraded to ``XXX``, where           |                    |
| ``XXX`` corresponds with the release number of the upgrade.                                |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. _modular-maintenance-window-Log-Files-Error-Checks:

Log Files and Error Checks (Maintenance Window)
...............................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| Inspect the output of the command line interface for upgrade errors, for example          |                    |
| ``File import failed!`` or ``Failed to execute command``.                                  |                    |
|                                                                                            |                    |
| On the primary application node, use the **log view** command to view any log files       |                    |
| indicated in the error messages, for example, run the command if the following            |                    |
| message appears:                                                                           |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    For more information refer to the execute log file with                                 |                    |
|    'log view platform/execute.log'                                                         |                    |
|                                                                                            |                    |
| If required, send all the install log files in the ``install/`` directory to an           |                    |
| SFTP server:                                                                               |                    |
|                                                                                            |                    |
| * **log send sftp://x.x.x.x install**                                                      |                    |
+--------------------------------------------------------------------------------------------+--------------------+
| Log in on the Admin Portal as a system level administrator, go to                          |                    |
| **Administration Tools > Transaction** and inspect the transactions for errors.            |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. _modular-maintenance-window-manual-stop:

Manually Stop the Maintenance Window
....................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| On the CLI:                                                                                |                    |
|                                                                                            |                    |
| Run the **cluster maintenance-mode stop** command to end the VOSS maintenance mode        |                    |
| that was started automatically when the **cluster upgrade** command was run               |                    |
| (applies when upgrading to 24.1 from 21.4.0 - 21.4-PB5).                                   |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. _modular-Post-maintenance-window-Licensing:

Licensing (outside, after Maintenance Window)
.............................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| From release 21.4 onwards, the deployment needs to be licensed. After installation,       |                    |
| a 7-day grace period is available to license the product. Since license processing        |                    |
| is only scheduled every hour, allow up to an hour for a newly applied license to          |                    |
| take effect.                                                                               |                    |
+--------------------------------------------------------------------------------------------+--------------------+
.. _modular-Post-maintenance-mount-insights-disk:

Mount the Insights disk (outside, after Maintenance Window)
...........................................................

.. tabularcolumns:: |p{13.5cm}|p{4cm}|

+--------------------------------------------------------------------------------------------+--------------------+
| Description and Steps                                                                      | Notes and Status   |
+============================================================================================+====================+
| *On each database node*, assign the ``insights-voss-sync:database`` mount point to        |                    |
| the drive added for the Insights database prior to upgrade.                                |                    |
|                                                                                            |                    |
| For example, if ``drives list`` shows the added disk as ``sde``, run:                     |                    |
|                                                                                            |                    |
| ``drives add sde insights-voss-sync:database``                                             |                    |
|                                                                                            |                    |
| on each unified node where the drive has been added.                                       |                    |
|                                                                                            |                    |
| Sample output (the message below can be ignored on release 24.1:                           |                    |
| ``WARNING: Failed to connect to lvmetad. Falling back to device scanning.``)               |                    |
|                                                                                            |                    |
| ::                                                                                         |                    |
|                                                                                            |                    |
|    $ drives add sde insights-voss-sync:database                                            |                    |
|    Configuration setting "devices/scan_lvs" unknown.                                       |                    |
|    Configuration setting "devices/allow_mixed_block_sizes" unknown.                        |                    |
|    WARNING: Failed to connect to lvmetad. Falling back to device scanning.                 |                    |
|    71ad98e0-7622-49ad-9fg9-db04055e82bc                                                    |                    |
|    Application insights-voss-sync processes stopped.                                       |                    |
|    Migrating data to new drive - this can take several minutes                             |                    |
|    Data migration complete - reassigning drive                                             |                    |
|    Checking that /dev/sde1 is mounted                                                      |                    |
|    Checking that /dev/dm-0 is mounted                                                      |                    |
|    /opt/platform/apps/mongodb/dbroot                                                       |                    |
|    Checking that /dev/sdc1 is mounted                                                      |                    |
|    /backups                                                                                |                    |
|                                                                                            |                    |
|    Application services:firewall processes stopped.                                        |                    |
|    Reconfiguring applications...                                                           |                    |
|    Application insights-voss-sync processes started.                                       |                    |
+--------------------------------------------------------------------------------------------+--------------------+
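To confirm the reassignment afterwards, ``drives list`` can be run again on each node;
the added drive should now be listed against the ``insights-voss-sync:database`` mount
point rather than as an unused disk:

::

   platform@UN1:~$ drives list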