Cluster Check

On a cluster, the cluster check [verbose] command is available to check:

  • network: test and validate connectivity from each node to every other node, for each port required, as well as the time taken to connect to each node.

    • Checks for access to port 27020 on database hosts are not required from web proxy nodes.

    • Checks for access to port 443 are only required from web proxy nodes to unified nodes.

  • database: carry out a check of database configuration

    • info: displays database weights and whether the node state is primary, secondary or arbiter

    • error:

      • if there is no connection to the database IP on a port

      • if the current database weight does not match the configured weight

      • if a node is marked as an arbiter but is not in the list of arbiters

    • warn: if the primary database node does not have the highest weight

  • disk: carry out a drive space percentage check

  • ntp: check that NTP is functioning

  • packages: Check status of packages installed by the system package manager. If an error occurs for a package, a message next to the package name shows: package in an undesired state.

  • security: Check for security updates. Status levels:

    • info: zero or one security update missed

    • error: more than one security update missed

  • cluster status: also check the cluster status:

    • info: show status as OK

    • error: display a message to run cluster status for details

    • warn: warnings should be resolved before upgrading where possible. Some warnings may be resolved by upgrading.

    Note

    If only node versions mismatch or some nodes are missing components, a warning status is displayed. This status will allow for an upgrade of a node during failover recovery.

    This caters for scenarios during repair or recovery of nodes. The cluster check warns about version mismatches but does not prevent upgrade commands. The cluster check cannot distinguish between an ongoing recovery process and a general fault. When no node recovery process is ongoing, the warning should be treated as an error and resolved before the upgrade commences.
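The database and security rules listed above can be summarised in a short Python sketch. This is only an illustration of the documented status levels, not the platform's implementation: the `DbNode` record, its field names, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DbNode:
    # Hypothetical record; field names are illustrative only.
    address: str
    reachable: bool          # connection to the database IP and port succeeded
    state: str               # "primary", "secondary" or "arbiter"
    current_weight: int
    configured_weight: int


def database_status(node: DbNode, all_nodes: List[DbNode],
                    arbiters: List[str]) -> Tuple[str, List[str]]:
    """Return (level, reasons) following the database rules described above."""
    errors = []
    if not node.reachable:
        errors.append("no connection to the database port")
    if node.current_weight != node.configured_weight:
        errors.append("weight: mismatched")
    if node.state == "arbiter" and node.address not in arbiters:
        errors.append("arbiter: not configured")
    if errors:
        return "error", errors

    # Warn when the primary does not hold the highest weight in the cluster.
    highest = max(n.current_weight for n in all_nodes)
    if node.state == "primary" and node.current_weight < highest:
        return "warn", ["primary does not have the highest weight"]
    return "info", []


def security_status(missed_updates: int) -> str:
    """Zero or one missed security update is info; more than one is an error."""
    return "info" if missed_updates <= 1 else "error"
```

In this sketch an error always takes precedence over a warning for the same node, which is an assumption about ordering that the rules above do not state explicitly.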

This command should also be run before carrying out a system upgrade.

Note

Without the verbose parameter, the cluster check command shows only warnings and errors. If there are none, it shows only the message: No issues found with host checks.

Use the verbose parameter to see detailed output.

Example output (abbreviated):

$ cluster check
warn
   192.168.322.3:
       drives
          /: 47 % utilised
   192.168.322.5:
       drives
          /: 47 % utilised
   192.168.322.6:
       drives
          /: 47 % utilised

error
   192.168.322.3
       network
          => 192.168.322.4:27020: Failed
   192.168.322.4: Failed to connect to host
   192.168.322.5
       network
          => 192.168.322.4:27020: Failed
   192.168.322.6
       database
          arbiter: not configured
          weight: mismatched

[...]
   cluster
       status
           Error, please run `cluster status` for more information
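
If output like the example above has been captured as text, it can be grouped by status level before deciding whether to proceed with an upgrade. The following Python sketch assumes the layout shown in the example (an unindented `warn`, `error` or `info` heading followed by indented detail lines); the function names are hypothetical, and the real output format may differ between releases.

```python
LEVELS = ("info", "warn", "error")


def split_by_level(output: str) -> dict:
    """Group non-blank lines under the unindented level heading above them."""
    sections = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace() and line.strip() in LEVELS:
            # A new level heading starts a new section.
            current = line.strip()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line.strip())
    return sections


def has_errors(sections: dict) -> bool:
    """True when the output contains a non-empty error section."""
    return bool(sections.get("error"))


# Shortened sample in the same layout as the example output above.
sample = """warn
   192.168.322.3:
       drives
          /: 47 % utilised

error
   192.168.322.4: Failed to connect to host
"""

sections = split_by_level(sample)
print(has_errors(sections))
```

Output that contains only the message No issues found with host checks produces no sections in this sketch, so `has_errors` returns `False`.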

With the verbose parameter, detailed output is shown, and any warnings and errors are listed at the end of the verbose output.

Abbreviated example, info only; no issues:

$ cluster check verbose
info
    192.168.100.3
        database
            arbiter: ok
            state: ok
            weight: ok
        disk
            /: 28%
            /opt/platform: 27%
            /opt/platform/apps/mongodb/dbroot: 1%
            /tmp: 1%
            /var/log: 3%
        network
            => 192.168.100.4:8443: 0.223ms
            => 192.168.100.4:27020: 0.205ms
            => 192.168.100.5:8443: 0.246ms
            => 192.168.100.5:27020: 0.405ms
            => 192.168.100.6:8443: 0.169ms
            => 192.168.100.6:27020: 0.218ms
            => 192.168.100.7:8443: 0.225ms
            => 192.168.100.8:8443: 0.208ms
        ntp
            172.29.88.56: 18.313ms
        packages
            package database: ok
        security
            updates: 0 missed
    192.168.100.4
        database
            arbiter: ok
            state: ok
            weight: ok
        disk
            /: 28%
            /opt/platform: 27%
            /opt/platform/apps/mongodb/dbroot: 1%
            /tmp: 1%
            /var/log: 2%
        network

    [...]
    cluster
        status
            OK
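
The connection times in the network section of verbose output like the example above can be extracted for comparison between nodes. This is a minimal Python sketch that assumes the line layout shown in the example (a host line containing only an IP address, then `=> address:port: time` lines); the regular expressions and the `connect_times` name are illustrative and not part of the platform.

```python
import re

# A line that holds only an indented IPv4 address, optionally followed by a colon.
HOST_RE = re.compile(r"^\s+(\d{1,3}(?:\.\d{1,3}){3}):?\s*$")
# A network result line such as "=> 192.168.100.4:8443: 0.223ms".
LINK_RE = re.compile(r"=>\s*(\d{1,3}(?:\.\d{1,3}){3}):(\d+):\s*([\d.]+)\s*ms")


def connect_times(verbose_output: str) -> list:
    """Return (source, destination, port, milliseconds) tuples."""
    results = []
    source = None
    for line in verbose_output.splitlines():
        host = HOST_RE.match(line)
        if host:
            source = host.group(1)
            continue
        link = LINK_RE.search(line)
        if link and source is not None:
            results.append((source, link.group(1),
                            int(link.group(2)), float(link.group(3))))
    return results


# Shortened sample in the same layout as the verbose example above.
sample = """info
    192.168.100.3
        network
            => 192.168.100.4:8443: 0.223ms
            => 192.168.100.4:27020: 0.205ms
        ntp
            172.29.88.56: 18.313ms
"""

for entry in connect_times(sample):
    print(entry)
```

The ntp line is ignored because it has no `=>` prefix and is not a bare address, so only node-to-node connection times are returned.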