Check general cluster health
Services
If any services on the cluster are not running, it could indicate a problem in the system.
To check:
Log in on any unified node (multinode unified topology) / application node (modular cluster topology).
Run the following commands:
cluster status
cluster run all app status
Check for any anomalous output, for example, stopped services, unknown nodes, or mismatched service versions.
Resolve issues:
Start stopped services.
Resolve issues on non-responsive nodes.
Escalate unresolvable issues to VOSS L2 helpdesk.
Nodes in cluster
If any node in the cluster is not known to all other nodes, provisioning may fail.
Log in on any unified node (multinode unified topology) / application node (modular cluster topology).
Run the following command:
cluster run database cluster list
Ensure that every node lists the correct number of nodes.
Resolve issues, if any:
If one or more nodes do not list all nodes, those nodes may need to be deleted and re-added, possibly from a different unified node. Add or delete nodes until all nodes show the same output for the cluster list command.
Escalate unresolvable issues to VOSS L2 helpdesk.
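The comparison in this check can be sketched in Python. This is only an illustration. It assumes each node's list has already been copied by hand into a dictionary, and it does not parse real command output, whose format is not described here. The function name and the sample addresses are hypothetical.

```python
def find_inconsistent_nodes(views):
    """Report which cluster members each node fails to list.

    `views` maps a node address to the collection of addresses that node
    reported. This structure is an assumption made for illustration; it is
    not the output format of any platform command.
    """
    # The expected membership is every node queried plus every node listed.
    expected = set(views)
    for listed in views.values():
        expected |= set(listed)

    problems = {}
    for node in sorted(expected):
        # A node that was listed by others but never reported is treated
        # as knowing about no nodes at all.
        listed = set(views.get(node, ()))
        missing = expected - listed
        if missing:
            problems[node] = sorted(missing)
    return problems


if __name__ == "__main__":
    sample = {
        "10.0.0.1": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
        "10.0.0.2": {"10.0.0.1", "10.0.0.2"},
        "10.0.0.3": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
    }
    for node, missing in find_inconsistent_nodes(sample).items():
        print(f"{node} does not list: {', '.join(missing)}")
```

An empty result means every node reported the same membership. Any entry names a node that is a candidate for the delete and re-add procedure.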
Node communication
Ensure the nodes in the cluster can freely communicate.
Log in on any unified node (multinode unified topology) / application node (modular cluster topology).
Run a cluster command across all nodes, for example:
cluster run all network list
Verify that all nodes respond with the expected output.
To resolve issues, check the general health of the cluster.
NTP connectivity
Ensure NTP is accessible in order to prevent failures such as unexpected session timeouts.
For each node:
Log in as root.
Run the following command:
ntpq -p
The output includes a reach value for each time source. A value of 377 indicates that there has been no packet loss, while a value below 377 shows that there was some packet loss. A value of zero must be resolved.
Resolve issues:
If the reach value is zero (0), restart the time service using the following command:
app start services:time --force
Repeat the procedure. If the problem persists, contact VOSS L2 helpdesk.
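The reach value may be easier to interpret once decoded. ntpq prints it in octal as an 8-bit shift register, where each bit records whether one of the last eight polls was answered, so 377 means all eight were answered. The following Python sketch decodes a value copied by hand from the reach column. The function name and the returned field names are hypothetical, and the sketch does not run or parse ntpq itself.

```python
def decode_reach(reach_octal):
    """Decode the octal reach value shown by `ntpq -p`.

    `reach_octal` is the value as printed, for example "377" or "0".
    """
    register = int(reach_octal, 8)  # the column is printed in octal
    if not 0 <= register <= 0o377:
        raise ValueError(f"reach must be between 0 and 377 octal: {reach_octal!r}")

    bits = format(register, "08b")  # leftmost bit is the oldest poll
    return {
        # Number of the last eight polls that were answered. Shortly after
        # a restart this is low because few polls have been made yet.
        "answered": bits.count("1"),
        # The rightmost bit is the most recent poll.
        "last_poll_ok": bits[-1] == "1",
        # Zero means no recent poll was answered; this needs resolving.
        "unreachable": register == 0,
    }


if __name__ == "__main__":
    for value in ("377", "376", "0"):
        print(value, decode_reach(value))
```

A value such as 376 shows that only the most recent poll went unanswered, which is usually less urgent than a value of zero.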