.. _diagnostic-troubleshoot:

Diagnostic Troubleshooting
--------------------------

.. index:: diag;diag health
.. index:: diag;diag disk
.. index:: diag;diag free
.. index:: diag;diag top
.. index:: app;app status
.. index:: log;log view

The health summary displayed on login normally includes enough information to
determine whether the system is working or experiencing a fault.
More detailed health reports can be displayed with **diag health**.

.. important::

   Since the **diag health** command output is paged on the console,
   you can scroll up or down to see all the output.
   Type ``q`` at the ``:`` prompt to quit the console pager and output (*not* `Ctrl-C`).

::

    platform@atlantic:~$ diag health
    Health summary report for date: Mon Aug 16 12:56:51 UTC 2021

    CPU Status:
     12:56:51 up 12:13,  1 user,  load average: 1.75, 1.72, 1.75

    Platform version:
    platform v21.1.0 (2021-08-15 13:37)

    Network Status:
    System name: VOSS
    Device: ens160 Ip:  Netmask:  Gateway: 192.168.100.1

    Memory Status:
                  total        used        free      shared  buff/cache   available
    Mem:        8152816     5820696      392336      164500     1939784     1872452
    Swap:       2096124      112640     1983484

    Disk Status:
    Filesystem               Size  Used Avail Use% Mounted on
    /dev/sda1                 18G  4.7G   13G  28% /
    /dev/sdb1                9.9G  154M  9.2G   2% /var/log
    /dev/sdb2                 40G  8.4G   30G  23% /opt/platform
    /dev/sdc1                 49G   53M   47G   1% /backups
    /dev/mapper/voss-dbroot  225G  5.1G  220G   3% /opt/platform/apps/mongodb/dbroot

    Security Update Status:
    There are 0 security updates available for the base system.

    Application Status:
    selfservice v21.1.0 (2021-08-15 13:36)
       |-node                      running
    voss-deviceapi v21.1.0 (2021-08-15 13:36)
       |-voss-cnf_collector        running
       |-voss-queue                running
       |-voss-wsgi                 running
       |-voss-risapi_collector     running
       |-voss-monitoring           running
    cluster v21.1.0 (2021-08-15 13:36)
    template_runner v21.1.0 (2021-08-15 13:43)
    mongodb v21.1.0 (2021-08-15 13:36)
       |-arbiter                   running
       |-database                  running
    support v21.1.0 (2021-08-15 13:43)
    selenium v21.1.0 (2021-08-15 13:42)
    ...
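
If the **diag health** report has been captured as text (for example, copied from a console session), the ``Disk Status`` section can be checked for filesystems that are filling up. The following is a minimal sketch, not part of the platform: the function names ``parse_disk_status`` and ``over_threshold`` are hypothetical, and the parser assumes the section layout shown in the sample output above.

```python
import re

# Matches one data row of the "Disk Status" table:
# filesystem, size, used, avail, use%, mount point.
ROW = re.compile(r"^\S+\s+\S+\s+\S+\s+\S+\s+(\d+)%\s+(\S+)$")


def parse_disk_status(report):
    """Map each mount point in the 'Disk Status:' section to its Use% value."""
    usage = {}
    in_section = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped == "Disk Status:":
            in_section = True
            continue
        if not in_section or not stripped:
            continue
        if stripped.endswith(":"):
            # The next "<Name> Status:" heading ends the disk section.
            break
        match = ROW.match(stripped)
        if match:  # the column header row does not match and is skipped
            usage[match.group(2)] = int(match.group(1))
    return usage


def over_threshold(usage, limit):
    """Return the mount points whose usage percentage is at or above limit."""
    return sorted(mount for mount, pct in usage.items() if pct >= limit)


SAMPLE = """\
Disk Status:
Filesystem               Size  Used Avail Use% Mounted on
/dev/sda1                 18G  4.7G   13G  28% /
/dev/sdb1                9.9G  154M  9.2G   2% /var/log
/dev/mapper/voss-dbroot  225G  5.1G  220G   3% /opt/platform/apps/mongodb/dbroot

Security Update Status:
There are 0 security updates available for the base system.
"""

if __name__ == "__main__":
    print(over_threshold(parse_disk_status(SAMPLE), 25))
```

The threshold of 25 is only illustrative; a suitable limit depends on the deployment.
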

A rich set of SNMP and SMTP traps is described in the Notifications section.
These traps can be used to automate fault discovery.

Determine whether all processes are running using **app status**.
If a process is not running, investigate its log file with
**log view process/<application>.<process>**.

For example, checking processes:

::

    platform@development:~$ app status
    development v0.8.0 (2013-08-12 12:41)
    voss-deviceapi v0.6.0 (2013-11-19 07:37)
       |-voss-celerycam               running
       |-voss-queue_high_priority     running
    ...
    core_services v0.8.0 (2013-08-27 10:46)
       |-wsgi                         running
       |-logsizemon                   running
       |-firewall                     running
       |-mountall                     running
       |-syslog                       running (completed)
       |-timesync                     stopped (failed with error 1)
    nginx v0.8.0 (2013-08-27 10:53)
       |-nginx                        running
    security v0.8.0 (2013-08-27 11:02)

Followed by a log investigation for a stopped process:

::

    platform@development:~$ log view process/core_services.timesync
    2013-08-15 10:55:20.234932 is stopping from basic_stop
    2013-08-15 10:55:20: core_services:timesync killed successfully
    2013-08-15 10:55:20: Apps.StatusGenerator core_services:timesync returned 1 after 1 loops
    App core_services:timesync is not running with status stopped
    ...
    + /usr/sbin/ntpdate 172.29.1.15
    2014-02-04 09:27:31: Apps.StatusGenerator core_services:timesync returned 0 after 1 loops
    2014-02-04 09:27:31: WaitRunning core_services:timesync is reporting return code 0
    core_services:timesync:/opt/platform/apps/core_services/timesync started
    4 Feb 09:27:38 ntpdate[2766]: no server suitable for synchronization found
    + echo 'Failed to contact server: 172.29.1.15 - retrying'
    Failed to contact server: 172.29.1.15 - retrying
    + COUNTER=2
    + sleep 1
    + test 2 -lt 3
    + /usr/sbin/ntpdate 172.29.1.15
    4 Feb 09:27:48 ntpdate[3197]: no server suitable for synchronization found
    + echo 'Failed to contact server: 172.29.1.15 - retrying'
    Failed to contact server: 172.29.1.15 - retrying
    + COUNTER=3
    + sleep 1
    + test 3 -lt 3
    + test 3 -eq 3
    + echo 'Timesync - could not contact server 172.29.1.15 after three tries. Giving up'
    Timesync - could not contact server 172.29.1.15 after three tries. Giving up
    + exit 1

The error message and return code displayed in the browser are also
invaluable in determining the cause of the problem.

The system resources can be inspected as follows:

* **diag disk** displays the disk status
* **diag free** and **diag mem** display the memory status
* **diag top** displays the CPU status

.. |VOSS Automate| replace:: VOSS Automate
.. |Unified CM| replace:: Unified CM
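
The manual check described above (scan the **app status** output for a process that is not running, then view its log) can be sketched in a few lines of Python for use against captured output. This is an illustration only, not a platform tool: ``find_not_running`` and ``log_view_command`` are hypothetical names, and the parser assumes the ``|-<process> <state>`` layout shown in the **app status** example.

```python
def find_not_running(status_output):
    """Return (application, process, state) for each process not in a running state."""
    problems = []
    application = None
    for line in status_output.splitlines():
        stripped = line.strip()
        if not stripped or stripped == "...":
            continue
        if stripped.startswith("|-"):
            # Process line: "|-<process>   <state text>"
            parts = stripped[2:].split(None, 1)
            process = parts[0]
            state = parts[1] if len(parts) > 1 else ""
            # "running (completed)" still counts as running.
            if not state.startswith("running"):
                problems.append((application, process, state))
        else:
            # Application line: "<application> v<version> (<build date>)"
            application = stripped.split()[0]
    return problems


def log_view_command(application, process):
    """Build the log command in the form used in the example above."""
    return "log view process/{}.{}".format(application, process)


SAMPLE = """\
core_services v0.8.0 (2013-08-27 10:46)
   |-wsgi                         running
   |-syslog                       running (completed)
   |-timesync                     stopped (failed with error 1)
nginx v0.8.0 (2013-08-27 10:53)
   |-nginx                        running
security v0.8.0 (2013-08-27 11:02)
"""

if __name__ == "__main__":
    for application, process, state in find_not_running(SAMPLE):
        print(process, "is", state)
        print("next step:", log_view_command(application, process))
```

The sketch only suggests the next command to run; the log itself must still be read on the platform console.
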