System Monitoring Configuration
A sysadmin administrator can access the System Monitoring menu (menu model: data/SystemMonitoringConfig) to manage:
A number of alerts that will trigger SNMP traps
Metrics collection
Default values are set in an instance called Global.
Alerts
From the Alerts tab on the Configuration menu, the following settings and options can be managed:
Database:
Database Size Threshold (GB): Database size at which an alert is raised. The default size is 200 GB.
Database Index Size Threshold (GB): Database index size at which an alert is raised. The default size is 50 GB.
Transactions:
Transaction Queue Size Threshold: Number of transactions in the queue at which an alert is raised. The default count is 500.
Maximum time in ‘Queued’ state (hours): Time since the last transaction update of a queued transaction before an alert is raised. The default time is 6 hours (minimum: 1 hour, maximum: 48 hours).
Maximum time in ‘Processing’ state (hours): Time since the last transaction update of a processing transaction before an alert is raised. The default time is 6 hours (minimum: 1 hour, maximum: 48 hours).
Transaction Failures to Alert: The operations and model types for which transaction errors raise an alert. The default is all Import operations on all data models (data/*).
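The defaults and limits listed above can be summarized in a short Python sketch. This is illustrative only: the class name AlertThresholds and its field names are hypothetical and are not part of the VOSS Automate data model; only the default values and the 1 to 48 hour range come from the settings described above.

```python
from dataclasses import dataclass


@dataclass
class AlertThresholds:
    """Hypothetical container mirroring the documented Alerts tab defaults."""
    database_size_gb: int = 200          # Database Size Threshold (GB)
    database_index_size_gb: int = 50     # Database Index Size Threshold (GB)
    transaction_queue_size: int = 500    # Transaction Queue Size Threshold
    max_queued_hours: int = 6            # Maximum time in 'Queued' state
    max_processing_hours: int = 6        # Maximum time in 'Processing' state

    def validate(self) -> None:
        """Reject hour values outside the documented 1 to 48 hour range."""
        for name in ("max_queued_hours", "max_processing_hours"):
            hours = getattr(self, name)
            if not 1 <= hours <= 48:
                raise ValueError(
                    f"{name} must be between 1 and 48 hours, got {hours}"
                )


defaults = AlertThresholds()
defaults.validate()  # the documented defaults are within range
print(defaults)
```

A value such as AlertThresholds(max_queued_hours=72) fails validation in this sketch, reflecting the documented 48-hour maximum.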
For the configured alerts, platform-level notifications can be generated when a transaction fails. Using the notify command on the platform CLI, these notifications can be sent in two ways:
SNMP traps
Email message
Note
Either or both notification types can be configured.
Refer to the Warnings and Notifications and SNMP Configuration topics in the Platform Guide for notification setup, details and examples of the SNMP traps.
Note
It may take up to 2 hours from the time that a transaction is considered hung/stuck to the time that an alert is created. Thereafter, at least one alert will be created within every clock hour. The subsequent alerts will not necessarily occur at fixed 60-minute intervals.
Metrics Collection
The Metrics Collection tab shows:
A configurable RIS API data collector interval, which is the interval at which the real-time information (RIS) data collector service polls the Unified CM to obtain the latest phone registration status information for phone instances stored in the VOSS Automate database.
The value is in seconds and the default interval is 43200 seconds (12 hours). Refer to the Best Practices Guide for further information if this interval needs to be modified. Note that the Cisco RIS Data Collector service needs to be enabled and running on the Unified CM publisher.
The RIS data collector service updates the current registration status and/or IP address from the Unified CM Registration Status and IP address at the specified interval, for all clusters in the system. The Phones list view shows Registration Status and IP Address columns containing this data.
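As a quick check of the interval arithmetic, the following Python sketch converts the default value to hours and to polls per day. The function and constant names are hypothetical and are used only for illustration.

```python
DEFAULT_RIS_INTERVAL_SECONDS = 43200  # documented default interval


def interval_in_hours(seconds: int) -> float:
    """Convert a collector interval in seconds to hours."""
    return seconds / 3600


def polls_per_day(seconds: int) -> float:
    """Number of polling cycles in 24 hours at the given interval."""
    return 86400 / seconds


print(interval_in_hours(DEFAULT_RIS_INTERVAL_SECONDS))  # 12.0
print(polls_per_day(DEFAULT_RIS_INTERVAL_SECONDS))      # 2.0
```

With the default setting, the registration status shown in the Phones list view is therefore refreshed by the collector twice a day.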
The status of the collector service can be checked with the platform CLI app status command, where it is shown as voss-risapi_collector in the example console output snippet below:

    platform@VOSS:~$ app status
    selfservice v19.3.2 (2020-04-18 19:27)
       |-node running
    voss-deviceapi v19.3.2 (2020-04-18 19:30)
       |-voss-cnf_collector running
       |-voss-queue running
       |-voss-risapi_collector running
       |-voss-monitoring running
       |-voss-wsgi running
    ...
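If the app status output is captured as text, the collector state can be checked with a small script. The Python sketch below assumes that each service appears on its own line with a "|-" prefix followed by the service name and its state, as in the example output; the function names are hypothetical and the sample text is copied from that example.

```python
def service_states(app_status_output: str) -> dict:
    """Map each '|-<service> <state>' line to {service: state}."""
    states = {}
    for line in app_status_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("|-"):
            parts = stripped[2:].split()
            if len(parts) >= 2:
                states[parts[0]] = parts[1]
    return states


def is_running(app_status_output: str, service: str) -> bool:
    """True if the named service is listed with the state 'running'."""
    return service_states(app_status_output).get(service) == "running"


# Sample text taken from the example console output (assumed line layout).
SAMPLE = """\
selfservice v19.3.2 (2020-04-18 19:27)
   |-node running
voss-deviceapi v19.3.2 (2020-04-18 19:30)
   |-voss-cnf_collector running
   |-voss-queue running
   |-voss-risapi_collector running
   |-voss-monitoring running
   |-voss-wsgi running
"""

print(is_running(SAMPLE, "voss-risapi_collector"))  # True
```

A service that is missing from the output is reported as not running in this sketch, which is the safer default for a monitoring check.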
Note
There is an Activate Phone Status Service check box in data/Settings that is selected by default. Real-time data collection is available when this check box is selected. Phone status data is then fetched directly from the Unified CM and shown on the Phones list view. See: Activate Phone Status Service.
There is also a macro function, get_phone_status, that returns this phone data. Its input parameters are a phone PKID, followed by a comma and then exactly one RIS API field name.
For example, the fields below are used in the Phones list view of the VOSS Automate Admin Portal:
status
ip_address
cm_node
To see a full list of available fields, run the macro function without RIS API field names or refer to the Cisco RIS API documentation.
For example:
{{fn.get_phone_status 5ca2b90bce894e0014d488fb, status}}
output: “Registered”
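When the same macro is needed for several fields, the macro text can be generated rather than typed by hand. The Python sketch below only builds the macro string in the form shown in the example above; the macro itself must still be evaluated by VOSS Automate. The helper name phone_status_macro is hypothetical.

```python
def phone_status_macro(pkid: str, field: str) -> str:
    """Build a get_phone_status macro string for one PKID and one field."""
    pkid = pkid.strip()
    field = field.strip()
    if not pkid:
        raise ValueError("a phone PKID is required")
    # The macro takes exactly one field name, so reject lists and blanks.
    if not field or "," in field or " " in field:
        raise ValueError("exactly one RIS API field name is required")
    return "{{fn.get_phone_status " + pkid + ", " + field + "}}"


# Fields used in the Phones list view, as listed above.
for name in ("status", "ip_address", "cm_node"):
    print(phone_status_macro("5ca2b90bce894e0014d488fb", name))
```

The first line printed matches the example macro above; the PKID is the one used in that example.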