System Monitoring Configuration

A sysadmin administrator can access the System Monitoring menu (menu model: data/SystemMonitoringConfig) to manage:

  • A number of alerts that will trigger SNMP traps

  • Metrics collection

Default values are set in an instance called Global.

Alerts

From the Alerts tab on the Configuration menu, the following settings and options can be managed:

  • Database:

    • Database Size Threshold (GB): the default size is 200 GB

    • Database Index Size Threshold (GB): the default size is 50 GB

  • Transactions:

    • Transaction Queue Size Threshold: the default size is 500

    • Maximum time in ‘Queued’ state (hours): time since last transaction update. The default time is 6 hours (maximum: 48h, minimum: 1h)

    • Maximum time in ‘Processing’ state (hours): time since last transaction update. The default time is 6 hours (maximum: 48h, minimum: 1h)

    • Transaction Failures to Alert: errors for operations on model types.

      The default operation is all Import operations on all data models (data/*)
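
The defaults and limits listed above can be summarized in a short sketch. The class and field names below are hypothetical and chosen only for illustration; they are not the attribute names of the data/SystemMonitoringConfig model. The check that size thresholds are positive is also an assumption, since the text only documents a range for the two time limits.

```python
from dataclasses import dataclass

# Documented bounds for the 'Queued' and 'Processing' limits (hours).
MIN_STATE_HOURS = 1
MAX_STATE_HOURS = 48


@dataclass
class AlertSettings:
    """Hypothetical container for the documented alert defaults."""
    db_size_threshold_gb: int = 200
    db_index_size_threshold_gb: int = 50
    txn_queue_size_threshold: int = 500
    max_queued_hours: int = 6
    max_processing_hours: int = 6
    failure_operation: str = "Import"
    failure_model_types: str = "data/*"

    def validate(self) -> list:
        """Return a list of problems; an empty list means all values are in range."""
        problems = []
        # The two time limits have a documented range of 1 to 48 hours.
        for name in ("max_queued_hours", "max_processing_hours"):
            value = getattr(self, name)
            if not MIN_STATE_HOURS <= value <= MAX_STATE_HOURS:
                problems.append(
                    f"{name}={value} is outside {MIN_STATE_HOURS}-{MAX_STATE_HOURS} hours"
                )
        # Assumption: size thresholds must be positive (not stated in the text).
        for name in (
            "db_size_threshold_gb",
            "db_index_size_threshold_gb",
            "txn_queue_size_threshold",
        ):
            if getattr(self, name) <= 0:
                problems.append(f"{name} must be positive")
        return problems


print(AlertSettings().validate())                     # defaults are in range
print(AlertSettings(max_queued_hours=72).validate())  # above the 48-hour maximum
```

A check of this kind could be run before values are entered in the Alerts tab, so that an out-of-range time limit is caught early.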

For the configured alerts, platform-level notifications can be generated when a transaction fails. Using the notify command on the platform CLI, notifications can be set up in two ways:

  • SNMP traps

  • Email message

Note

Either or both notification types can be configured.

Refer to the Warnings and Notifications and SNMP Configuration topics in the Platform Guide for notification setup, details and examples of the SNMP traps.

Note

It may take up to 2 hours from the time that a transaction is considered hung or stuck to the time that an alert is created. Thereafter, at least one alert is created within every clock hour. Subsequent alerts do not necessarily occur at fixed 60-minute intervals.

Metrics Collection

The Metric Collection tab shows:

  • A configurable RIS API data collector interval: the interval at which the real-time information (RIS) data collector service polls the Unified CM to obtain the latest phone registration status information for phone instances stored in the VOSS Automate database.

    The value is in seconds and the default interval is 43200 seconds (12 hours). Refer to the Best Practices Guide for further information if this interval needs to be modified.

    Note that the Cisco RIS Data Collector service needs to be enabled and running on the Unified CM publisher.

    At the specified interval, the RIS data collector service updates the current registration status and/or IP address of phones from the Unified CM, for all clusters in the system. The Phones list view shows this data in the Registration Status and IP Address columns.

    The status of the collector service can be checked with the platform CLI app status command. It is listed as voss-risapi_collector in the example console output snippet below:

    platform@VOSS:~$ app status
    selfservice v19.3.2 (2020-04-18 19:27)
       |-node                  running
    voss-deviceapi v19.3.2 (2020-04-18 19:30)
       |-voss-cnf_collector    running
       |-voss-queue            running
       |-voss-risapi_collector running
       |-voss-monitoring       running
       |-voss-wsgi             running
    
    
    ...
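
Where the collector status needs to be checked from captured console output, the listing above can be parsed with a few lines of Python. This is a minimal sketch that assumes the output keeps the layout shown in the example (service lines prefixed with `|-`, followed by the state). The function name parse_app_status is hypothetical and is not part of the platform CLI.

```python
def parse_app_status(output: str) -> dict:
    """Map each service name to its reported state, e.g. {'voss-queue': 'running'}."""
    services = {}
    for line in output.splitlines():
        stripped = line.strip()
        # Service lines look like '|-voss-queue            running'.
        if stripped.startswith("|-"):
            parts = stripped[2:].split()
            if len(parts) >= 2:
                services[parts[0]] = parts[1]
    return services


# Sample text copied from the example console output above.
SAMPLE = """\
selfservice v19.3.2 (2020-04-18 19:27)
   |-node                  running
voss-deviceapi v19.3.2 (2020-04-18 19:30)
   |-voss-cnf_collector    running
   |-voss-queue            running
   |-voss-risapi_collector running
   |-voss-monitoring       running
   |-voss-wsgi             running
"""

status = parse_app_status(SAMPLE)
print(status.get("voss-risapi_collector", "not listed"))  # prints: running
```

Application header lines such as `voss-deviceapi v19.3.2 ...` are skipped on purpose, so only individual services appear in the result.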
    

    Note

    • There is an Activate Phone Status Service check box in data/Settings that is selected by default. Real-time data collection is available when this check box is selected. Phone status data is then fetched directly from the Unified CM and shown on the Phones list view. See: Activate Phone Status Service.

    A macro function, get_phone_status, is also available to return this phone data. It takes the following input parameters:

    • a phone PKID

    • followed by a comma and then exactly one RIS API field name.

      For example, the following fields are used in the Phones list view of the VOSS Automate Admin Portal:

      • status

      • ip_address

      • cm_node

      To see a full list of available fields, run the macro function without RIS API field names or refer to the Cisco RIS API documentation.

      For example:

      {{fn.get_phone_status
            5ca2b90bce894e0014d488fb,
            status}}
      

      output: “Registered”
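
When many such macro calls need to be generated, for instance one per phone, the call text can be assembled programmatically. The helper below is a hypothetical sketch and is not part of VOSS Automate. It assumes that the single-line form is equivalent to the wrapped form shown in the example, and it enforces only the "exactly one field name" rule stated above.

```python
import re


def build_phone_status_macro(pkid: str, field: str) -> str:
    """Return a get_phone_status macro call for one phone PKID and one field name."""
    pkid = pkid.strip()
    field = field.strip()
    if not pkid:
        raise ValueError("a phone PKID is required")
    # Exactly one field name: reject commas, spaces, or an empty value.
    if not re.fullmatch(r"\w+", field):
        raise ValueError(f"expected exactly one field name, got {field!r}")
    return "{{fn.get_phone_status " + pkid + ", " + field + "}}"


print(build_phone_status_macro("5ca2b90bce894e0014d488fb", "status"))
# prints: {{fn.get_phone_status 5ca2b90bce894e0014d488fb, status}}
```

The helper only builds the macro text; evaluating it and obtaining a value such as “Registered” still happens inside VOSS Automate.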