.. _system-monitoring-config:

System monitoring configuration
-------------------------------

.. _19.1.1|VOSS-489:
.. _19.1.1|VOSS-349:
.. _20.1.1|EKB-4642:
.. _20.1.1|EKB-5144:
.. _21.2|EKB-9363:


Overview 
..........


A ``sysadmin`` administrator can access the **System Monitoring** menu
(menu model: ``data/SystemMonitoringConfig``) to manage:

* A number of alerts that will trigger SNMP traps
* Metrics collection

Default values are set in an instance called **Global**.  


Alerts
......

From the **Alerts** tab on the **Configuration** menu, the following settings and options can be
managed:

* **Database**:

  * **Database Size Threshold (GB)**: Database size at which the alert is triggered. The default size is 200 GB.
  * **Database Index Size Threshold (GB)**: Database index size at which the alert is triggered. The default size is 50 GB.

* **Transactions**:

  * **Transaction Queue Size Threshold**: Number of queued transactions at which the alert is triggered. The default count is 500.
  * **Maximum time in 'Queued' state (hours)**: Time since the last update of a queued transaction
    before an alert is triggered. The default time is 6 hours (minimum: 1 hour, maximum: 48 hours).
  * **Maximum time in 'Processing' state (hours)**: Time since the last update of a processing transaction
    before an alert is triggered. The default time is 6 hours (minimum: 1 hour, maximum: 48 hours).
  * **Transaction Failures to Alert**: The operations and model types for which a failed transaction triggers an alert.

    The default is all **Import** operations on all data models (``data/*``).
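
The defaults and limits listed above can be summarized in a short sketch. This is an
illustration only: the ``AlertThresholds`` class and its field names are hypothetical and
are not part of the VOSS Automate API. Only the default values and the 1 to 48 hour range
are taken from the settings described above.

.. code-block:: python

   from dataclasses import dataclass


   @dataclass
   class AlertThresholds:
       """Hypothetical container for the documented default values."""

       database_size_gb: int = 200
       database_index_size_gb: int = 50
       transaction_queue_size: int = 500
       max_queued_hours: int = 6
       max_processing_hours: int = 6

       def validate(self) -> None:
           """Check the documented 1 to 48 hour range for the time limits."""
           for name in ("max_queued_hours", "max_processing_hours"):
               hours = getattr(self, name)
               if not 1 <= hours <= 48:
                   raise ValueError(f"{name} must be between 1 and 48, got {hours}")


   defaults = AlertThresholds()
   defaults.validate()  # the documented defaults are within range

   try:
       AlertThresholds(max_queued_hours=72).validate()
   except ValueError as err:
       print(err)  # max_queued_hours must be between 1 and 48, got 72

A check of this kind can be useful when preparing values before entering them on the
**Alerts** tab, since the two time limits only accept values from 1 to 48 hours.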

For the configured alerts, *platform-level* notifications can be generated when a transaction fails.
Two notification types can be set up using the **notify** command on the platform CLI:

* SNMP traps
* Email message

.. note::
  
   Either or both notification types can be configured.

Refer to the *Warnings and Notifications* and *SNMP Configuration* topics in the Platform Guide for
notification setup and for details and examples of the SNMP traps.

.. note::

   It may take up to 2 hours from the time that a transaction is considered hung/stuck to the
   time that an alert is created. Thereafter, at least one alert is created within
   every clock hour. Subsequent alerts are not necessarily created at fixed 60-minute intervals.


Metrics collection
..................

The **Metric Collection** tab shows:

* A configurable **RIS API data collector interval**, which is the interval at which
  the real-time information (RIS) data collector service polls the Unified CM
  to obtain the latest phone registration status information for phone instances stored
  in the VOSS Automate database.

  The value is in seconds and the default interval is ``43200`` seconds (12 hours).
  Refer to the Best Practices Guide for further information if this interval needs to be modified.

  Note that the **Cisco RIS Data Collector** service needs to be enabled and running
  on the Unified CM publisher.

  The RIS data collector service updates the current registration status and/or IP address
  from the Unified CM Registration Status and IP address at the specified
  interval, for all clusters in the system.
  The **Phones** list view shows **Registration Status** and **IP Address** columns
  containing this data.

  The status of the collector service can be checked with the platform CLI
  **app status** command, where it is shown as ``voss-risapi_collector`` in the example console
  output snippet below:

  ::

     platform@VOSS:~$ app status
     selfservice v19.3.2 (2020-04-18 19:27)
        |-node                  running
     voss-deviceapi v19.3.2 (2020-04-18 19:30)
        |-voss-cnf_collector    running
        |-voss-queue            running
        |-voss-risapi_collector running
        |-voss-monitoring       running
        |-voss-wsgi             running
     

     ...

  .. note::

     * There is an **Activate Phone Status Service** check box in ``data/Settings``
       that is selected by default. Real-time data collection is available when
       this check box is selected. Phone status data is then fetched directly
       from the Unified CM and shown on the **Phones** list view.
       See: :ref:`reference-activate-phone-status-service`.





  A macro function, ``get_phone_status``, is also available to return this phone data.
  It takes the following input parameters:
   
  * a phone PKID
  * followed by a comma and then exactly one 
    RIS API field name. 
    
    For example, the fields below are used in the VOSS Automate Admin Portal list view
    of Phones:
  
    * ``status``
    * ``ip_address``
    * ``cm_node``
  
    To see a full list of available fields, run the macro function without
    a RIS API field name or refer to the Cisco RIS API documentation.

    For example:
    
    ::

      {{fn.get_phone_status
            5ca2b90bce894e0014d488fb,
            status}}

    Output: ``"Registered"``
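
The **app status** check described above can also be scripted. The following is a minimal
sketch, assuming the console output has already been captured as text and that each service
line follows the ``|-<name> <state>`` layout shown in the example output. The
``service_states`` helper is hypothetical and is not part of the platform CLI.

.. code-block:: python

   from typing import Dict


   def service_states(app_status_output: str) -> Dict[str, str]:
       """Map each service name to its reported state.

       Assumes service lines look like ``|-<name> <state>``, as in the
       example output; application header lines are skipped.
       """
       states: Dict[str, str] = {}
       for line in app_status_output.splitlines():
           stripped = line.strip()
           if not stripped.startswith("|-"):
               continue  # application header, blank line, or other text
           parts = stripped[2:].split()
           if len(parts) == 2:
               name, state = parts
               states[name] = state
       return states


   # Sample text copied from the example console output.
   SAMPLE = """\
   selfservice v19.3.2 (2020-04-18 19:27)
      |-node                  running
   voss-deviceapi v19.3.2 (2020-04-18 19:30)
      |-voss-cnf_collector    running
      |-voss-queue            running
      |-voss-risapi_collector running
   """

   states = service_states(SAMPLE)
   print(states.get("voss-risapi_collector"))  # running

If ``voss-risapi_collector`` is missing from the result or reports a state other than
``running``, the registration status and IP address data on the **Phones** list view
may not be refreshed at the configured interval.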
  
    


