Appendix C: Alarm Definition

SNMP allows a northbound monitoring system (NOC) to receive notifications, in the form of traps or informs, in response to events, threshold violations, or any other condition covered by the trap definitions in the loaded MIBs. Notifications are categorised at three levels (INFORM, WARNING, ERROR), and the NBI may be set to report at a certain level or above. The event notifications reported by the NBI are listed below.

In addition to the events listed here, database connection, API connection, and service recovery events are listed in the NBI Troubleshooting Guide.

See: NBI SNMP Traps
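The "report at a certain level or above" behaviour can be pictured with a small sketch. This is only an illustration and not the NBI's implementation: the ordering INFORM < WARNING < ERROR is assumed from the phrase "or above", and the names `Level`, `should_report` and `filter_notifications` are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """Notification levels named in this appendix; the numeric order is an assumption."""
    INFORM = 1
    WARNING = 2
    ERROR = 3


def should_report(notification_level, configured_level):
    """Return True when a notification is at the configured level or above."""
    return notification_level >= configured_level


def filter_notifications(notifications, configured_level):
    """Keep only the (level, message) pairs that would be reported."""
    return [n for n in notifications if should_report(n[0], configured_level)]


# Illustrative input only: the levels attached to these messages are assumed.
sample = [
    (Level.ERROR, "ERROR: Disk slow"),
    (Level.INFORM, "INFO: The disk latency returned to normal"),
]
reported = filter_notifications(sample, Level.WARNING)
```

With the configured level set to WARNING, only the first sample entry would be passed on. With the level set to INFORM, both would be.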

| Event | Entity to monitor | Message | Severity |
| --- | --- | --- | --- |
| System Startup | snmpTrapOID | .iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTraps.coldStart | Info |
| System Shutdown | snmpTrapOID | .iso.org.dod.internet.private.enterprises.netSnmp.netSnmpNotificationPrefix.netSnmpNotifications.nsNotifyShutdown | Critical |
| Service Changes | snmpTrapOID | Process Restart | Info |
| Service Changes | snmpTrapOID | Process Warning, Process Stop, Process Error | Critical |
| Disk Status | mteTriggerFired mteHotTrigger.0 | DISK ALMOST FULL: Disk <disk> is more than 80 percent full; DISK FULL: Disk full | Critical |
| Disk Status | mteTriggerFired mteHotTrigger.0 | DISK STATUS: Disk <disk> is now running below 80 percent | Info |
| Disk Latency | mteTriggerFired mteHotTrigger.0 | ERROR: Disk slow | Critical |
| Disk Latency | mteTriggerFired mteHotTrigger.0 | INFO: The disk latency returned to normal | Info |
| Health Emails | mteTriggerFired mteHotTrigger.0 | ERROR: Trouble sending health email | Minor |
| Health Emails | mteTriggerFired mteHotTrigger.0 | INFO: Health emails is now being sent | Info |
| Mailbox Status | mteTriggerFired mteHotTrigger.0 | INFO: Messages for <server> auto archived as it reached more than 500 | Info |
| Mailbox Status | mteTriggerFired mteHotTrigger.0 | INFO: The total local messages for <server> has reached in excess of 200 | Warning |
| Mailbox Status | mteTriggerFired mteHotTrigger.0 | INFO: The total local messages for <server> is now under 200 | Info |
| Health Emails | mteTriggerFired mteHotTrigger.0 | WARNING: Not all notify levels is configured with an external email address | Minor |
| Health Emails | mteTriggerFired mteHotTrigger.0 | INFO: All notify levels is now configured with an external email address | Info |
| Large Log Files | mteTriggerFired mteHotTrigger.0 | ERROR: Log files larger than 1Gig found in /var/log | Urgent |
| Large Log Files | mteTriggerFired mteHotTrigger.0 | INFO: /var/log rotated | Info |
| Service Status | mteTriggerFired mteHotTrigger.0 | ERROR: Service Failures | Critical |
| Service Status | mteTriggerFired mteHotTrigger.0 | INFO: Services started successfully | Info |
| Network Status | mteTriggerFired mteHotTrigger.0 | ERROR: Network Failures | Critical |
| Network Status | mteTriggerFired mteHotTrigger.0 | INFO: Network failures resolved | Info |
| Memory Usage | mteTriggerFired mteHotTrigger.0 | ERROR: Memory swap error | Critical |
| Memory Usage | mteTriggerFired mteHotTrigger.0 | INFO: Memory usage returned to normal | Info |
| CPU Usage | mteTriggerFired mteHotTrigger.0 | ERROR: Excessive Load | Critical |
| CPU Usage | mteTriggerFired mteHotTrigger.0 | WARNING: High CPU usage | Urgent |
| CPU Usage | mteTriggerFired mteHotTrigger.0 | ERROR: Extremely high CPU usage | Critical |
| NTP Status | mteTriggerFired mteHotTrigger.0 | WARNING: The ntp daemon has stopped on <server> | Critical |
| NTP Status | mteTriggerFired mteHotTrigger.0 | WARNING: The ntp offset exceeds 1 second on <server> | Urgent |
| NTP Status | mteTriggerFired mteHotTrigger.0 | ERROR: No ntp configured for <server> | Urgent |
| DNS Status | mteTriggerFired mteHotTrigger.0 | WARNING: No dns configured for <server> | Urgent |
| Domain Status | mteTriggerFired mteHotTrigger.0 | WARNING: No domain configured for <server> | Urgent |
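The text prefix of a message (INFO, WARNING, ERROR) does not always match the severity listed for it. For example, "WARNING: The ntp daemon has stopped on <server>" is listed as Critical. A receiver that needs the listed severity may therefore prefer an explicit lookup over parsing the prefix. The following is a minimal sketch of such a lookup, assuming the message text arrives exactly as listed with `<disk>` or `<server>` filled in by a single whitespace-free token. The names `ALARM_ROWS`, `template_to_regex` and `severity_for` are hypothetical, and only a few rows are copied in.

```python
import re

# A few (message template, severity) rows copied from the list above.
ALARM_ROWS = [
    ("DISK ALMOST FULL: Disk <disk> is more than 80 percent full", "Critical"),
    ("DISK STATUS: Disk <disk> is now running below 80 percent", "Info"),
    ("WARNING: The ntp daemon has stopped on <server>", "Critical"),
    ("WARNING: The ntp offset exceeds 1 second on <server>", "Urgent"),
    ("INFO: The total local messages for <server> has reached in excess of 200", "Warning"),
    ("ERROR: Trouble sending health email", "Minor"),
]


def template_to_regex(template):
    """Turn a template with <placeholder> fields into an anchored regex."""
    parts = re.split(r"(<\w+>)", template)
    pattern = ""
    for part in parts:
        if part.startswith("<") and part.endswith(">"):
            # Assumption: a placeholder value contains no whitespace.
            pattern += r"(?P<%s>\S+)" % part[1:-1]
        else:
            pattern += re.escape(part)
    return re.compile("^" + pattern + "$")


COMPILED = [(template_to_regex(t), sev) for t, sev in ALARM_ROWS]


def severity_for(message):
    """Return (severity, placeholder values), or (None, {}) if no row matches."""
    for regex, severity in COMPILED:
        match = regex.match(message)
        if match:
            return severity, match.groupdict()
    return None, {}
```

Calling `severity_for("WARNING: The ntp daemon has stopped on nbi01")` would yield the severity Critical together with the extracted server name, whereas a prefix-based guess would have classified the same message as a warning.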