SNMP Trap: Database Failover Status

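A trap is generated when the database in a cluster fails over repeatedly within 5 minutes, for example when one or more nodes are down. A clearing trap is generated when the failover status returns to normal.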

Identification

  • The originating IP / hostname is used to identify the system generating the traps

  • The NMS is responsible for associating traps with each managed system, for clearing alarms, and for escalating them to the relevant system operator

  • The trap OID is generic for various SNMP events monitored by the VOSS Automate system

  • The SNMP system name is included as part of the variable binding to assist identification:

    .iso.org.dod.internet.mgmt.mib-2.system.sysName.0 = standalone
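
As a rough illustration of how an NMS might associate traps with a managed system, the sketch below keys open alarms on the source IP address and the sysName variable binding. The AlarmTable class and its method names are hypothetical and are not part of VOSS Automate or of any NMS product. Because the trap OID is generic, a real NMS would probably also include the event type in the key.

```python
from dataclasses import dataclass, field


@dataclass
class AlarmTable:
    """Hypothetical in-memory alarm store keyed by (source IP, sysName)."""
    open_alarms: dict = field(default_factory=dict)

    def raise_alarm(self, source_ip: str, sys_name: str, message: str) -> None:
        # A repeated trap from the same system overwrites the earlier message.
        self.open_alarms[(source_ip, sys_name)] = message

    def clear_alarm(self, source_ip: str, sys_name: str) -> bool:
        # Returns True if an alarm was open for this system and is now cleared.
        return self.open_alarms.pop((source_ip, sys_name), None) is not None


table = AlarmTable()
table.raise_alarm("192.29.22.122", "UN1-192.29.22.122",
                  "ERROR: The db is failing over constantly within 5 min")
print(table.open_alarms)
print(table.clear_alarm("192.29.22.122", "UN1-192.29.22.122"))
```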
    

Trap OID

.iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotifications.mteTriggerFired

Variable Bindings - db constantly fails over

  • .iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.0 = 2 minutes (12065)

  • snmpTrapOID = mteTriggerFired

  • .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotTrigger.0 = ‘ERROR: The db is failing over constantly within 5 min’

  • .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotValue.0 = 1

  • .iso.org.dod.internet.mgmt.mib-2.system.sysName.0 = standalone
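
Raw trap output usually shows these variable bindings as numeric OIDs. The following sketch maps the numeric form to the symbolic names listed above. The OID values are the standard SNMPv2-MIB and DISMAN-EVENT-MIB assignments; the OID_NAMES table and the label helper are illustrative names only.

```python
# Numeric OIDs of the variable bindings, as they appear in raw trap output.
OID_NAMES = {
    "1.3.6.1.2.1.1.3.0": "sysUpTime.0",
    "1.3.6.1.6.3.1.1.4.1.0": "snmpTrapOID.0",
    "1.3.6.1.2.1.88.2.1.1.0": "mteHotTrigger.0",
    "1.3.6.1.2.1.88.2.1.5.0": "mteHotValue.0",
    "1.3.6.1.2.1.1.5.0": "sysName.0",
}

# Value carried by snmpTrapOID.0 for this trap (mteTriggerFired).
MTE_TRIGGER_FIRED = "1.3.6.1.2.1.88.2.0.1"


def label(oid: str) -> str:
    """Return the symbolic name for a known OID, else the OID unchanged."""
    return OID_NAMES.get(oid.lstrip("."), oid)


print(label("1.3.6.1.2.1.88.2.1.5.0"))   # mteHotValue.0
print(label(".1.3.6.1.2.1.1.5.0"))       # sysName.0
```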

Severity Messages:

  • Info : INFO: The db failover status returned to normal

  • Critical : ERROR: The db is failing over constantly within 5 min
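
A receiver could derive the severity from the prefix of the mteHotTrigger message text. The sketch below assumes only the two prefixes listed above; the severity function name and the "Unknown" fallback are illustrative choices and are not defined by VOSS Automate.

```python
def severity(hot_trigger: str) -> str:
    """Map the mteHotTrigger message text to a severity label."""
    if hot_trigger.startswith("ERROR:"):
        return "Critical"
    if hot_trigger.startswith("INFO:"):
        return "Info"
    # Assumption: any other prefix is not covered by this trap's description.
    return "Unknown"


print(severity("ERROR: The db is failing over constantly within 5 min"))  # Critical
print(severity("INFO: The db failover status returned to normal"))        # Info
```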

Example: INFO

Notification message from (1, 3, 6, 1, 6, 1, 1):('192.29.22.122', 31127):
Var-binds:
1.3.6.1.2.1.1.3.0 = 34697935
1.3.6.1.6.3.1.1.4.1.0 = 1.3.6.1.2.1.88.2.0.1
1.3.6.1.2.1.88.2.1.1.0 = INFO: The db failover status returned to normal
1.3.6.1.2.1.88.2.1.3.0 = Cluster failover status:  4th last database failover occured 3099 seconds ago
1.3.6.1.2.1.88.2.1.5.0 = 0
1.3.6.1.2.1.1.5.0 = UN1-192.29.22.122

Example: ERROR

Notification message from (1, 3, 6, 1, 6, 1, 1):('192.29.22.122', 4884):
Var-binds:
1.3.6.1.2.1.1.3.0 = 34615003
1.3.6.1.6.3.1.1.4.1.0 = 1.3.6.1.2.1.88.2.0.1
1.3.6.1.2.1.88.2.1.1.0 = ERROR: The db is failing over constantly within 5 min
1.3.6.1.2.1.88.2.1.3.0 = Cluster failover status:  4th last database failover occured 2320 seconds ago
1.3.6.1.2.1.88.2.1.5.0 = 1
1.3.6.1.2.1.1.5.0 = UN1-192.29.22.122
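
For quick checks against captured output like the examples above, the var-bind lines can be parsed into a dictionary. This is a minimal sketch that assumes the "OID = value" text layout shown here; parse_var_binds is a hypothetical helper, and a production receiver would read the bindings from its SNMP library rather than from printed text.

```python
def parse_var_binds(text: str) -> dict:
    """Collect 'numeric OID = value' lines into a dict, skipping other lines."""
    binds = {}
    for line in text.splitlines():
        oid, sep, value = line.partition(" = ")
        oid = oid.strip()
        # Keep only lines whose left-hand side is a dotted numeric OID.
        if sep and oid.replace(".", "").isdigit():
            binds[oid] = value.strip()
    return binds


sample = """\
Var-binds:
1.3.6.1.2.1.88.2.1.1.0 = ERROR: The db is failing over constantly within 5 min
1.3.6.1.2.1.88.2.1.5.0 = 1
1.3.6.1.2.1.1.5.0 = UN1-192.29.22.122
"""
binds = parse_var_binds(sample)
print(binds["1.3.6.1.2.1.88.2.1.5.0"])  # 1
print(binds["1.3.6.1.2.1.1.5.0"])       # UN1-192.29.22.122
```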