SNMP Trap: Excessive Load
-------------------------

Identification

* The originating IP address or hostname identifies the system generating the traps.
* The NMS is responsible for associating traps with each managed system, for clearing
  alarms, and for escalating to the relevant system operator.
* The trap OID is generic: it is shared by the various SNMP events that the system monitors.
* The SNMP system name is included as a variable binding to assist identification:

  ::

      .iso.org.dod.internet.mgmt.mib-2.system.sysName.0 = standalone

* The following variable binding indicates that the load average threshold has been exceeded:

  ::

      .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotTrigger.0 = ERROR: Excessive load.

* The following variable bindings show which time interval threshold has been exceeded:

  ::

      .iso.org.dod.internet.private.enterprises.ucdavis.laTable.laEntry.laNames.<index> = <name>
      .iso.org.dod.internet.private.enterprises.ucdavis.laTable.laEntry.laErrMessage.<index> = <message>

  Example values for each load average interval:

  +-----------------------+-----------+-------------+---------------------------------------+
  | Load average interval | Index     | laNames     | laErrMessage                          |
  +=======================+===========+=============+=======================================+
  | 1 minute              | 1         | Load-1      | 1 min Load Average too high (= 2.52)  |
  +-----------------------+-----------+-------------+---------------------------------------+
  | 5 minute              | 2         | Load-5      | 5 min Load Average too high (= 1.27)  |
  +-----------------------+-----------+-------------+---------------------------------------+
  | 15 minute             | 3         | Load-15     | 15 min Load Average too high (= 1.27) |
  +-----------------------+-----------+-------------+---------------------------------------+

Trap OID

.iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotifications.mteTriggerFired

Variable Bindings

* .iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.0 = 2 minutes (12065)
* snmpTrapOID = mteTriggerFired
* .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotTrigger.0 = ERROR: Excessive load.
* .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotTargetName.0 =
* .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotContextName.0 =
* .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotOID.0 = laErrorFlag.1
* .iso.org.dod.internet.mgmt.mib-2.dismanEventMIB.dismanEventMIBNotificationPrefix.dismanEventMIBNotificationObjects.mteHotValue.0 = 1
* .iso.org.dod.internet.mgmt.mib-2.system.sysName.0 = standalone
* .iso.org.dod.internet.private.enterprises.ucdavis.laTable.laEntry.laNames.1 = Load-1
* .iso.org.dod.internet.private.enterprises.ucdavis.laTable.laEntry.laErrMessage.1 = 1 min Load Average too high (= 1.36)

Severity:

* Critical:

  * ERROR: Excessive load
  * ERROR: Extremely high CPU usage

* Urgent: WARNING: High CPU usage

Example: Critical
.................

::

    Mar 19 08:08:34 robot-sl snmptrapd[1234]: 2019-03-19 08:08:34 [UDP: [192.168.100.3]:20997->[192.168.100.25]:162]: #012iso.3.6.1.2.1.1.3.0 = Timeticks: (6797884) 18:52:58.84 #011iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.2.1.88.2.0.1 #011iso.3.6.1.2.1.88.2.1.1.0 = STRING: "ERROR: Excessive load" #011iso.3.6.1.2.1.88.2.1.2.0 = "" #011iso.3.6.1.2.1.88.2.1.3.0 = "" #011iso.3.6.1.2.1.88.2.1.4.0 = OID: iso.3.6.1.4.1.2021.10.1.100.1 #011iso.3.6.1.2.1.88.2.1.5.0 = INTEGER: 1 #011iso.3.6.1.2.1.1.5.0 = STRING: "VOSS" #011iso.3.6.1.4.1.2021.10.1.2.1 = STRING: "Load-1" #011iso.3.6.1.4.1.2021.10.1.101.1 = STRING: "1 min Load Average too high (= 3.45)"
    Mar 19 08:10:34 robot-sl snmptrapd[1234]: 2019-03-19 08:10:34 [UDP: [192.168.100.3]:49080->[192.168.100.25]:162]: #012iso.3.6.1.2.1.1.3.0 = Timeticks: (6809885) 18:54:58.85 #011iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.2.1.88.2.0.1 #011iso.3.6.1.2.1.88.2.1.1.0 = STRING: "ERROR: Excessive load" #011iso.3.6.1.2.1.88.2.1.2.0 = "" #011iso.3.6.1.2.1.88.2.1.3.0 = "" #011iso.3.6.1.2.1.88.2.1.4.0 = OID: iso.3.6.1.4.1.2021.10.1.100.2 #011iso.3.6.1.2.1.88.2.1.5.0 = INTEGER: 1 #011iso.3.6.1.2.1.1.5.0 = STRING: "VOSS" #011iso.3.6.1.4.1.2021.10.1.2.2 = STRING: "Load-5" #011iso.3.6.1.4.1.2021.10.1.101.2 = STRING: "5 min Load Average too high (= 2.24)"
    Mar 19 08:11:34 robot-sl snmptrapd[1234]: 2019-03-19 08:11:34 [UDP: [192.168.100.3]:47676->[192.168.100.25]:162]: #012iso.3.6.1.2.1.1.3.0 = Timeticks: (6815886) 18:55:58.86 #011iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.2.1.88.2.0.1 #011iso.3.6.1.2.1.88.2.1.1.0 = STRING: "ERROR: Excessive load" #011iso.3.6.1.2.1.88.2.1.2.0 = "" #011iso.3.6.1.2.1.88.2.1.3.0 = "" #011iso.3.6.1.2.1.88.2.1.4.0 = OID: iso.3.6.1.4.1.2021.10.1.100.3 #011iso.3.6.1.2.1.88.2.1.5.0 = INTEGER: 1 #011iso.3.6.1.2.1.1.5.0 = STRING: "VOSS" #011iso.3.6.1.4.1.2021.10.1.2.3 = STRING: "Load-15" #011iso.3.6.1.4.1.2021.10.1.101.3 = STRING: "15 min Load Average too high (= 1.16)"

Example: Critical
.................

::

    Mar 19 08:12:14 robot-sl snmptrapd[1234]: 2019-03-19 08:12:14 [UDP: [192.168.100.3]:21137->[192.168.100.25]:162]: #012iso.3.6.1.2.1.1.3.0 = Timeticks: (6819828) 18:56:38.28 #011iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.2.1.88.2.0.1 #011iso.3.6.1.2.1.88.2.1.1.0 = STRING: "ERROR: Extremely high CPU usage" #011iso.3.6.1.2.1.88.2.1.3.0 = STRING: "CPU activity: 4.14, 2.78, 1.29" #011iso.3.6.1.2.1.88.2.1.5.0 = INTEGER: 1 #011iso.3.6.1.2.1.1.5.0 = STRING: "VOSS"

Example: Urgent
...............

::

    Mar 20 12:46:04 robot-sl snmptrapd[1214]: 2019-03-20 12:46:04 [UDP: [192.168.100.3]:48439->[192.168.100.25]:162]: #012iso.3.6.1.2.1.1.3.0 = Timeticks: (114032) 0:19:00.32 #011iso.3.6.1.6.3.1.1.4.1.0 = OID: iso.3.6.1.2.1.88.2.0.1 #011iso.3.6.1.2.1.88.2.1.1.0 = STRING: "WARNING: High CPU usage" #011iso.3.6.1.2.1.88.2.1.3.0 = STRING: "CPU activity: 3.41, 2.56, 1.28" #011iso.3.6.1.2.1.88.2.1.5.0 = INTEGER: 1 #011iso.3.6.1.2.1.1.5.0 = STRING: "VOSS"
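
An NMS-side script could recover the severity and the affected load interval from log lines like the ones above. The following is only a minimal sketch, not part of the product: it assumes the traps are logged by ``snmptrapd`` through syslog in exactly the format shown in the examples (variable bindings separated by the escaped ``#012`` and ``#011`` characters), and the function names ``parse_trap_line`` and ``summarize_trap`` are hypothetical.

```python
import re

# Numeric OIDs as they appear in the logged examples in this section.
MTE_HOT_TRIGGER = "iso.3.6.1.2.1.88.2.1.1.0"
SYS_NAME = "iso.3.6.1.2.1.1.5.0"
LA_NAMES_PREFIX = "iso.3.6.1.4.1.2021.10.1.2."

# Severity mapping taken from the Severity list in this section.
SEVERITY_BY_TRIGGER = {
    "ERROR: Excessive load": "Critical",
    "ERROR: Extremely high CPU usage": "Critical",
    "WARNING: High CPU usage": "Urgent",
}

# In the logged examples, variable bindings are separated by the
# escaped control characters "#012" (newline) and "#011" (tab).
_SEPARATOR = re.compile(r"\s*#01[12]")
# Leading type tag, such as 'STRING: ' or 'INTEGER: ', in front of a value.
_TYPE_TAG = re.compile(r"^(STRING|INTEGER|OID|Timeticks):\s*")


def parse_trap_line(line):
    """Return a dict mapping OID to value for one logged trap (hypothetical helper)."""
    varbinds = {}
    # The first chunk is the syslog/snmptrapd header, so it is skipped.
    for chunk in _SEPARATOR.split(line)[1:]:
        oid, sep, value = chunk.partition(" = ")
        if not sep:
            continue  # not an "OID = value" pair
        value = _TYPE_TAG.sub("", value.strip()).strip('"')
        varbinds[oid.strip()] = value
    return varbinds


def summarize_trap(line):
    """Summarize system name, trigger text, severity and load interval (hypothetical helper)."""
    varbinds = parse_trap_line(line)
    trigger = varbinds.get(MTE_HOT_TRIGGER, "")
    # laNames.<index> is only present for "Excessive load" traps.
    intervals = [v for k, v in varbinds.items() if k.startswith(LA_NAMES_PREFIX)]
    return {
        "system": varbinds.get(SYS_NAME),
        "trigger": trigger,
        "severity": SEVERITY_BY_TRIGGER.get(trigger, "Unknown"),
        "interval": intervals[0] if intervals else None,
    }
```

Calling ``summarize_trap`` on one of the logged lines returns a small dictionary that an alarm handler could act on. Triggers that are not in the Severity list are reported as ``Unknown`` rather than guessed.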