| Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
| onboarding:snmpmanager:alarm_-_the_basics [2017/02/02 12:16] – mmcc | onboarding:snmpmanager:alarm_-_the_basics [2021/06/25 10:09] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | ====== Alarms - The Basics ====== | ||
| + | |||
| + | Author: John Rellis | ||
| + | |||
| + | Alarms are the backbone of the IDMS as they are used to drive ticket creation. The list of currently active alarms in the system is available in the SnmpManager Node Monitor. | ||
| + | |||
| + | < | ||
| + | /** | ||
| + | * This represents an uncleared trap in the system and will put a | ||
| + | * network into an alarm until the cleared flag is set. An active alarm | ||
| + | * will reference the trap that created it and the trap that cleared it (if it is cleared) | ||
| + | */ | ||
| + | class ActiveAlarm { | ||
| + | static optionals = [" | ||
| + | static hasMany = [tickets: RemoteTicket] | ||
| + | static transients = [' | ||
| + | |||
| + | //static searchable=true | ||
| + | NetworkElement networkElement | ||
| + | // The gts that created this alarm | ||
| + | GeneralTrapSummary creatingGTS | ||
| + | // The gts that cleared the alarm if the alarm is cleared | ||
| + | GeneralTrapSummary clearingGTS = null | ||
| + | // Boolean to determine if the alarm is cleared or not | ||
| + | boolean cleared = false | ||
| + | // This is an alarm type. It will also need architecture info | ||
| + | // e.g. type=AIMOS_ALARM: | ||
| + | String type = " | ||
| + | // This is some kind of id that will connect a clearTrap with an original alarm | ||
| + | // e.g. context=AIMOS_PHILADELPHIA: | ||
| + | String context = " | ||
| + | // If the same alarm comes in multiple times for the same network element | ||
| + | // this counter will be set rather than start a new alarm | ||
| + | int repeatsReceived = 0; | ||
| + | // Date alarm is created | ||
| + | Date createdDate = new Date() | ||
| + | // Updated whenever the alarm changes status (e.g. repeat received, severity changed, or cleared) | ||
| + | Date statusUpdatedDate = new Date(); | ||
| + | Date clearedDate = null | ||
| + | // This reason is pulled from the clear trap or is set as " | ||
| + | String clearedReason = " | ||
| + | // This is trap specific. This is used for the purposes of clearing so that networkElement-alarmIdentifier is | ||
| + | // a unique key to locate and clear an alarm | ||
| + | String alarmIdentifier = " | ||
| + | // This is usually something like " | ||
| + | String status = " | ||
| + | // This was to store the previous status of the alarm so that after a " | ||
| + | String previousStatus = " | ||
| + | boolean acknowledged = false | ||
| + | Date acknowledgedDate | ||
| + | |||
| + | transient Boolean createNeutralJson = false | ||
| + | transient GeneralTrapSummaryComponent foundGtsComponent = null | ||
| + | |||
| + | .... | ||
| + | |||
| + | } | ||
| + | </ | ||
| + | |||
| + | An understanding of the fields will reveal its behaviour inside the application, | ||
| + | |||
| + | * alarmIdentifier | ||
| + | * This is the main identifier for the alarm, for example "Unit Unavailable" | ||
| + | * status | ||
| + | * This is the current severity of the alarm, this should be one of com.errigal.snmpmanager.trap.AlarmStatus# | ||
| + | * networkElement | ||
| + | * This is the network element on which the alarm occurred. It is typically assigned in the trap rule or the trap rule helper class. | ||
| + | * creatingGTS | ||
| + | * This is the GeneralTrapSummary that initiated the creation of this alarm, this is set in the GeneralTrapSummaryWrapper# | ||
| + | * clearingGTS | ||
| + | * This is the GeneralTrapSummary that cleared the alarm, this is set in the GeneralTrapSummaryWrapper# | ||
| + | * context | ||
| + | * This is a field that has less meaning in the current IDMS implementation and was used more in previous releases | ||
| + | * repeatsReceived | ||
| + | * If an activeAlarm is received with the same status and alarmIdentifier on the same networkElement, | ||
| + | * clearedReason | ||
| + | * The reason this alarm was cleared. If it was cleared by a clear trap, this is "Clear Received" | ||
| + | * acknowledged | ||
| + | * If there has been an attempt to create a RemoteTicket with this activeAlarm, the alarm is marked as acknowledged. | ||
| + | * createNeutralJson | ||
| + | * If the creatingGTS is found to have come from a Neutral Host installation this is set to true and the affected carriers are included in the NodeMonitor JSON. Note, this is a transient field. | ||
| + | * foundGtsComponent | ||
| + | * If it has been determined during creation of the GeneralTrapSummary that the alarm targets a specific component inside the NetworkElement then it is set here. Note, this is a transient field. | ||
| + | * tickets | ||
| + | * An activeAlarm can generate tickets in the form of RemoteTicket' | ||
| + | | ||
| + | |||
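The repeat and clear behaviour described in the field list can be illustrated with a small standalone Java sketch. This is not the SnmpManager implementation: ActiveAlarmRegistry, raise and clear are hypothetical names, the lookup key simply joins networkElement and alarmIdentifier as the comments in the class suggest, and the handling of a changed status (moving the old value into previousStatus) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory registry used only for illustration; not part of SnmpManager.
class ActiveAlarmRegistry {

    // Cut-down stand-in for ActiveAlarm with only the fields used in this sketch.
    static final class Alarm {
        final String networkElement;
        final String alarmIdentifier;
        String status;
        String previousStatus = "";
        int repeatsReceived = 0;
        boolean cleared = false;
        String clearedReason = "";

        Alarm(String networkElement, String alarmIdentifier, String status) {
            this.networkElement = networkElement;
            this.alarmIdentifier = alarmIdentifier;
            this.status = status;
        }
    }

    private final Map<String, Alarm> active = new HashMap<>();

    // networkElement-alarmIdentifier acts as the unique key for locating an alarm.
    private static String key(String networkElement, String alarmIdentifier) {
        return networkElement + "-" + alarmIdentifier;
    }

    // Returns the existing alarm (counting a repeat) or creates a new one.
    Alarm raise(String networkElement, String alarmIdentifier, String status) {
        String lookupKey = key(networkElement, alarmIdentifier);
        Alarm existing = active.get(lookupKey);
        if (existing == null) {
            Alarm created = new Alarm(networkElement, alarmIdentifier, status);
            active.put(lookupKey, created);
            return created;
        }
        if (existing.status.equals(status)) {
            // Same status on the same element: count a repeat, do not start a new alarm.
            existing.repeatsReceived++;
        } else {
            // Assumption: a different status updates the alarm in place.
            existing.previousStatus = existing.status;
            existing.status = status;
        }
        return existing;
    }

    // Marks the matching alarm as cleared; returns null if no such alarm is active.
    Alarm clear(String networkElement, String alarmIdentifier) {
        Alarm alarm = active.remove(key(networkElement, alarmIdentifier));
        if (alarm != null) {
            alarm.cleared = true;
            alarm.clearedReason = "Clear Received";
        }
        return alarm;
    }

    public static void main(String[] args) {
        ActiveAlarmRegistry registry = new ActiveAlarmRegistry();
        registry.raise("NE1", "RMS Level Low", "CRITICAL");
        Alarm repeat = registry.raise("NE1", "RMS Level Low", "CRITICAL");
        System.out.println(repeat.repeatsReceived); // prints 1
    }
}
```

Keying on the network element plus the alarm identifier means a clear trap only needs those two values to find the alarm it should clear.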
| + | ---- | ||
| + | |||
| + | |||
| + | ===== Scheduling of Alarms ===== | ||
| + | |||
| + | |||
| + | Active Alarm creation is typically triggered from trap rules via the GeneralTrapSummaryWrapper# | ||
| + | |||
| + | This schedules an alarm and a ticket to be created after a set delay. | ||
| + | |||
| + | It is possible to have a different time delay for the alarm and the ticket. | ||
| + | |||
| + | A typical setup waits 10 seconds before creating an alarm and X minutes before creating a ticket. X is typically determined by the severity of the alarm: the higher the severity, the shorter the delay. | ||
| + | |||
| + | This is achieved via the SnmpManager Scheduler. | ||
| + | |||
| + | If a clear is received within these 10 seconds, the alarm is never created and is removed from the Scheduler. | ||
| + | |||
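The delayed creation and the "clear before creation" rule above can be sketched with the standard java.util.concurrent scheduler. This is an illustration only and does not show how the SnmpManager Scheduler is built: DelayedAlarmScheduler, its methods, the MAJOR and MINOR severities and the ticket delay values are all assumptions, as is cancelling the pending ticket together with the alarm.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical scheduler used only for illustration; not the SnmpManager Scheduler.
class DelayedAlarmScheduler {

    // Assumed severities; the real values are defined by AlarmStatus.
    enum Severity { CRITICAL, MAJOR, MINOR }

    // The two jobs that are waiting for one alarm key.
    private static final class PendingJobs {
        final ScheduledFuture<?> alarmJob;
        final ScheduledFuture<?> ticketJob;

        PendingJobs(ScheduledFuture<?> alarmJob, ScheduledFuture<?> ticketJob) {
            this.alarmJob = alarmJob;
            this.ticketJob = ticketJob;
        }
    }

    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    private final Map<String, PendingJobs> pending = new ConcurrentHashMap<>();
    private final Duration alarmDelay;
    final List<String> createdAlarms = new CopyOnWriteArrayList<>();
    final List<String> createdTickets = new CopyOnWriteArrayList<>();

    DelayedAlarmScheduler(Duration alarmDelay) {
        this.alarmDelay = alarmDelay; // e.g. Duration.ofSeconds(10) in the typical setup
    }

    // Assumed mapping: the higher the severity, the shorter the ticket delay.
    static Duration ticketDelay(Severity severity) {
        switch (severity) {
            case CRITICAL: return Duration.ofMinutes(1);
            case MAJOR:    return Duration.ofMinutes(5);
            default:       return Duration.ofMinutes(15);
        }
    }

    // Schedules alarm creation and ticket creation, each with its own delay.
    void schedule(String alarmKey, Duration ticketDelay) {
        ScheduledFuture<?> alarmJob = executor.schedule(
                () -> { createdAlarms.add(alarmKey); }, alarmDelay.toMillis(), TimeUnit.MILLISECONDS);
        ScheduledFuture<?> ticketJob = executor.schedule(
                () -> { createdTickets.add(alarmKey); }, ticketDelay.toMillis(), TimeUnit.MILLISECONDS);
        pending.put(alarmKey, new PendingJobs(alarmJob, ticketJob));
    }

    // Handles a clear; returns true if the alarm had not been created yet and now never will be.
    boolean clear(String alarmKey) {
        PendingJobs jobs = pending.remove(alarmKey);
        if (jobs == null) {
            return false;
        }
        boolean alarmPrevented = jobs.alarmJob.cancel(false);
        jobs.ticketJob.cancel(false); // assumption: the pending ticket is dropped as well
        return alarmPrevented;
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        // Short delays so the demo finishes quickly.
        DelayedAlarmScheduler scheduler = new DelayedAlarmScheduler(Duration.ofMillis(100));
        scheduler.schedule("NE1-Unit Unavailable", Duration.ofMillis(300));
        scheduler.clear("NE1-Unit Unavailable"); // clear arrives before the alarm delay expires
        Thread.sleep(500);
        System.out.println(scheduler.createdAlarms); // prints []
        scheduler.shutdown();
    }
}
```

Keeping a handle to each pending job is what allows a clear to remove the alarm from the schedule before it is ever created.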
| + | |||
| + | ---- | ||
| + | |||
| + | |||
| + | ===== Self Assessment ===== | ||
| + | |||
| + | * List all the possible alarm status' | ||
| + | * How are active alarms and tickets scheduled in the SnmpManager? | ||
| + | * If a trap is received indicating a "Unit Unavailable" | ||
| + | * How is the status of the alarm determined? | ||
| + | * If a network element has a CRITICAL "RMS Level Low" alarm active on it, what happens if the alarm is received again with the same severity on the same network element? | ||