onboarding:snmpmanager:trap_rule_testing [2016/08/15 18:00] ejoyonboarding:snmpmanager:trap_rule_testing [2023/06/14 16:57] (current) – [IntelliJ IDEA Project] 10.91.110.100
 +====== Trap Rule Testing ======
  
 +Author: Eoin Joy
 +
 +Before a Trap Rule is deployed to the production system, it must first be tested on the customer-data-specific QA system.
 +
 +
 +----
 +
 +
 +===== Configuring An Environment For Script Writing =====
 +
 +Correct syntax highlighting is very useful for scripts that refer closely to core Errigal Application code, as Trap Rules do with the SNMP Manager. To get it, you could write your script inline with the SNMP Manager code (don't do this), or you could make a new project with a dependency on the SNMP Manager.
 +
 +
 +----
 +
 +
 +==== IntelliJ IDEA Project ====
 +
 +IntelliJ IDEA lets you add modules to a project so that the project can depend on them.\\ 
 +It is advised to create a Project in IDEA to contain any trap rule edits you need to make. Adding the SnmpManager codebase to this project as a module gives correct syntax highlighting for Trap Rules, including the methods and classes you might want to 'import'.\\ Bear in mind that you cannot import into a Trap Rule, but you can fully qualify any such calls.
 +
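 +To update multiple trap rules on a server, update one trap rule manually, then use the command below to copy its contents over the others.
 +Keep the name of the source file outside the ''*Core.groovy'' pattern, because ''tee'' truncates every file it writes to.
 +  cat ruleName.groovy | tee *Core.groovy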
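
If the edited rule is itself one of the ''*Core.groovy'' files, a loop that skips the source is safer than ''tee''. The following is a minimal sketch, not an existing Errigal script: ''update_core_rules'' and its two arguments are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one edited rule over every *Core.groovy file
# in a directory, skipping the source file so it is never overwritten.
update_core_rules() {
    local source_rule="$1" rules_dir="$2" target
    for target in "$rules_dir"/*Core.groovy; do
        [ -e "$target" ] || continue                  # the glob matched nothing
        [ "$target" -ef "$source_rule" ] && continue  # same file as the source
        cp -- "$source_rule" "$target"
    done
}
```

Called as ''update_core_rules ruleName.groovy .'' from inside the rule folder, it has the same effect as the ''tee'' command above, without echoing the rule text to the terminal.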
 +----
 +
 +
 +===== Working On Edits To Existing Rules =====
 +
 +Before you begin work on a trap rule edit, ensure that no other edit for that rule is currently being worked on or tested.\\ 
 +If there is one, then either the changes are merged and one person makes and tests both changes at the same time, or the new changes are pushed back until the version currently being changed has been given the go-ahead to be pushed to production.
 +
 +In either case, the rule to work from should be taken from the filesystem on the production apps server, at ''appfiles/SnmpManagerFiles/rules/<mib_name>/<trap_name>_Core.groovy''\\ 
 +Determine whether that folder of rules contains only one distinct rule text. If it contains more than one, find out whether the rules differ only in minor ways or fall into entirely different groups. For entirely different groups, consider whether they differ because the same varbinds are used in different ways or because the set of varbinds is completely different. If the same varbinds are used in different ways, merging the rules into one script is possible and encouraged, as it eases future maintenance.
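
One quick way to see how many distinct rule texts a folder holds is to compare checksums. This is a sketch under assumptions: ''count_distinct_rules'' is a hypothetical helper name, and ''cksum'' (CRC plus size) is used only because it is available everywhere. A matching checksum is a strong hint rather than proof, so confirm with ''diff'' before merging anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of distinct rule texts among the
# *_Core.groovy files in one rule folder.
count_distinct_rules() {
    local rules_dir="$1"
    # cksum prints "<crc> <size> <file>"; identical files share crc and size.
    cksum "$rules_dir"/*_Core.groovy |
        awk '{ print $1, $2 }' |
        sort -u |
        awk 'END { print NR }'
}
```

A result of ''1'' means every Core rule in the folder has the same text. Anything higher means the folder needs the closer inspection described above.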
 +
 +
 +----
 +
 +
 +===== Deploying To QA =====
 +
 +The Trap Rule file to be deployed should have the following as its first line of code (comments and whitespace before it are fine):\\ 
 +<code groovy>com.errigal.snmpmanager.Trap trap -></code> 
 +As you can guess, the script behaves like a closure.\\ 
 +The script does not need to return anything on its final line.\\ 
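
The first-line requirement can be checked before deploying. This is a rough sketch, assuming the rule only uses ''%%//%%'' line comments ahead of the signature (block comments are not handled), and ''has_trap_signature'' is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the first line of code in a rule file
# (ignoring blank lines and // comment lines) is the Trap closure signature.
has_trap_signature() {
    local rule_file="$1" first_code_line
    # Drop blank lines and lines holding only a // comment, keep the first remaining line.
    first_code_line=$(grep -Ev '^[[:space:]]*(//.*)?$' "$rule_file" | head -n 1)
    # Trim surrounding whitespace before comparing.
    first_code_line=$(printf '%s' "$first_code_line" |
        sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//')
    [ "$first_code_line" = "com.errigal.snmpmanager.Trap trap ->" ]
}
```

For example, ''has_trap_signature ruleName.groovy || echo "missing signature"'' flags a file that would not behave like a closure when loaded.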
 +
 +You must then deploy your new trap rule file on QA, replacing the corresponding files already in place.
 +
 +
 +----
 +
 +
 +==== Using The Trap Rule Creation Script ====
 +
 +On the application servers, ''~/script/trap_rule_creation/'' contains a script that creates a suite of trap rule files from one source, all with the same text and the correct naming convention.\\ 
 +The script needs two files to run properly: the list of trap names to use (''mobileAccess.txt'') and the contents of the rule (''contents_mobileAccess.txt''). Optionally, if some of the trap names in the source file end with the text ''Clear'', a separate contents file (''contents_clear_mobileAccess.txt'') is used for those rules, allowing changes specific to dedicated clear rules.\\ 
 +The correct usage of the script in this case is as follows:
 +<code bash> ./create_rules.csh mobileAccess.txt</code>
 +This creates an archive, ''new_rules.tar.gz'', containing all the Core rules in a directory called ''mobileAccess''.\\ 
 +The next steps to deployment are as follows:
 +  - Move the archive to the trap rules folder<code bash>mv new_rules.tar.gz ~/appfiles/SnmpManagerFiles/rules/</code>
 +  - Extract the new rules from the archive<code bash>tar -zxf new_rules.tar.gz</code>
 +  - Move the existing rules to a backup<code bash>mv ma_events_2_26.mib ma_events_2_26.mib.backup.2016-08-15</code>
 +  - Move the new rules into place<code bash>mv new_rules/mobileAccess ma_events_2_26.mib</code>
 +  - Alter the permissions on these new rules to allow automatic services like lsync to manage their synchronisation<code bash>chmod 774 ma_events_2_26.mib/*</code>
 +  - Clean up<code bash>rmdir new_rules
 +rm new_rules.tar.gz</code>
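
The steps above can be wrapped in one function that stops at the first failure. This is only a sketch: ''deploy_rules'' and its argument names are hypothetical, the backup suffix is taken from today's date rather than typed by hand, and the archive is assumed to unpack into ''new_rules/<rule_set>'' as in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual deployment steps.
# Arguments: archive path, rules folder, rule set directory inside the
# archive (e.g. mobileAccess), MIB folder to replace (e.g. ma_events_2_26.mib).
deploy_rules() {
    local archive="$1" rules_root="$2" rule_set="$3" mib_dir="$4"
    local archive_name backup
    archive_name=$(basename "$archive")
    backup="$mib_dir.backup.$(date +%F)"     # e.g. ma_events_2_26.mib.backup.2016-08-15
    if [ -e "$rules_root/$backup" ]; then
        echo "backup $backup already exists, refusing to overwrite it" >&2
        return 1
    fi
    mv -- "$archive" "$rules_root/" || return 1
    (
        cd "$rules_root" || exit 1
        tar -zxf "$archive_name" &&              # extract the new rules
        mv -- "$mib_dir" "$backup" &&            # move the existing rules to a backup
        mv -- "new_rules/$rule_set" "$mib_dir" && # move the new rules into place
        chmod 774 "$mib_dir"/* &&                # let sync services manage the files
        rmdir new_rules &&                       # clean up
        rm -- "$archive_name"
    )
}
```

With the names used above, the call would be ''deploy_rules new_rules.tar.gz ~/appfiles/SnmpManagerFiles/rules mobileAccess ma_events_2_26.mib''.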
 +
 +At this stage, once the trap rule cache has been cleared from the trapRule controller, the system will read the most up-to-date trap rule from the filesystem the next time a trap is received.
 +
 +
 +----
 +
 +
 +==== lsync ====
 +
 +lsync is a daemon process that manages synchronising files across servers in a cluster upon file creation, modification, or deletion.\\ 
 +**Be aware** that most lsync clusters are set up with the *apps1 server as master to the *apps2 server, although changes made on either one are pushed to the other. The relationship towards the *lb1 server, which contains the distributor, is one-way. This means that new rules should not be worked on on the distributor: the errigal user may believe they have edited a rule when the edit never reached the apps servers.\\ 
 +File permissions may also block the propagation of Trap Rules through a cluster. Ensure that rules are owned by ''scotty:scotty'' and have sufficient permissions for unison to act on them. The unison log at ''~/unison.log'' will be helpful in diagnosing these issues.
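
A quick way to spot rules that might not propagate is to list files with unexpected ownership or permissions. This is a sketch under assumptions: ''find_rule_permission_problems'' is a hypothetical helper name, and "sufficient permissions" is taken to mean the ''774'' mode used in the deployment steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list rule files whose owner or group differs from the
# expected one, or that lack any of the permission bits in mode 774.
find_rule_permission_problems() {
    local rules_dir="$1" owner="$2" group="$3"
    find "$rules_dir" -type f -name '*.groovy' \
        \( ! -user "$owner" -o ! -group "$group" -o ! -perm -774 \) -print
}
```

For example, ''find_rule_permission_problems ~/appfiles/SnmpManagerFiles/rules scotty scotty'' prints nothing when every rule matches the ownership and mode described above.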
 +
 +
 +----
 +
 +
 +===== Re-Sending Traps =====
 +
 +To test your changes to a rule properly, you need to confirm both that trap processing has not been adversely affected and that the changes had the intended effect. To do this, we test with recycled trap packets on QA.
 +
 +
 +----
 +
 +
 +==== Manually Re-Sending Traps ====
 +
 +Once you find the trap you wish to re-send (determined below), you can manually re-send the trap from the Trap controller's show page e.g. <nowiki>https://qaerrigallb1.crc/SnmpManager/trap/show/<trap_id></nowiki>
 +
 +In the general case, you will need to take the hub that trap processing would calculate as the parent of the trap's network element, and set its IP address to that of the handler you are sending from.
 +
 +Example:
 +  - The trap we decide to re-send came from 10.20.30.40
 +  - Find which network element corresponds to that IP address, e.g. NE-NY-HUB_001-OPN
 +  - Record the current IP address of this network element
 +  - Determine which handler you are currently using.\\ Use Developer Tools ''Ctrl+Shift+I'' and view the cookies in the Application tab. A result showing SnmpManagerWorker2 implies you are using apps2
 +  - Check that no network elements are currently using that handler's IP address
 +  - Update the network element to have the IP address of that handler, apps2: 10.40.30.20
 +  - Insert the Load Balancer IP address in the field "Please enter IP address"
 +  - Send your traps
 +  - Reset the IP address of the hub to its true IP address
 +
 +Please note that the trap controller re-sends traps as Version 2 (V2) traps. A few vendors (e.g. OPTO22) use Version 1 (V1) traps. If the trap you need to test is V1 (check the 'type' field on the trap table), re-sending it from the trap controller will not work. In that case, you can use the iReasoning MIB Browser (http://www.ireasoning.com/mibbrowser.shtml) to re-send the trap as V1.
 +----
 +
 +
 +==== The Trap Emulator ====
 +
 +Trap Emulator Documentation can be found at: https://bitbucket.org/errigal/trap-emulator
 +
 +The trap emulator can send one trap at a time, or multiple traps if the right traps happen to appear consecutively in the database. This is not the recommended course of action in this case.
 +
 +
 +----
 +
 +
 +==== What Traps Do I Need To Re-Send? ====
 +
 +To test a single case for a trap rule, you must confirm the following:
 +  * Received Trap immediately creates a General Trap Summary on the correct Network Element
 +  * Received Trap creates an Active Alarm entry <nowiki><=</nowiki> 10 seconds after processing
 +  * Active Alarm entry appears on the correct element in the Node Monitor
 +  * Repeat traps do not create separate Active Alarms
 +  * Active Alarm entry when acknowledged (manually or automatically) creates a ticket if all of the following are true
 +    - It has Status of CRITICAL, MAJOR, or MINOR
 +    - There exists no unresolved Ticket that has an SNMP Trap Form matching the details of the Active Alarm
 +    - The Network Element it applies to is ON AIR
 +  * A received clear trap with the same alarm identifier and Network Element clears the Active Alarm in the Node Monitor, and moves the Ticket into an Alarm Clear Received state
 +  * A received trap or clear for an element not appearing in the IDMS as a child of an already set up hub should result in an errigalMonitoredCarrierDeviceMissingAlarm on that hub.
 +  * A received trap or clear for an element on a hub that does not appear in the IDMS should create this hub and attempt to create an errigalMonitoredCarrierDeviceMissingAlarm on this new hub.
 +
 +The cases you need to test for a trap rule include testing every type of equipment that a given rule covers. It is also preferred to test every branch of the code in the rule.
 +
 +
 +----
 +
 +
 +===== Determining Success =====
 +
 +During testing, no Exceptions should be triggered while processing any part of the trap.\\ 
 +The database values for the general trap summary, active alarm, and remote ticket should all be created correctly, and you should be able to follow the breadcrumbs all the way from the trap and its id through GTS, active alarm, remote ticket, ticket, ticket change, and on to a form in the ticket such as a NOC Form or the SNMP Trap Form.
 +
 +
 +----
 +
 +
 +==== Logging ====
 +
 +During normal execution, the logging is done with the ''log'' variable and is printed to the application logs most often found in the ''~/logs/grails/SnmpManager.log'' file.
 +
 +During Trap Rule execution, the ''log'' variable cannot be used, so all logging is done via print method calls, the output of which can be found in the ''/var/tomcat/SnmpManager/logs/catalina.out'' file.
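
Because print output shares ''catalina.out'' with everything else Tomcat writes, it helps to give each print call in the rule a recognisable marker and search for it. This is a sketch: ''show_rule_output'' is a hypothetical helper name, and the marker text is whatever you chose to print from the rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show, with line numbers, every log line containing
# the marker string printed by the trap rule under test.
show_rule_output() {
    local marker="$1"
    local log_file="${2:-/var/tomcat/SnmpManager/logs/catalina.out}"
    # -F treats the marker as a literal string, -n adds line numbers.
    grep -nF -- "$marker" "$log_file"
}
```

For example, if the rule prints lines beginning with ''TRAP_RULE_TEST'', then ''show_rule_output TRAP_RULE_TEST'' lists only that output after the trap has been re-sent.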
 +
 +
 +----
 +
 +
 +===== Assessment =====
 +**Ensure you are on QA and NOT production**
 +  - Find the alarm-clear pairs for alarms for a hub in the TMobile - New York cluster for the past week
 +  - Make an edit that will add some useful information to the summary of the general trap summary
 +    - Test this change for several different variations of traps that would be affected by this change
 +    - Restore the old rule and determine it is still working
 +  - Find a rule that does not print all its varbinds. Make an edit to print each varbind name and value upon the start of processing.