Table of Contents

ATC Availability Reports

Reports

Type of Alarms

Alarms need to attend all requirements below to affect availability.

Cache

Because of the high amount of data, the query to calculate the availability for long periods can take extremely long to execute. To allow the report and widgets to run faster (and to make it easier to maintain the data consistent), a 'cache' was implemented. These are run daily and monthly and saved in the following database tables

A cron job populates these tables daily and monthly. The queries used to populate this cache is where the calculation are being done. The reports have basic queries to read from the cache tables

#Network Availability region cache */15 * * * * mysql -uroot -pPASSWORD ticketer -h atc-rds-instance-master-1.c6hdidhzxqy5.us-east-1.rds.amazonaws.com < ~/script/availability_cache_scripts/availability_by_region_cache.sql > ~/script/availability_cache_scripts/regioncachelog.txt 2>&1

#Network Availability region and carrier cache */15 * * * * mysql -uroot -PASSWORD ticketer -h atc-rds-instance-master-1.c6hdidhzxqy5.us-east-1.rds.amazonaws.com < ~/script/availability_cache_scripts/availability_by_region_and_carrier_cache.sql > ~/script/availability_cache_scripts/regionandacarriercachelog.txt 2>&1

1 0 1 * * mysql -uroot -pPASSWORD ticketer -h atc-rds-instance-master-1.c6hdidhzxqy5.us-east-1.rds.amazonaws.com < ~/script/availability_cache_scripts/availability_by_region_monthly_cache.sql > ~/script/availability_cache_scripts/regionMonthlycachelog.txt 2>&1

1 0 1 * * mysql -uroot -pPASSWORD ticketer -h atc-rds-instance-master-1.c6hdidhzxqy5.us-east-1.rds.amazonaws.com < ~/script/availability_cache_scripts/availability_by_region_and_carrier_monthly_cache.sql > ~/script/availability_cache_scripts/regionCarrierMonthlycachelog.txt 2>&1

<SET group_concat_max_len=15000; INSERT INTO reporting_manager.availability_by_region_cache SELECT

       DATE_SUB(NOW(), INTERVAL 15 MINUTE) startDate,
          fullJoinedList.network_identifier site,
          fullJoinedList.carrier,
          fullJoinedList.cluster_name region,
          fullJoinedList.vendors vendors,
          fullJoinedList.remoteCount remotes,
          fullJoinedList.elementCount,
          fullJoinedList.alarmDurationSum alarmDuration,
                  IF (alarmDurationSum IS NOT NULL and ((remoteCount * 900) - fullJoinedList.alarmDurationSum > 0),
                      (remoteCount * 900) - fullJoinedList.alarmDurationSum,
                     (remoteCount * 900)) active ,
                 (fullJoinedList.remoteCount * 900) base,
                  IF(fullJoinedList.alarmDurationSum IS NOT NULL,
                      ROUND(((fullJoinedList.remoteCount * 900) - SUM(alarmDurationSum)) * 100 / (fullJoinedList.remoteCount * 900),
                              2),
                      100) AS uptime,
                                  fullJoinedList.techType
      FROM
          (SELECT
              controller.*, SUM(tickets.alarmDuration) alarmDurationSum,
                  IF(controller.technology = 'Opto22',1,
                    IF(
                        COUNT(CASE WHEN (controller.ne_type = 'Node' or controller.ne_type = 'wifi' ) THEN 1 END) > 0,
                        COUNT(CASE WHEN (controller.ne_type = 'Node' or controller.ne_type = 'wifi' ) THEN 1 END),
                        1
                  )
                      )AS remoteCount,
              count(*) elementCount,
              controller.technologyType techType,
              group_concat(distinct controller.technology) vendors,
              group_concat(distinct controller.neId) elementIds,
              IF (SUM(tickets.alarmDuration) IS NOT NULL, ( COUNT(CASE WHEN controller.ne_type = 'Node' THEN 1 END) * 900) - SUM(tickets.alarmDuration),
                  (COUNT(CASE WHEN (controller.ne_type = 'Node') THEN 1 END) * 900)) active,
              (COUNT(CASE WHEN (controller.ne_type = 'Node') THEN 1 END) * 900) base
          FROM
                  (SELECT ne.id neId,
                      parent.id parentId,
                          IF(ne.ne_type = 'Controller', ne.name, parent.name) as pName,
                          IF(ne.ne_type = 'Controller', ne.cluster_name, parent.cluster_name) as cluster_name,
                          IF(ne.ne_type = 'Controller', ne.technology, parent.technology) as technology,
                          'System' carrier,
                          IF(ne.assigned_technology_id = 40, concat(ne.network_identifier,' Opto'), parent.network_identifier) network_identifier,
                      ne.ne_type,
                      tech.product_type as technologyType
                  FROM
              snmp_manager.network_element parent
              LEFT JOIN snmp_manager.network_element ne ON ne.parent_id = parent.id
              LEFT JOIN snmp_manager.technology tech ON tech.id = ne.assigned_technology_id
              left join snmp_manager.site_network_element  sne on sne.network_element_id = parent.id
              left join snmp_manager.network_element_property nep on (nep.site_id = sne.site_network_elements_id and nep.name = 'Asset Status')
                  WHERE
                      (parent.ne_type = 'Controller'
                          OR ne.assigned_technology_id = 40)
                      and ne.show_in_monitor = TRUE and ne.status_id in (1,5) # ATC-996532 - 'On Air' and 'Off-air accepted' states
                      AND parent.technology not in ('SYM','Westell_UDIT','Shark','Eaton','Emerson','Kentrox_RMM','n/a','Exfo','Voltserver','Mikrotik','Opto22')
                      AND ne.equipment_type is null
                      group by ne.id
          ) AS controller
      LEFT JOIN
          (SELECT
  aa.created_date,aa.cleared_date,aa.cleared_reason,
              ne.cluster_name region,
              tech.equipment_manufacturer,
              ne.name neName,
                         SUM(CASE
                      WHEN
                          aa.cleared_date IS NOT NULL
                      THEN
                          (UNIX_TIMESTAMP(LEAST(aa.cleared_date,NOW())) - UNIX_TIMESTAMP(GREATEST(aa.created_date,DATE_SUB(NOW(), INTERVAL 15 MINUTE))))
                      WHEN
                          aa.cleared_date IS NULL
                      THEN
                          (UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(GREATEST(aa.created_date,DATE_SUB(NOW(), INTERVAL 15 MINUTE))))
                  END) alarmDuration,
                  IF(ne.ne_type = 'Controller', ne.id, parent.id) controller_id,
              ne.ne_type tNeType,ne.id tNeId, ne.name,
            gts.trap_name, gts.alarm_identifier
          FROM
              snmp_manager.active_alarm aa
          LEFT JOIN  snmp_manager.general_trap_summary gts on aa.creatinggts_id = gts.id
          LEFT JOIN snmp_manager.network_element ne on ne.id = aa.network_element_id
          LEFT JOIN snmp_manager.technology tech ON tech.id = ne.assigned_technology_id
          LEFT JOIN snmp_manager.network_element parent ON (ne.parent_id = parent.id
              AND parent.ne_type = 'Controller')
          LEFT JOIN snmp_manager.site_network_element site on (ne.id = site.network_element_id or parent.id = site.network_element_id )
          LEFT JOIN                    #Needed an alias table as Null MaintenanceAlarmIdentifier value could either mean 'all alarms' or no maintenance rule  at all
            (select maintenance_rule.* ,COALESCE(alarm_identifiers_string,'ALL ALARMS') alarm_identifiers_string
                      from
                          snmp_manager.maintenance_rule
                          LEFT JOIN snmp_manager.maintenance_rule_alarm_identifiers on maintenance_rule_alarm_identifiers.maintenance_rule_id = maintenance_rule.id
                                  where
                                     cancelled_manually is false  )
              AS m ON
          (ne.id = m.network_element_id or site.site_network_elements_id = m.site_id or m.network_element_id = ne.parent_id)
          and aa.created_date BETWEEN m.start_date and m.finish_date
          and (m.alarm_identifiers_string = aa.alarm_identifier or m.alarm_identifiers_string = 'ALL ALARMS')
          LEFT JOIN snmp_manager.site_network_element  sne on sne.network_element_id = parent.id
          LEFT JOIN snmp_manager.network_element_property nep on (nep.site_id = sne.site_network_elements_id and nep.name = 'Asset Status') -- This is for the site status on polaris
          WHERE
                  parent.technology not in ('SYM','Westell_UDIT','Shark','Eaton','Emerson','Kentrox_RMM','n/a','Exfo','Voltserver','Mikrotik','Opto22')
                  AND (aa.status = 'CRITICAL' or aa.previous_status = 'CRITICAL')
                  AND (aa.cleared_date IS NULL OR aa.cleared_date >  DATE_SUB(NOW(), INTERVAL 15 MINUTE) )
                  AND aa.created_date < NOW()
                  AND m.id is null
                  AND ne.ne_type in ('node','host','wifi')
                  AND (nep.value = 'Active' or nep.value is null )
                  AND ne.show_in_monitor = TRUE
                  GROUP BY ne.id
                  ) AS tickets
                  ON tickets.tNeId = controller.neId
         group by controller.network_identifier,controller.carrier,technologyType
        ) as fullJoinedList
          GROUP BY fullJoinedList.network_identifier,fullJoinedList.carrier,technologyType

having region <> 'Admin'

      ORDER BY uptime;>

Calculation formula

( (Count of Remotes * TotalTime) - Duration of Critical Alarms) / (Count of Remotes * TotalTime) ) * 100

The above is run for each site, than it is run for the sum of all remotes in the system to calculate the overall system availability

Schedules

One for each carrier

South America