AirScan

AirScan devices are cellular-based network monitoring units deployed at customer sites. They connect to the Errigal platform over a WireGuard VPN tunnel using a cellular modem for backhaul, run an RDF Agent to execute discovery tasks from the orchestrator, and report results back for visualization and alarming.

Each device runs two main applications:

  • AirScan Modem Manager — manages the cellular modem via AT commands, maintains internet connectivity, and auto-reconnects on failure.
  • RDF Agent — polls the RDF Orchestrator for discovery tasks (topology, performance, alarms, etc.), executes them, and pushes results back.

Devices are configured and deployed through a Jenkins pipeline that reads a Google Sheet, generates Ansible inventory, and runs deployment playbooks.


Architecture

Google Sheet (config source)
    │
    ▼
Jenkins Pipeline (airscanautoconfiguration)
    │
    ├── Generates Ansible inventory from sheet
    ├── Configures WireGuard VPN tunnels
    ├── Deploys AirScan Modem Manager
    ├── Registers elements in DB (airscan_load_elements)
    └── Deploys RDF Agent
         │
         ▼
AirScan Device ──WireGuard VPN──► OAT Server ──► Orchestrator
    │                                                ▲
    └── Cellular modem (APN)    heartbeat (SNMP) ────┘

Connectivity Chain

  1. Cellular modem connects to the carrier network via an APN (managed by Modem Manager).
  2. WireGuard tunnel runs from the device through rdflb_server (jump host) to oat_server.
  3. RDF Agent on the device communicates with the orchestrator through the tunnel.
  4. RDF Agent sends SNMP heartbeat traps to SnmpManager, which monitors them and manages the network_element entry.
  5. Orchestrator manages the element entry linked via entry_point_id.

Component Interaction

┌─────────────────────────────────────────────────────┐
│                   AirScan Device                    │
│                                                     │
│  ┌──────────────────┐    ┌────────────────────────┐ │
│  │  Modem Manager   │◄───│      RDF Agent         │ │
│  │  (Flask :5000)   │    │  (Spring Boot :8081)   │ │
│  │                  │    │                        │ │
│  │  AT commands to  │    │  Polls orchestrator    │ │
│  │  cellular modem  │    │  Runs discovery tasks  │ │
│  │  Auto-reconnect  │    │  Sends SNMP heartbeats │ │
│  └───────┬──────────┘    └──────────┬─────────────┘ │
│          │                          │               │
│          ▼                          ▼               │
│  ┌──────────────┐         ┌─────────────────┐       │
│  │ Cellular     │         │ WireGuard VPN   │       │
│  │ Modem (usb0) │         │ Tunnel          │       │
│  └──────┬───────┘         └────────┬────────┘       │
└─────────┼──────────────────────────┼────────────────┘
          │                          │
          ▼                          ▼
   Carrier Network            rdflb_server (jump host)
                                     │
                                     ▼
                               oat_server
                                     │
                              tasks ↓ ↑ results + heartbeats
                                     │
                              ┌──────────────┐
                              │ Orchestrator │
                              │ SnmpManager  │
                              └──────────────┘

Components

AirScan Modem Manager

Repository: errigal/apps/airscanmodemmanager
Language: Python 3.12 / Flask 2.3
Registry: registry.errigal.com/airscan/airscanmodemmanager
Runs on: Port 5000 (host network, privileged container)

The Modem Manager controls the cellular modem on AirScan devices using AT commands over a serial interface. It stops the system's ModemManager daemon and prevents NetworkManager from managing the modem, so that the carrier connection is driven purely by AT commands for the most reliable connectivity.

How It Works

  • Device discovery: Scans /dev/ttyUSB*, sends AT to each, uses the first responding device.
  • APN configuration: Sets PDP context with AT+CGDCONT=1,"IP","{SIM_APN}" and activates with AT+CGACT=1,1.
  • Carrier selection: Auto-select with AT+COPS=0 or specific carrier with AT+COPS=1,2,"{PLMN}".
  • Auto-reconnect: Background job runs every AUTO_RECONNECT_INTERVAL seconds. Pings PING_TEST_HOST on eth0, wlan0, and the modem interface. If all fail, performs a network scan, band unlock, reconnect, and PDP reconfiguration.
  • Supported modems: Quectel (RG50xQ, RM5xxQ) and Simcom (SIM7500, SIM7600).
  • Band unlock: Simcom modems require AT+CNBP=… for 4G/5G band unlock; on Quectel modems this step is a no-op.
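The discovery and APN steps above can be sketched as follows. This is an illustrative Python sketch, not the Modem Manager's actual code: it assumes pyserial for the serial port, and the function names and baud rate are made up for the example.

```python
import glob

def build_pdp_command(apn: str, context_id: int = 1) -> str:
    """PDP context definition, as in AT+CGDCONT=1,"IP","{SIM_APN}"."""
    return f'AT+CGDCONT={context_id},"IP","{apn}"'

def probe_serial(port: str) -> bool:
    """Send a bare AT to the port and report whether the modem answers OK."""
    import serial  # pyserial — assumed available on the device
    try:
        with serial.Serial(port, 115200, timeout=2) as s:
            s.write(b"AT\r")
            return b"OK" in s.read(64)
    except serial.SerialException:
        return False

def find_modem(ports=None, probe=probe_serial):
    """Return the first /dev/ttyUSB* device that answers AT, or None."""
    for port in ports if ports is not None else sorted(glob.glob("/dev/ttyUSB*")):
        if probe(port):
            return port
    return None
```

The injectable `probe` argument is purely for testability; the real application scans and probes in one pass.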

Environment Variables

Application-level defaults are defined in the Dockerfile and app/app.py.

Deployment-time overrides are set by the Ansible role templates.

Health Check and Recovery

Docker healthcheck: ls /dev/ttyUSB* every 10 seconds, which checks that a modem device is present. An autoheal container automatically restarts the Modem Manager if the healthcheck fails.

Recovery script: A cron job runs every airscanmodemmanager_device_recovery_interval_mins minutes (default 5). It calls http://localhost:5000/status and checks last_connectivity_timestamp. If the device is unreachable for airscanmodemmanager_device_unreachable_interval_hours (default 6) and the last reboot was more than airscanmodemmanager_device_reboot_interval_hours (default 6) ago, the device is rebooted.

Recovery logs are at /var/log/airscanmodemmanager_recovery/airscanmodemmanager_recovery.log (10MB rotation, 5 files).
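The recovery decision above reduces to two timestamp checks: unreachable long enough, and not rebooted too recently. A minimal sketch assuming the default 6-hour thresholds; the function name and signature are hypothetical:

```python
from datetime import datetime, timedelta

def should_reboot(now: datetime,
                  last_connectivity: datetime,
                  last_reboot: datetime,
                  unreachable_hours: int = 6,
                  reboot_interval_hours: int = 6) -> bool:
    """Reboot only if the device has been unreachable for the full
    unreachable interval AND the last reboot is outside the cooldown."""
    unreachable_too_long = now - last_connectivity >= timedelta(hours=unreachable_hours)
    reboot_cooldown_over = now - last_reboot >= timedelta(hours=reboot_interval_hours)
    return unreachable_too_long and reboot_cooldown_over
```

The cooldown guard is what prevents a reboot loop when the modem stays down.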

Source code: bitbucket.org/errigal/airscanmodemmanager


RDF Agent

Repository: errigal/apps/rdf_agent
Language: Java 17 / Spring Boot 3.3
Registry: registry.errigal.com/rdf_agent
Runs on: Port 8081 (bound to 127.0.0.1), management on port 8080 (Actuator/Prometheus)

The RDF Agent polls the RDF Orchestrator for discovery tasks, executes them against target devices, and pushes results back. On AirScan devices it runs in privileged Docker with host networking. It has no inbound API requirement — it only needs outbound connectivity to the orchestrator.

How It Works

  • Task polling: DiscoveryTaskPoller GETs from api/v2/task every POLL_INTERVAL_MS (default 5000ms).
  • Permanent tasks: PermanentTaskPoller GETs from api/v1/permanent/tasks every POLL_FOR_PERMANENT_TASKS_MS (default 60s).
  • Task routing: IncomingRequestProcessor routes tasks to the correct processor based on discovery type and technology.
  • Result submission: OutgoingMessagePusher POSTs results to api/v2/task.
  • Status reporting: StatusReporter POSTs to api/v1/agent/status every 20 seconds with version and hostname.
  • SNMP heartbeat: SnmpTrapListener sends heartbeat traps every HEARTBEAT_INTERVAL_MS (default 60s) to SnmpManager using OID .1.3.6.1.4.1.33582.1.1.2.5.1.
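The poll → route → push cycle above can be sketched as follows. The endpoint paths come from the bullets; the task shape, processor registry, and function names are illustrative assumptions, not the agent's real Java classes.

```python
def route_task(task: dict, processors: dict):
    """Pick a processor by (discovery type, technology), mirroring the
    routing done by IncomingRequestProcessor."""
    key = (task.get("discoveryType"), task.get("technology"))
    return processors.get(key)

def poll_once(session, base_url: str, processors: dict):
    """One iteration: GET a task from api/v2/task, run it, POST the result back."""
    resp = session.get(f"{base_url}/api/v2/task")
    if resp.status_code != 200 or not resp.json():
        return None  # no task available this cycle
    task = resp.json()
    processor = route_task(task, processors)
    result = processor(task) if processor else {"error": "no processor"}
    session.post(f"{base_url}/api/v2/task", json=result)
    return result
```

The real agent runs this on a timer (POLL_INTERVAL_MS) rather than as a single call.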

AirScan-Specific Behavior

When IS_AIRSCAN=true, the agent:

  • Talks to the Modem Manager at MODEM_MANAGER_URL (default http://localhost:5000) for cellular metrics, handoff tests, and carrier operations.
  • Collects local metrics from Prometheus Node Exporter (NODE_EXPORTER_URL) via prom2json.
  • Measures bandwidth with vnstat (modem, eth0, wlan0 interfaces).
  • Runs iperf3 speed tests against a configured server.
  • Uses AirScanPerformanceProcessor and CellularProcessor for performance discovery.

Configuration

WireGuard VPN

WireGuard provides the encrypted tunnel from AirScan devices to the platform infrastructure.

Topology

AirScan Device ──► rdflb_server (jump host / WireGuard server) ──► oat_server

IP Calculation

Each device's WireGuard IP is calculated as:

wireguard_ip = {internal_subnet base}.{wireguard_peer + 1}

Example: internal_subnet=10.13.20.0, wireguard_peer=20 → wireguard_ip=10.13.20.21
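In code, the calculation is a one-liner (this sketch assumes internal_subnet is always a /24 base address, as in the example):

```python
def wireguard_ip(internal_subnet: str, wireguard_peer: int) -> str:
    """{internal_subnet base}.{wireguard_peer + 1}"""
    base = internal_subnet.rsplit(".", 1)[0]   # "10.13.20.0" -> "10.13.20"
    return f"{base}.{wireguard_peer + 1}"
```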

SSH Access

AirScan devices are not directly reachable from the corporate network. SSH access goes through rdflb_server as a jump host, then over the WireGuard tunnel to the device's internal IP.

┌──────────────┐          ┌─────────────────────┐          ┌──────────────────┐
│  Your        │   SSH    │   rdflb_server      │   SSH    │  AirScan Device  │
│  Workstation ├─────────►│   (jump host)       ├─────────►│                  │
│              │          │                     │  via WG  │                  │
│              │          │  Public/private IP  │  tunnel  │  WireGuard IP    │
│              │          │  e.g. 10.0.87.50    │          │  e.g. 10.13.20.21│
└──────────────┘          └─────────────────────┘          └──────────────────┘

Manual SSH with -J (ProxyJump):

ssh -J {rdflb_user}@{rdflb_host} {device_user}@{wireguard_ip}
 
# Example:
ssh -J admin@10.0.87.50 root@10.13.20.21

Ansible equivalent (auto-generated in inventory):

The pipeline sets ansible_ssh_common_args with -o ProxyCommand="ssh -W %h:%p -q {rdflb_user}@{rdflb_host}", which achieves the same jump transparently for all playbook runs.

Ansible Roles

  • wireguard_server — Runs WireGuard in Docker on rdflb_server, generates peer configs, distributes to clients.
  • wireguard_client — Installs WireGuard on the device, copies peer config, starts the service, writes wireguard_ip back to the Google Sheet.

Google Sheet Configuration

Sheet ID: 1j7rOK5vZhmIj84YJOGzkQ3u4_dUUftE57IbuOh3bnHo
URL: https://docs.google.com/spreadsheets/d/1j7rOK5vZhmIj84YJOGzkQ3u4_dUUftE57IbuOh3bnHo
Service account: scotty@environment-app-versions.iam.gserviceaccount.com

Tab Naming Convention

Tabs are named {customer}/{sheet_name}, where {customer} maps to a folder in env-configuration/. The tab name is used as the Jenkins source parameter.

Current tabs:

  • cts/production
  • qaatc/production
  • prodatc/production
  • qanova/errigal_demo_airscan
  • prodsco/errigal
  • prodsco/shared_access
  • blackbox/airscan

Column Reference

Column | Maps To | Used In
hostname | Ansible inventory hostname | Inventory generation
configure | If "yes", host is added to airscan/rdfagent/wireguard_client groups | Inventory generation
name_in_platform | snmp_manager.network_element.name | DB registration
private_ip | ansible_host for infrastructure servers | Inventory generation
wireguard_ip | ansible_host for airscan devices (via ProxyJump) | Inventory + DB registration
ssh_user / ssh_pass | SSH credentials for the device | Inventory generation
wireguard_peer | WireGuard peer ID (IP = internal_subnet base + peer + 1) | WireGuard config
apn | apn_name for cellular APN config | Modem Manager deployment
cluster_name | Cluster assignment in snmp_manager | DB registration
site_name | Site assignment in snmp_manager + orchestrator | DB registration
iperf3_port | Port for iperf3 testing | iperf3 config
rdf_agent_version | Target RDF Agent version (Docker tag) | RDF Agent deployment
airscan_modem_manager_version | Target Modem Manager version (Docker tag) | Modem Manager deployment

Special Rows

Row | Behavior
GLOBAL | Non-empty columns become all.vars (e.g. wireguard_network, wireguard_port, internal_subnet)
wireguard_server | WireGuard VPN server; uses private_ip as ansible_host
iperf3_server | iperf3 test server; uses private_ip as ansible_host
oat_server | Constructed from vars_for_airscan.yml (extracted from hosts.ini); wireguard_peer from the sheet

Jenkins Pipeline

Jenkinsfile: airscanautoconfiguration/Jenkinsfile

Parameters

Parameter | Default | Description
source | (job config) | Google Sheet tab name, e.g. cts/production
CONFIGURE_WIREGUARD | false | Configure WireGuard VPN on clients
CONFIGURE_AIRSCAN_MODEM_MANAGER | false | Deploy AirScan Modem Manager
CONFIGURE_RDF_AGENT | false | Deploy RDF Agent
CONFIGURE_IPERF3 | false | Configure iperf3 server

Derived Variables

envVar         = source.split('/')[0]           // e.g. "cts"
invFile        = source.split('/')[1]           // e.g. "production"
worksheet_name = source                         // e.g. "cts/production"
extraVarsLocation = "env-configuration/{envVar}/vars_for_airscan.yml"

envVar determines:

  • Which hosts.ini to use: env-configuration/{envVar}/hosts.ini
  • Which vault password credential: {envVar}_ansible_vault_pass

Pipeline Stages

# | Stage | Condition | Description
1 | Preparation | Always | Clone env-configuration (master) and deployment-playbooks (branch). Set build display name.
2 | Build Docker Image | Always | Build registry.errigal.com/airscanautoconfiguration:{tag} from airscan_config/Dockerfile.
3 | Generate Extra Vars | Always | Run airscan_extract_vars_for_ansible_autoconfig.yml against hosts.ini to produce vars_for_airscan.yml with OAT/RDFLB host, user, password, DB hosts, etc. Uses vault credential.
4 | Generate Inventory | Always | Run google_sheet_to_ansible_inv.py in Docker to read the Google Sheet tab and produce {invFile}.yml under env-configuration/{envVar}/.
5 | Remove WireGuard from OAT | Always | Run remove_wireguard_from_oat.yml on oat_server. Sets WIREGUARD_INTERFACE_EXISTS flag.
6 | Configure WireGuard on RDFLB | WIREGUARD_INTERFACE_EXISTS == true | Configure WireGuard on wireguard_server and rdflb_server.
7 | Configure WireGuard | CONFIGURE_WIREGUARD == true | Configure WireGuard for all clients except oat_server.
8 | Deploy Modem Manager | CONFIGURE_AIRSCAN_MODEM_MANAGER == true | Deploy Modem Manager to airscan hosts via airscanmodemmamanger-deploy.yml.
9 | Deploy RDF Agent | CONFIGURE_RDF_AGENT == true | Run airscan_load_elements (DB registration) then deploy RDF Agent via rdf-agent-docker-deploy.yml.
10 | Configure iperf3 | CONFIGURE_IPERF3 == true | Configure iperf3 server via generate_ansible_iperf3_config.yml.

Credentials

  • Vault password: Jenkins credential {envVar}_ansible_vault_pass for Ansible vault decryption.
  • Google API: service_account.json baked into the Docker image (service account scotty@environment-app-versions.iam.gserviceaccount.com).
  • Docker registry: errigal_docker_registry_username / errigal_docker_registry_password for registry.errigal.com.

Inventory Generation (google_sheet_to_ansible_inv.py)

The Python script:

  1. Authenticates with Google Sheets API via service_account.json.
  2. Opens the sheet tab matching SHEET_NAME (e.g. cts/production).
  3. Reads all rows; first row = headers.
  4. GLOBAL row: Non-empty columns become all.vars.
  5. Special rows (wireguard_server, iperf3_server): Use private_ip as ansible_host.
  6. oat_server / rdflb_server: Built from vars_for_airscan.yml (OAT/RDFLB credentials from hosts.ini).
  7. Device rows: Use wireguard_ip as ansible_host with ProxyJump via rdflb_server. If configure == "yes", add host to airscan, rdfagent, wireguard_client groups.
  8. Writes YAML inventory to {invFile}.yml.
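The row-handling logic in steps 4–7 can be sketched as follows. The column names match the sheet reference above; the output structure is a simplified stand-in for the real Ansible YAML inventory, and all function names are illustrative.

```python
SPECIAL_ROWS = {"wireguard_server", "iperf3_server"}
DEVICE_GROUPS = ("airscan", "rdfagent", "wireguard_client")

def build_inventory(rows):
    """rows: list of dicts keyed by sheet column headers."""
    inv = {"all": {"vars": {}, "hosts": {},
                   "children": {g: {"hosts": {}} for g in DEVICE_GROUPS}}}
    for row in rows:
        host = row["hostname"]
        if host == "GLOBAL":
            # Non-empty columns become all.vars
            inv["all"]["vars"].update(
                {k: v for k, v in row.items() if v and k != "hostname"})
        elif host in SPECIAL_ROWS:
            # Special rows use private_ip as ansible_host
            inv["all"]["hosts"][host] = {"ansible_host": row["private_ip"]}
        else:
            # Device rows use wireguard_ip as ansible_host
            inv["all"]["hosts"][host] = {"ansible_host": row["wireguard_ip"]}
            if row.get("configure") == "yes":
                for g in DEVICE_GROUPS:
                    inv["all"]["children"][g]["hosts"][host] = None
    return inv
```

The real script also attaches the ProxyJump ssh args and the oat_server/rdflb_server entries from vars_for_airscan.yml, which are omitted here.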

Ansible Roles Reference

airscan_load_elements

Path: deployment-playbooks/roles/airscan_load_elements/
Purpose: Registers AirScan devices in the SNMP Manager and Orchestrator databases.

Runs as part of: rdf-agent-docker-deploy.yml (before RDF Agent deployment, only on airscan hosts).

Database Operations

SNMP Manager (snmp_manager schema):

  1. Check if network_element exists by ip_address
  2. Insert site if missing
  3. Insert network_element (technology=AirScan, ne_type=Controller)
  4. Delete + re-insert site_network_element
  5. Insert expected_heartbeat (15-minute interval)

Orchestrator (orchestrator schema):

  1. Insert customer_site
  2. Insert agent and user_role
  3. Insert or update element (links to SNMP Manager via entry_point_id)
  4. Insert schedule (hourly)
  5. Delete old schedule_config for PERFORMANCE/POLL

API calls:

  1. Login to Orchestrator
  2. Get short install code for the agent
  3. Fetch agent install script to extract rdf_access_token

Variables

Defaults and the full list of variables used by this role are in roles/airscan_load_elements/defaults/main.yml. The SQL operations and variable usage can be seen in roles/airscan_load_elements/tasks/main.yml.

Variables come from two sources:

  • Google Sheet (via inventory) — device IP, name, cluster, site
  • vars_for_airscan.yml (generated from hosts.ini) — DB hosts, orchestrator URL, credentials

airscan_modem_manager

Path: deployment-playbooks/roles/airscan_modem_manager/
Purpose: Deploys the Modem Manager application and configures networking on the device.

Deployment Steps

  1. Stop and disable ModemManager
  2. Configure systemd-networkd for modem interface
  3. Configure NetworkManager to leave usb0 and eth0 unmanaged
  4. Create /opt/services/airscan/
  5. Render docker-compose.yml and .env from templates
  6. Docker login and pull image
  7. docker compose up -d
  8. Wait for port 5000
  9. Verify modem responds: curl http://localhost:5000/modem/at with body AT — expect OK
  10. Optionally write version to Google Sheet
  11. Install recovery cron job
  12. Install vnstat for bandwidth monitoring
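Step 9's verification can be sketched in Python. This uses requests in place of curl; the request shape is inferred from the check above and may not match the actual playbook task.

```python
def modem_answered_ok(response_text: str) -> bool:
    """The modem echoes OK when it accepts the AT command."""
    return "OK" in response_text

def verify_modem(base_url: str = "http://localhost:5000") -> bool:
    """POST a bare AT to /modem/at and accept the deployment only on OK."""
    import requests  # third-party; assumed available on the control host
    try:
        resp = requests.post(f"{base_url}/modem/at", data="AT", timeout=10)
        return resp.status_code == 200 and modem_answered_ok(resp.text)
    except requests.RequestException:
        return False
```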

Variables

Defaults and the full list of variables are in roles/airscan_modem_manager/defaults/main.yml.


rdf-agent

Path: deployment-playbooks/roles/rdf-agent/
Purpose: Deploys the RDF Agent application on AirScan (and non-AirScan) hosts.

Deployment Steps

  1. Create /opt/services/rdfagent/
  2. Render docker-compose.yml and .env from templates
  3. Docker login and pull image
  4. docker compose up -d
  5. Optionally write version to Google Sheet

AirScan vs Non-AirScan

Aspect | AirScan | Non-AirScan
Network mode | host | Bridge (ports 8080, 162/udp)
Privileged | Yes | No
IS_AIRSCAN | true | false
Volumes | /var/lib/vnstat mounted | None
SNMP listener IP | wireguard_ip | Default

Variables

Defaults and the full list of variables are in roles/rdf-agent/defaults/main.yml.


airscan_write_to_google_sheet

Path: deployment-playbooks/roles/airscan_write_to_google_sheet/
Purpose: Writes deployment results back to the Google Sheet.

Runs the update_google_sheet.py script in a one-off Docker container (airscanautoconfiguration image). Updates a single row by hostname with key-value pairs.

Used by:

  • wireguard_client — writes wireguard_ip
  • airscan_modem_manager — writes airscan_modem_manager_version_actual
  • rdf-agent — writes rdf_agent_version_actual

Database Element Registration

Initial Registration (Ansible)

When the Jenkins pipeline runs with RDF Agent deployment, the airscan_load_elements role performs:

  1. Check by IP: SELECT id FROM snmp_manager.network_element WHERE ip_address = '{wireguard_ip}'
  2. If not found: INSERT new network_element with name = '{name_in_platform}' , ip_address = '{wireguard_ip}' , technology = 'AirScan'
  3. Link to site: DELETE + re-INSERT site_network_element
  4. Add heartbeat: INSERT expected_heartbeat for monitoring (15-minute interval)
  5. Orchestrator: INSERT IGNORE customer_site, agent, element with entry_point_id = {ne_id}

The matching key is IP address (wireguard_ip), not device name. If a network_element already exists for that IP, the INSERT is skipped.
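The check-by-IP logic can be sketched as follows. The table and column names come from this page; the placeholder-style SQL, connection handling, and function names are illustrative, not the role's actual tasks.

```python
CHECK_SQL = "SELECT id FROM snmp_manager.network_element WHERE ip_address = %s"
INSERT_SQL = ("INSERT INTO snmp_manager.network_element (name, ip_address, technology) "
              "VALUES (%s, %s, 'AirScan')")

def register_element(existing_id, name_in_platform, wireguard_ip, insert):
    """Insert only when no network_element exists for this IP. The matching
    key is wireguard_ip, so a rename alone never triggers an insert, and
    the existing row's name is NOT updated."""
    if existing_id is not None:
        return existing_id
    return insert(INSERT_SQL, (name_in_platform, wireguard_ip))
```

This is exactly why a changed name_in_platform combined with an unchanged IP leaves the old name in the database (see the troubleshooting section below).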

Ongoing Sync (RDFElementSyncJob)

SnmpManager runs a scheduled sync job every 60 seconds:

  1. NetworkElement.afterUpdate() / afterInsert() writes to network_element_change_sync
  2. RDFElementSyncJob reads change records (< 5 days old, with valid IP)
  3. For each change, POSTs to orchestrator /api/v1/element/update
  4. Orchestrator ElementService.updateElement() finds element by entry_point_id
  5. If found: updates IPs, credentials, technology, onAir status
  6. If not found: creates new element

Key code paths:

  • Change trigger: snmpmanager_grails3/…/domain/…/NetworkElement.groovy (afterUpdate, afterInsert, addChangeSyncRecord)
  • Sync service: snmpmanager_grails3/…/services/…/RdfElementSyncService.groovy
  • Orchestrator handler: rdf_orchestrator/…/service/element/ElementService.java

Known Duplicate Element Issue

The orchestrator's unique constraint is on (entry_point_id, customer_site_id), not on entry_point_id alone. For AirScan elements, task_processing_agent_override should always be set: it ties the element directly to the agent running on that device, so the element's tasks are processed on the correct agent, and customer_site_id is then not used for agent routing. This combination can cause problems:

  • The sync job's findByEntryPointId() does not filter by customer_site_id
  • If multiple elements exist for the same entry_point_id with different customer_site_id, the lookup may return an unpredictable one
  • The Ansible INSERT IGNORE can silently fail if a row already exists with different data
  • If the customer_site_id mapping changes (e.g. site renamed), a new element can be created alongside the old one

Modem Manager API Reference

All endpoints are served on port 5000. Routes are defined using Flask-Classful across three files in the airscanmodemmanager repository:

Route base | Source file | Description
/ | app/app.py | Root routes: health check (/), status (/status)
/modem/ | app/modem/modemApi.py | Modem operations: AT commands, carrier connect, handoff, ICCID, signal info, band unlock
/system/ | app/system/systemApi.py | System utilities: ping via specific interface

Deployment

Deploying a New AirScan Device

  1. Add device to Google Sheet: Add a row in the appropriate tab (e.g. cts/production) with hostname, configure=yes, wireguard_peer, apn, name_in_platform, cluster_name, site_name, and desired versions.
  2. Run Jenkins pipeline with source matching the sheet tab. Enable all relevant parameters:
    • CONFIGURE_WIREGUARD=true — sets up VPN tunnel
    • CONFIGURE_AIRSCAN_MODEM_MANAGER=true — deploys modem manager
    • CONFIGURE_RDF_AGENT=true — registers DB elements and deploys agent
  3. Verify:
    • WireGuard tunnel is up: sudo wg show on rdflb_server
    • Modem Manager responds: curl http://{wireguard_ip}:5000/modem/at (via tunnel)
    • RDF Agent container running: docker ps | grep rdf on the device
    • Element exists in DB: check snmp_manager.network_element and orchestrator.element

Updating Application Versions

  1. Update rdf_agent_version or airscan_modem_manager_version in the Google Sheet row.
  2. Run Jenkins pipeline with the appropriate parameter enabled (CONFIGURE_RDF_AGENT or CONFIGURE_AIRSCAN_MODEM_MANAGER).
  3. The role pulls the new image, restarts the container, and writes the actual deployed version back to the sheet (rdf_agent_version_actual / airscan_modem_manager_version_actual).

Docker Images

Image | Registry Path | Build
Modem Manager | registry.errigal.com/airscan/airscanmodemmanager:{version} | Jenkins (jenkinsCommon), multi-arch (amd64, arm64, arm/v8)
RDF Agent | registry.errigal.com/rdf_agent:{version} | Drone CI, JAR uploaded to S3
Autoconfiguration | registry.errigal.com/airscanautoconfiguration:{tag} | Built during pipeline run
Ansible runner | registry.errigal.com/ansibledockerimage:latest | Pre-built image for running playbooks

Troubleshooting

Device Not Connecting

Check in order:

1. WireGuard Tunnel

SSH to rdflb_server and check if the device's peer is active:

sudo wg show
# Look for the device's peer — check "latest handshake" time

The device's WireGuard IP: {internal_subnet base}.{wireguard_peer + 1}
Example: internal_subnet=10.13.20.0, wireguard_peer=20 → IP 10.13.20.21

2. Google Sheet

Check the sheet tab for the customer:

  • Is the device listed with configure = yes?
  • Is name_in_platform correct and matching the DB?
  • Are wireguard_ip and wireguard_peer correct?

3. Database: snmp_manager.network_element

-- Find the device by IP
SELECT id, name, ip_address, on_air, cluster_name, site_name
FROM snmp_manager.network_element
WHERE ip_address = '{wireguard_ip}';
 
-- Check for duplicates by name
SELECT id, name, ip_address, on_air, cluster_name
FROM snmp_manager.network_element
WHERE name LIKE '%{device_identifier}%';

4. Database: orchestrator.element

-- Find element by entry_point_id (= network_element.id)
SELECT e.id, e.entry_point_id, e.external_ip, e.internal_ip, e.on_air,
       e.customer_site_id, e.task_processing_agent_override
FROM orchestrator.element e
WHERE e.entry_point_id = {ne_id};
 
-- Check for duplicate elements by IP
SELECT e.id, e.entry_point_id, e.external_ip, cs.name AS site_name
FROM orchestrator.element e
JOIN orchestrator.customer_site cs ON e.customer_site_id = cs.id
WHERE e.external_ip = '{wireguard_ip}';

5. Container Status on Device

SSH to the device (via ProxyJump through rdflb_server):

# Check both containers
docker ps

# RDF Agent logs
docker logs rdfagent

# Modem Manager logs
docker logs airscanmodemmanager

6. Heartbeat Monitoring

SELECT * FROM snmp_manager.expected_heartbeat
WHERE network_element_id = {ne_id};

Device Name Changed — Element Mismatch

Symptom: Customer changed the device name in the platform UI. Device may stop working or show stale data.

What happens:

  1. network_element name changes
  2. afterUpdate() fires, creating a network_element_change_sync record
  3. Sync job pushes updated data to orchestrator
  4. Orchestrator matches by entry_point_id (not name), so the element updates correctly

However, if the pipeline is re-run with a different name_in_platform:

  • The playbook checks by IP address, not name
  • If the IP exists, it skips the INSERT (existing element is reused)
  • The name in network_element is NOT updated by the playbook

If duplicates exist:

-- Find duplicate network_elements
SELECT id, name, ip_address, on_air, cluster_name
FROM snmp_manager.network_element
WHERE technology = 'AirScan'
  AND (name LIKE '%{old_name}%' OR name LIKE '%{new_name}%' OR ip_address = '{wireguard_ip}');
 
-- Find duplicate orchestrator elements
SELECT e.id, e.entry_point_id, e.external_ip, e.on_air, e.customer_site_id,
       cs.name AS site_name
FROM orchestrator.element e
LEFT JOIN orchestrator.customer_site cs ON e.customer_site_id = cs.id
WHERE e.external_ip = '{wireguard_ip}'
   OR e.entry_point_id IN (
       SELECT id FROM snmp_manager.network_element
       WHERE name LIKE '%{old_name}%' OR name LIKE '%{new_name}%'
   );

The correct element should have:

  • entry_point_id matching the snmp_manager.network_element.id for that IP
  • customer_site_id matching the correct site
  • task_processing_agent_override pointing to the correct agent

Update name_in_platform in the Google Sheet to match, then re-run the pipeline if needed.


Common Pipeline Failures

Failure | Cause | Fix
Vault password error | {envVar}_ansible_vault_pass missing in Jenkins | Add credential in Jenkins
Sheet access denied | Service account lacks access | Share sheet with scotty@environment-app-versions.iam.gserviceaccount.com
Tab not found | source param doesn't match sheet tab name | Verify tab name matches {customer}/{sheet_name} exactly
Missing host groups | hosts.ini lacks rdf-orchestrator, rdf-lb, etc. | Update env-configuration/{envVar}/hosts.ini
WireGuard timeout | Peer unreachable or interface down | Check wireguard_server, peer config, firewall
Element INSERT fails | Cluster or site doesn't exist in DB | Create cluster/site first, or check cluster_name/site_name in sheet
Docker pull fails | Registry auth or image not found | Check errigal_docker_registry credentials and image tag

Useful Log Locations

Component | Location
SnmpManager | Application logs — search for RDFElementSync entries
Orchestrator | Application logs — search for Updating Element with EntryPointId
Jenkins | Build console output — includes debug from inventory generation
WireGuard | sudo wg show on wireguard_server, or journalctl -u wg-quick@{interface}
Modem Manager | docker logs on device, or /var/log/airscanmodemmanager_recovery/ for recovery
RDF Agent | docker logs on device, or file at LOGGING_FILE_PATH

Customer Environment Reference

Each customer has:

  • env-configuration/{customer}/hosts.ini — main infrastructure inventory
  • env-configuration/{customer}/group_vars/all/30_all.yml — environment variables
  • Google Sheet tab {customer}/… — dynamic AirScan configuration

The pipeline generates a dynamic inventory at env-configuration/{customer}/{invFile}.yml from the Google Sheet.

CTS Example

Item | Value
hosts.ini | env-configuration/cts/hosts.ini — defines ctsapps1/2, ctslb1, ctsoat1/2, ctsesk1/2
Google Sheet tab | cts/production — dynamic config with EAS-prefixed hostnames
OAT servers | ctsoat1 (10.0.87.65), ctsoat2 (10.0.87.115)
DB host | cts-master-prod.cl0y2kknu458.us-east-1.rds.amazonaws.com
WireGuard | port 51822, network "cts", subnet 10.13.20.0
Orchestrator URL | http://10.13.20.2:8079
toolsandtechnologies/airscan.txt · Last modified: 2026/03/16 13:59 by 10.91.120.100