AirScan devices are cellular-based network monitoring units deployed at customer sites. They connect to the Errigal platform over a WireGuard VPN tunnel using a cellular modem for backhaul, run an RDF Agent to execute discovery tasks from the orchestrator, and report results back for visualization and alarming.
Each device runs two main applications: the AirScan Modem Manager and the RDF Agent.
Devices are configured and deployed through a Jenkins pipeline that reads a Google Sheet, generates Ansible inventory, and runs deployment playbooks.
Google Sheet (config source)
│
▼
Jenkins Pipeline (airscanautoconfiguration)
│
├── Generates Ansible inventory from sheet
├── Configures WireGuard VPN tunnels
├── Deploys AirScan Modem Manager
├── Registers elements in DB (airscan_load_elements)
└── Deploys RDF Agent
│
▼
AirScan Device ──WireGuard VPN──► OAT Server ──► Orchestrator
│ ▲
└── Cellular modem (APN) heartbeat (SNMP) ────┘
Device traffic reaches oat_server via rdflb_server (jump host); DB registration creates a snmp_manager.network_element entry and an orchestrator.element entry linked via entry_point_id.

┌─────────────────────────────────────────────────────┐
│ AirScan Device │
│ │
│ ┌──────────────────┐ ┌────────────────────────┐ │
│ │ Modem Manager │◄───│ RDF Agent │ │
│ │ (Flask :5000) │ │ (Spring Boot :8081) │ │
│ │ │ │ │ │
│ │ AT commands to │ │ Polls orchestrator │ │
│ │ cellular modem │ │ Runs discovery tasks │ │
│ │ Auto-reconnect │ │ Sends SNMP heartbeats │ │
│ └───────┬──────────┘ └──────────┬─────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌─────────────────┐ │
│ │ Cellular │ │ WireGuard VPN │ │
│ │ Modem (usb0) │ │ Tunnel │ │
│ └──────┬───────┘ └────────┬────────┘ │
└─────────┼──────────────────────────┼────────────────┘
│ │
▼ ▼
Carrier Network rdflb_server (jump host)
│
▼
oat_server
│
tasks ↓ ↑ results + heartbeats
│
┌──────────────┐
│ Orchestrator │
│ SnmpManager │
└──────────────┘
Repository: errigal/apps/airscanmodemmanager
Language: Python 3.12 / Flask 2.3
Registry: registry.errigal.com/airscan/airscanmodemmanager
Runs on: Port 5000 (host network, privileged container)
The Modem Manager controls the cellular modem on AirScan devices using AT commands over a serial interface. It disables ModemManager and prevents NetworkManager from managing the modem, relying on direct AT commands for more reliable carrier connectivity.
The Modem Manager:

- Scans /dev/ttyUSB*, sends AT to each, and uses the first responding device.
- Configures the PDP context with AT+CGDCONT=1,"IP","{SIM_APN}" and activates it with AT+CGACT=1,1.
- Selects a carrier automatically with AT+COPS=0 or a specific carrier with AT+COPS=1,2,"{PLMN}".
- Auto-reconnects every AUTO_RECONNECT_INTERVAL seconds: it pings PING_TEST_HOST on eth0, wlan0, and the modem interface; if all fail, it performs a network scan, band unlock, reconnect, and PDP reconfiguration.
- Uses AT+CNBP=… for 4G/5G band unlock; on Quectel modems this is a no-op.

Application-level defaults are defined in the Dockerfile and app/app.py.
Deployment-time overrides are set by the Ansible role templates:
Docker healthcheck: ls /dev/ttyUSB* every 10 seconds — checks that the modem device is present. An autoheal container automatically restarts the Modem Manager if the healthcheck fails.
Recovery script: A cron job runs every airscanmodemmanager_device_recovery_interval_mins minutes (default 5). It calls http://localhost:5000/status and checks last_connectivity_timestamp. If the device is unreachable for airscanmodemmanager_device_unreachable_interval_hours (default 6) and the last reboot was more than airscanmodemmanager_device_reboot_interval_hours (default 6) ago, the device is rebooted.
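The reboot decision above reduces to two time checks. A minimal sketch, assuming the documented default thresholds (`should_reboot` is illustrative, not the actual recovery script):

```python
from datetime import datetime, timedelta

# Defaults documented for the role (both in hours):
UNREACHABLE_HOURS = 6      # airscanmodemmanager_device_unreachable_interval_hours
REBOOT_COOLDOWN_HOURS = 6  # airscanmodemmanager_device_reboot_interval_hours

def should_reboot(last_connectivity: datetime, last_reboot: datetime,
                  now: datetime) -> bool:
    """Reboot only if unreachable past the threshold AND the reboot cooldown elapsed."""
    unreachable = now - last_connectivity > timedelta(hours=UNREACHABLE_HOURS)
    cooled_down = now - last_reboot > timedelta(hours=REBOOT_COOLDOWN_HOURS)
    return unreachable and cooled_down

now = datetime(2024, 1, 1, 12, 0)
print(should_reboot(now - timedelta(hours=7), now - timedelta(hours=7), now))  # True
```

The cooldown check prevents a reboot loop when connectivity does not come back after a restart.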
Recovery logs are at /var/log/airscanmodemmanager_recovery/airscanmodemmanager_recovery.log (10MB rotation, 5 files).
Source code: bitbucket.org/errigal/airscanmodemmanager
Repository: errigal/apps/rdf_agent
Language: Java 17 / Spring Boot 3.3
Registry: registry.errigal.com/rdf_agent
Runs on: Port 8081 (bound to 127.0.0.1), management on port 8080 (Actuator/Prometheus)
The RDF Agent polls the RDF Orchestrator for discovery tasks, executes them against target devices, and pushes results back. On AirScan devices it runs in privileged Docker with host networking. It has no inbound API requirement — it only needs outbound connectivity to the orchestrator.
Main components:

- DiscoveryTaskPoller GETs from api/v2/task every POLL_INTERVAL_MS (default 5000 ms).
- PermanentTaskPoller GETs from api/v1/permanent/tasks every POLL_FOR_PERMANENT_TASKS_MS (default 60 s).
- IncomingRequestProcessor routes tasks to the correct processor based on discovery type and technology.
- OutgoingMessagePusher POSTs results to api/v2/task.
- StatusReporter POSTs to api/v1/agent/status every 20 seconds with version and hostname.
- SnmpTrapListener sends heartbeat traps every HEARTBEAT_INTERVAL_MS (default 60 s) to SnmpManager using OID .1.3.6.1.4.1.33582.1.1.2.5.1.
When IS_AIRSCAN=true, the agent:
- Calls the Modem Manager (MODEM_MANAGER_URL, default http://localhost:5000) for cellular metrics, handoff tests, and carrier operations.
- Scrapes node metrics (NODE_EXPORTER_URL) via prom2json.
- Collects vnstat traffic statistics (modem, eth0, wlan0 interfaces).
- Runs iperf3 speed tests against a configured server.
- Enables AirScanPerformanceProcessor and CellularProcessor for performance discovery.

Application-level defaults are in src/main/resources/application.properties.
Deployment-time overrides are set by the Ansible role templates:
Source code: bitbucket.org/errigal/rdf_agent
WireGuard provides the encrypted tunnel from AirScan devices to the platform infrastructure.
AirScan Device ──► rdflb_server (jump host / WireGuard server) ──► oat_server
Each device's WireGuard IP is calculated as:
wireguard_ip = {internal_subnet base}.{wireguard_peer + 1}
Example: internal_subnet=10.13.20.0, wireguard_peer=20 → wireguard_ip=10.13.20.21
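The address derivation above, sketched in Python (the `wireguard_ip` helper is illustrative; the pipeline performs the equivalent arithmetic during inventory generation):

```python
def wireguard_ip(internal_subnet: str, wireguard_peer: int) -> str:
    """Derive a device's WireGuard IP: {internal_subnet base}.{wireguard_peer + 1}."""
    base = internal_subnet.rsplit(".", 1)[0]  # "10.13.20.0" -> "10.13.20"
    return f"{base}.{wireguard_peer + 1}"

print(wireguard_ip("10.13.20.0", 20))  # -> 10.13.20.21
```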
AirScan devices are not directly reachable from the corporate network. SSH access goes through rdflb_server as a jump host, then over the WireGuard tunnel to the device's internal IP.
┌──────────────┐          ┌─────────────────────┐          ┌──────────────────┐
│ Your         │   SSH    │ rdflb_server        │   SSH    │ AirScan Device   │
│ Workstation  ├─────────►│ (jump host)         ├─────────►│                  │
│              │          │                     │  via WG  │                  │
│              │          │ Public/private IP   │  tunnel  │ WireGuard IP     │
│              │          │ e.g. 10.0.87.50     │          │ e.g. 10.13.20.21 │
└──────────────┘          └─────────────────────┘          └──────────────────┘
Manual SSH with -J (ProxyJump):
ssh -J {rdflb_user}@{rdflb_host} {device_user}@{wireguard_ip}
# Example:
ssh -J admin@10.0.87.50 root@10.13.20.21
Ansible equivalent (auto-generated in inventory):
The pipeline sets ansible_ssh_common_args with -o ProxyCommand="ssh -W %h:%p -q {rdflb_user}@{rdflb_host}", which achieves the same jump transparently for all playbook runs.
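The string the pipeline writes can be sketched as a one-line formatter (the `ssh_common_args` helper is illustrative, not a function from the inventory script):

```python
def ssh_common_args(rdflb_user: str, rdflb_host: str) -> str:
    """Build the ProxyCommand value written into ansible_ssh_common_args."""
    # -W %h:%p makes the jump host forward stdio to the final host:port.
    return f'-o ProxyCommand="ssh -W %h:%p -q {rdflb_user}@{rdflb_host}"'

print(ssh_common_args("admin", "10.0.87.50"))
# -o ProxyCommand="ssh -W %h:%p -q admin@10.0.87.50"
```

ProxyCommand with `ssh -W` is the pre-ProxyJump spelling of the same jump that `ssh -J` performs interactively.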
- wireguard_server — Runs WireGuard in Docker on rdflb_server, generates peer configs, distributes them to clients.
- wireguard_client — Installs WireGuard on the device, copies the peer config, starts the service, and writes wireguard_ip back to the Google Sheet.
Sheet ID: 1j7rOK5vZhmIj84YJOGzkQ3u4_dUUftE57IbuOh3bnHo
URL: https://docs.google.com/spreadsheets/d/1j7rOK5vZhmIj84YJOGzkQ3u4_dUUftE57IbuOh3bnHo
Service account: scotty@environment-app-versions.iam.gserviceaccount.com
Tabs are named {customer}/{sheet_name}, where {customer} maps to a folder in env-configuration/. The tab name is used as the Jenkins source parameter.
Current tabs:
- cts/production
- qaatc/production
- prodatc/production
- qanova/errigal_demo_airscan
- prodsco/errigal
- prodsco/shared_access
- blackbox/airscan

| Column | Maps To | Used In |
|---|---|---|
| hostname | Ansible inventory hostname | Inventory generation |
| configure | If "yes", host is added to airscan/rdfagent/wireguard_client groups | Inventory generation |
| name_in_platform | snmp_manager.network_element.name | DB registration |
| private_ip | ansible_host for infrastructure servers | Inventory generation |
| wireguard_ip | ansible_host for airscan devices (via ProxyJump) | Inventory + DB registration |
| ssh_user / ssh_pass | SSH credentials for the device | Inventory generation |
| wireguard_peer | WireGuard peer ID (IP = internal_subnet base + peer + 1) | WireGuard config |
| apn | apn_name for cellular APN config | Modem Manager deployment |
| cluster_name | Cluster assignment in snmp_manager | DB registration |
| site_name | Site assignment in snmp_manager + orchestrator | DB registration |
| iperf3_port | Port for iperf3 testing | iperf3 config |
| rdf_agent_version | Target RDF Agent version (Docker tag) | RDF Agent deployment |
| airscan_modem_manager_version | Target Modem Manager version (Docker tag) | Modem Manager deployment |
| Row | Behavior |
|---|---|
| GLOBAL | Non-empty columns become all.vars (e.g. wireguard_network, wireguard_port, internal_subnet) |
| wireguard_server | WireGuard VPN server; uses private_ip as ansible_host |
| iperf3_server | iperf3 test server; uses private_ip as ansible_host |
| oat_server | Constructed from vars_for_airscan.yml (extracted from hosts.ini); wireguard_peer from the sheet |
Jenkinsfile: airscanautoconfiguration/Jenkinsfile
| Parameter | Default | Description |
|---|---|---|
source | (job config) | Google Sheet tab name, e.g. cts/production |
CONFIGURE_WIREGUARD | false | Configure WireGuard VPN on clients |
CONFIGURE_AIRSCAN_MODEM_MANAGER | false | Deploy AirScan Modem Manager |
CONFIGURE_RDF_AGENT | false | Deploy RDF Agent |
CONFIGURE_IPERF3 | false | Configure iperf3 server |
envVar = source.split('/')[0] // e.g. "cts"
invFile = source.split('/')[1] // e.g. "production"
worksheet_name = source // e.g. "cts/production"
extraVarsLocation = "env-configuration/{envVar}/vars_for_airscan.yml"
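The same parameter derivation, sketched in Python for reference (the Jenkinsfile does this in Groovy; `derive_params` is an illustrative name):

```python
def derive_params(source: str) -> dict:
    """Split the Jenkins 'source' parameter into the values the pipeline uses."""
    env_var, inv_file = source.split("/", 1)  # "cts/production" -> ("cts", "production")
    return {
        "envVar": env_var,
        "invFile": inv_file,
        "worksheet_name": source,
        "extraVarsLocation": f"env-configuration/{env_var}/vars_for_airscan.yml",
    }

print(derive_params("cts/production")["extraVarsLocation"])
# env-configuration/cts/vars_for_airscan.yml
```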
envVar determines:
- Which hosts.ini to use: env-configuration/{envVar}/hosts.ini
- Which vault credential to use: {envVar}_ansible_vault_pass

| # | Stage | Condition | Description |
|---|---|---|---|
| 1 | Preparation | Always | Clone env-configuration (master) and deployment-playbooks (branch). Set build display name. |
| 2 | Build Docker Image | Always | Build registry.errigal.com/airscanautoconfiguration:{tag} from airscan_config/Dockerfile. |
| 3 | Generate Extra Vars | Always | Run airscan_extract_vars_for_ansible_autoconfig.yml against hosts.ini to produce vars_for_airscan.yml with OAT/RDFLB host, user, password, DB hosts, etc. Uses vault credential. |
| 4 | Generate Inventory | Always | Run google_sheet_to_ansible_inv.py in Docker to read Google Sheet tab and produce {invFile}.yml under env-configuration/{envVar}/. |
| 5 | Remove WireGuard from OAT | Always | Run remove_wireguard_from_oat.yml on oat_server. Sets WIREGUARD_INTERFACE_EXISTS flag. |
| 6 | Configure WireGuard on RDFLB | WIREGUARD_INTERFACE_EXISTS == true | Configure WireGuard on wireguard_server and rdflb_server. |
| 7 | Configure WireGuard | CONFIGURE_WIREGUARD == true | Configure WireGuard for all clients except oat_server. |
| 8 | Deploy Modem Manager | CONFIGURE_AIRSCAN_MODEM_MANAGER == true | Deploy Modem Manager to airscan hosts via airscanmodemmamanger-deploy.yml. |
| 9 | Deploy RDF Agent | CONFIGURE_RDF_AGENT == true | Run airscan_load_elements (DB registration) then deploy RDF Agent via rdf-agent-docker-deploy.yml. |
| 10 | Configure iperf3 | CONFIGURE_IPERF3 == true | Configure iperf3 server via generate_ansible_iperf3_config.yml. |
Credentials used:

- {envVar}_ansible_vault_pass for Ansible vault decryption.
- service_account.json baked into the Docker image (service account scotty@environment-app-versions.iam.gserviceaccount.com).
- errigal_docker_registry_username / errigal_docker_registry_password for registry.errigal.com.

The Python script:
1. Authenticates with service_account.json.
2. Opens the tab named SHEET_NAME (e.g. cts/production).
3. Turns the GLOBAL row's non-empty columns into all.vars.
4. For infrastructure rows (wireguard_server, iperf3_server): uses private_ip as ansible_host.
5. Merges vars_for_airscan.yml (OAT/RDFLB credentials from hosts.ini).
6. For device rows: uses wireguard_ip as ansible_host with ProxyJump via rdflb_server. If configure == "yes", adds the host to the airscan, rdfagent, and wireguard_client groups.
7. Writes {invFile}.yml.
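The per-device mapping can be sketched as a row-to-host transform. This is a minimal illustration, not the actual google_sheet_to_ansible_inv.py code (`row_to_host` and its argument names are assumptions):

```python
def row_to_host(row: dict, rdflb_user: str, rdflb_host: str) -> dict:
    """Map one device row from the sheet to Ansible host vars (illustrative)."""
    host_vars = {
        "ansible_host": row["wireguard_ip"],
        "ansible_user": row["ssh_user"],
        # ProxyJump via rdflb_server, as the pipeline configures it:
        "ansible_ssh_common_args":
            f'-o ProxyCommand="ssh -W %h:%p -q {rdflb_user}@{rdflb_host}"',
    }
    # Only rows marked configure == "yes" join the deployment groups.
    groups = (["airscan", "rdfagent", "wireguard_client"]
              if row.get("configure") == "yes" else [])
    return {"hostname": row["hostname"], "vars": host_vars, "groups": groups}

host = row_to_host(
    {"hostname": "EAS-001", "wireguard_ip": "10.13.20.21",
     "ssh_user": "root", "configure": "yes"},
    "admin", "10.0.87.50")
print(host["groups"])  # ['airscan', 'rdfagent', 'wireguard_client']
```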
Path: deployment-playbooks/roles/airscan_load_elements/
Purpose: Registers AirScan devices in the SNMP Manager and Orchestrator databases.
Runs as part of: rdf-agent-docker-deploy.yml (before RDF Agent deployment, only on airscan hosts).
SNMP Manager (snmp_manager schema):
- Checks whether a network_element exists by ip_address
- Creates the site if missing
- Inserts network_element (technology=AirScan, ne_type=Controller)
- Inserts site_network_element
- Inserts expected_heartbeat (15-minute interval)
Orchestrator (orchestrator schema):
- Creates customer_site
- Creates agent and user_role
- Creates element (links to SNMP Manager via entry_point_id)
- Creates schedule (hourly)
- Creates schedule_config for PERFORMANCE/POLL

API calls:

- Obtains an rdf_access_token for the orchestrator API

Defaults and the full list of variables used by this role are in roles/airscan_load_elements/defaults/main.yml. The SQL operations and variable usage can be seen in roles/airscan_load_elements/tasks/main.yml.
Variables come from two sources:

- vars_for_airscan.yml (generated from hosts.ini) — DB hosts, orchestrator URL, credentials
- the generated inventory from the Google Sheet: per-device values (wireguard_ip, name_in_platform, cluster_name, site_name)
Path: deployment-playbooks/roles/airscan_modem_manager/
Purpose: Deploys the Modem Manager application and configures networking on the device.
The role:

- Marks usb0 and eth0 unmanaged in NetworkManager
- Renders /opt/services/airscan/docker-compose.yml and .env from templates
- Starts the container with docker compose up -d
- Verifies with curl http://localhost:5000/modem/at with body AT (expect OK)

Defaults and the full list of variables are in roles/airscan_modem_manager/defaults/main.yml.
Path: deployment-playbooks/roles/rdf-agent/
Purpose: Deploys the RDF Agent application on AirScan (and non-AirScan) hosts.
The role:

- Renders /opt/services/rdfagent/docker-compose.yml and .env from templates
- Starts the container with docker compose up -d

| Aspect | AirScan | Non-AirScan |
|---|---|---|
| Network mode | host | Bridge (ports 8080, 162/udp) |
| Privileged | Yes | No |
| IS_AIRSCAN | true | false |
| Volumes | /var/lib/vnstat mounted | None |
| SNMP listener IP | wireguard_ip | Default |
Defaults and the full list of variables are in roles/rdf-agent/defaults/main.yml.
Path: deployment-playbooks/roles/airscan_write_to_google_sheet/
Purpose: Writes deployment results back to the Google Sheet.
Runs the update_google_sheet.py script in a one-off Docker container (airscanautoconfiguration image). Updates a single row by hostname with key-value pairs.
Used by:
- wireguard_client — writes wireguard_ip
- airscan_modem_manager — writes airscan_modem_manager_version_actual
- rdf-agent — writes rdf_agent_version_actual
When the Jenkins pipeline runs with RDF Agent deployment, the airscan_load_elements role performs:
1. SELECT id FROM snmp_manager.network_element WHERE ip_address = '{wireguard_ip}'
2. If no row exists, INSERT network_element with name = '{name_in_platform}', ip_address = '{wireguard_ip}', technology = 'AirScan'
3. INSERT site_network_element
4. INSERT expected_heartbeat for monitoring (15-minute interval)
5. In the orchestrator: customer_site, agent, and element with entry_point_id = {ne_id}
The matching key is IP address (wireguard_ip), not device name. If a network_element already exists for that IP, the INSERT is skipped.
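The IP-keyed idempotency can be sketched with an in-memory stand-in for the table. This is illustrative only (`register_network_element` and the dict-as-DB are assumptions, not role code); note the second run does NOT update the name, which is the re-run pitfall described below:

```python
def register_network_element(db: dict, wireguard_ip: str, name_in_platform: str) -> int:
    """Insert keyed by IP; skip the INSERT if a row already exists (illustrative)."""
    existing = db.get(wireguard_ip)  # stands in for the SELECT ... WHERE ip_address = ...
    if existing is not None:
        return existing["id"]        # name is NOT updated on re-runs
    ne_id = max((r["id"] for r in db.values()), default=0) + 1
    db[wireguard_ip] = {"id": ne_id, "name": name_in_platform,
                        "technology": "AirScan"}
    return ne_id

db = {}
first = register_network_element(db, "10.13.20.21", "EAS-001")
second = register_network_element(db, "10.13.20.21", "EAS-001-renamed")
print(first == second, db["10.13.20.21"]["name"])  # True EAS-001
```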
SnmpManager runs a scheduled sync job every 60 seconds:
1. NetworkElement.afterUpdate() / afterInsert() writes to network_element_change_sync
2. RDFElementSyncJob reads change records (< 5 days old, with valid IP)
3. It calls the orchestrator's /api/v1/element/update
4. ElementService.updateElement() finds the element by entry_point_id

Key code paths:
- snmpmanager_grails3/…/domain/…/NetworkElement.groovy (afterUpdate, afterInsert, addChangeSyncRecord)
- snmpmanager_grails3/…/services/…/RdfElementSyncService.groovy
- rdf_orchestrator/…/service/element/ElementService.java
The orchestrator's unique constraint is on (entry_point_id, customer_site_id) — not just entry_point_id. For AirScan elements, task_processing_agent_override should always be set — this ensures a direct correlation between the agent and the element so that tasks for the element are processed on the correct agent running on that device. When the override is set, customer_site_id is not used for agent routing. This can cause problems:
- findByEntryPointId() does not filter by customer_site_id
- If multiple elements share an entry_point_id with different customer_site_id values, the lookup may return an unpredictable one
- INSERT IGNORE can silently fail if a row already exists with different data
- If the customer_site_id mapping changes (e.g. a site is renamed), a new element can be created alongside the old one

All endpoints are served on port 5000. Routes are defined using Flask-Classful across three files in the airscanmodemmanager repository:
| Route base | Source file | Description |
|---|---|---|
| / | app/app.py | Root routes: health check (/), status (/status) |
| /modem/ | app/modem/modemApi.py | Modem operations: AT commands, carrier connect, handoff, ICCID, signal info, band unlock |
| /system/ | app/system/systemApi.py | System utilities: ping via specific interface |
To onboard a new device:

1. Add a row to the sheet tab (e.g. cts/production) with hostname, configure=yes, wireguard_peer, apn, name_in_platform, cluster_name, site_name, and desired versions.
2. Run the Jenkins pipeline with source matching the sheet tab. Enable all relevant parameters:
   - CONFIGURE_WIREGUARD=true — sets up VPN tunnel
   - CONFIGURE_AIRSCAN_MODEM_MANAGER=true — deploys modem manager
   - CONFIGURE_RDF_AGENT=true — registers DB elements and deploys agent
3. Verify:
   - sudo wg show on rdflb_server
   - curl http://{wireguard_ip}:5000/modem/at (via tunnel)
   - docker ps | grep rdf on the device
   - Rows in snmp_manager.network_element and orchestrator.element

To upgrade a device:

1. Update rdf_agent_version or airscan_modem_manager_version in the Google Sheet row.
2. Re-run the pipeline with the matching parameter (CONFIGURE_RDF_AGENT or CONFIGURE_AIRSCAN_MODEM_MANAGER).
3. Check the versions written back to the sheet (rdf_agent_version_actual / airscan_modem_manager_version_actual).

| Image | Registry Path | Build |
|---|---|---|
| Modem Manager | registry.errigal.com/airscan/airscanmodemmanager:{version} | Jenkins (jenkinsCommon), multi-arch (amd64, arm64, arm/v8) |
| RDF Agent | registry.errigal.com/rdf_agent:{version} | Drone CI, JAR uploaded to S3 |
| Autoconfiguration | registry.errigal.com/airscanautoconfiguration:{tag} | Built during pipeline run |
| Ansible runner | registry.errigal.com/ansibledockerimage:latest | Pre-built image for running playbooks |
Check in order:
SSH to rdflb_server and check if the device's peer is active:
sudo wg show
# Look for the device's peer — check "latest handshake" time
The device's WireGuard IP: {internal_subnet base}.{wireguard_peer + 1}
Example: internal_subnet=10.13.20.0, wireguard_peer=20 → IP 10.13.20.21
Check the sheet tab for the customer:
- configure = yes?
- name_in_platform correct and matching the DB?
- wireguard_ip and wireguard_peer correct?

Check the SNMP Manager DB:

-- Find the device by IP
SELECT id, name, ip_address, on_air, cluster_name, site_name
FROM snmp_manager.network_element
WHERE ip_address = '{wireguard_ip}';

-- Check for duplicates by name
SELECT id, name, ip_address, on_air, cluster_name
FROM snmp_manager.network_element
WHERE name LIKE '%{device_identifier}%';
-- Find element by entry_point_id (= network_element.id)
SELECT e.id, e.entry_point_id, e.external_ip, e.internal_ip, e.on_air,
       e.customer_site_id, e.task_processing_agent_override
FROM orchestrator.element e
WHERE e.entry_point_id = {ne_id};

-- Check for duplicate elements by IP
SELECT e.id, e.entry_point_id, e.external_ip, cs.name AS site_name
FROM orchestrator.element e
JOIN orchestrator.customer_site cs ON e.customer_site_id = cs.id
WHERE e.external_ip = '{wireguard_ip}';
SSH to the device (via ProxyJump through rdflb_server):
# Check both containers
docker ps

# RDF Agent logs
docker logs rdfagent

# Modem Manager logs
docker logs airscanmodemmanager
SELECT * FROM snmp_manager.expected_heartbeat WHERE network_element_id = {ne_id};
Symptom: Customer changed the device name in the platform UI. Device may stop working or show stale data.
What happens:
1. The network_element name changes
2. afterUpdate() fires, creating a network_element_change_sync record
3. The sync job matches the orchestrator element by entry_point_id (not name), so the element updates correctly
However, if the pipeline is re-run with a different name_in_platform:
- The existing network_element is NOT updated by the playbook

If duplicates exist:
-- Find duplicate network_elements
SELECT id, name, ip_address, on_air, cluster_name
FROM snmp_manager.network_element
WHERE technology = 'AirScan'
  AND (name LIKE '%{old_name}%'
       OR name LIKE '%{new_name}%'
       OR ip_address = '{wireguard_ip}');

-- Find duplicate orchestrator elements
SELECT e.id, e.entry_point_id, e.external_ip, e.on_air, e.customer_site_id,
       cs.name AS site_name
FROM orchestrator.element e
LEFT JOIN orchestrator.customer_site cs ON e.customer_site_id = cs.id
WHERE e.external_ip = '{wireguard_ip}'
   OR e.entry_point_id IN (
       SELECT id FROM snmp_manager.network_element
       WHERE name LIKE '%{old_name}%' OR name LIKE '%{new_name}%'
   );
The correct element should have:
- entry_point_id matching the snmp_manager.network_element.id for that IP
- customer_site_id matching the correct site
- task_processing_agent_override pointing to the correct agent
Update name_in_platform in the Google Sheet to match, then re-run the pipeline if needed.
| Failure | Cause | Fix |
|---|---|---|
| Vault password error | {envVar}_ansible_vault_pass missing in Jenkins | Add credential in Jenkins |
| Sheet access denied | Service account lacks access | Share sheet with scotty@environment-app-versions.iam.gserviceaccount.com |
| Tab not found | source param doesn't match sheet tab name | Verify tab name matches {customer}/{sheet_name} exactly |
| Missing host groups | hosts.ini lacks rdf-orchestrator, rdf-lb, etc. | Update env-configuration/{envVar}/hosts.ini |
| WireGuard timeout | Peer unreachable or interface down | Check wireguard_server, peer config, firewall |
| Element INSERT fails | Cluster or site doesn't exist in DB | Create cluster/site first, or check cluster_name/site_name in sheet |
| Docker pull fails | Registry auth or image not found | Check errigal_docker_registry credentials and image tag |
| Component | Location |
|---|---|
| SnmpManager | Application logs — search for RDFElementSync entries |
| Orchestrator | Application logs — search for Updating Element with EntryPointId |
| Jenkins | Build console output — includes debug from inventory generation |
| WireGuard | sudo wg show on wireguard_server or journalctl -u wg-quick@{interface} |
| Modem Manager | docker logs on device, or /var/log/airscanmodemmanager_recovery/ for recovery |
| RDF Agent | docker logs on device, or file at LOGGING_FILE_PATH |
Each customer has:
- env-configuration/{customer}/hosts.ini — main infrastructure inventory
- env-configuration/{customer}/group_vars/all/30_all.yml — environment variables
- A Google Sheet tab {customer}/… — dynamic AirScan configuration
The pipeline generates a dynamic inventory at env-configuration/{customer}/{invFile}.yml from the Google Sheet.
| Item | Value |
|---|---|
| hosts.ini | env-configuration/cts/hosts.ini — defines ctsapps1/2, ctslb1, ctsoat1/2, ctsesk1/2 |
| Google Sheet tab | cts/production — dynamic config with EAS-prefixed hostnames |
| OAT servers | ctsoat1 (10.0.87.65), ctsoat2 (10.0.87.115) |
| DB host | cts-master-prod.cl0y2kknu458.us-east-1.rds.amazonaws.com |
| WireGuard | port 51822, network "cts", subnet 10.13.20.0 |
| Orchestrator URL | http://10.13.20.2:8079 |