AI Pipeline

Detect, georeference, review.
Air-gapped or cloud-connected.

GeoFMV's AI pipeline runs entirely on a background thread, extracting frames synchronized with KLV telemetry, running inference in either offline ONNX or remote OpenRouter mode, and transforming pixel-space detections to WGS 84 geographic coordinates.

ONNX Runtime · OpenRouter API · Air-Gap Ready · MISB ST 0903
End-to-End Flow

From decoded frame to georeferenced detection

📹
Step 1
Frame Extraction
Time-based, telemetry-triggered, or manual
🔑
Step 2
KLV Sync
Attach nearest KLV telemetry packet
🧠
Step 3
Inference
ONNX local or OpenRouter cloud
📍
Step 4
Georeferencing
Homography or ray-casting to WGS 84
🗺
Step 5
Map + VMTI
Detection layer + MISB ST 0903 output
Inference Mode

Local ONNX vs Remote OpenRouter

Switch between modes from the AI Analysis dock. The rest of the pipeline — frame extraction, KLV sync, georeferencing, and VMTI encoding — is identical regardless of mode.

🔒

Local Mode — ONNX Runtime

Zero network calls · Air-gap compliant · GPU accelerated
Air-Gap
✔ Load any .onnx model file from the local filesystem
✔ Pre-process: letterbox resize to 640×640, normalize [0,1], HWC→CHW
✔ Post-process output tensor [1, 84, 8400] → NMS → detections
✔ GPU acceleration via DirectML (Windows) or CUDA
✔ Falls back silently to CPU when no GPU is present
✔ Ships with a default YOLOv11 general-object model
✔ Accepts any user-supplied compatible ONNX model
# Python (QGIS)
import onnxruntime as ort

# DirectML (Windows) or CUDA when available; CPU otherwise
session = ort.InferenceSession(
    "yolov11.onnx",
    providers=["DmlExecutionProvider",
               "CUDAExecutionProvider",
               "CPUExecutionProvider"]
)
outputs = session.run(None,
    {"images": preprocessed_tensor}
)
# Apply NMS and confidence filter
detections = postprocess(outputs, conf=0.5, iou=0.45)
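The `preprocessed_tensor` and `postprocess` helpers referenced in the snippet above can be sketched as follows. This is a simplified NumPy illustration of the normalize/HWC→CHW and NMS steps from the checklist (letterbox resizing is omitted, and the real plugin code may differ):

```python
import numpy as np

def preprocess(img):
    """Letterboxed uint8 frame (640x640x3) -> [1, 3, 640, 640] float32."""
    x = img.astype(np.float32) / 255.0   # normalize to [0, 1]
    x = x.transpose(2, 0, 1)             # HWC -> CHW
    return x[None, ...]                  # add batch dimension

def postprocess(output, conf=0.5, iou=0.45):
    """Raw YOLO head [1, 84, 8400] -> (bbox, class_id, score) list after NMS."""
    preds = output[0].T                  # [8400, 84]: cx, cy, w, h + 80 class scores
    boxes = preds[:, :4]
    scores, classes = preds[:, 4:].max(1), preds[:, 4:].argmax(1)
    keep = scores > conf                 # confidence filter
    boxes, scores, classes = boxes[keep], scores[keep], classes[keep]
    # Convert (cx, cy, w, h) to corner form for overlap tests
    x1, y1 = boxes[:, 0] - boxes[:, 2] / 2, boxes[:, 1] - boxes[:, 3] / 2
    x2, y2 = x1 + boxes[:, 2], y1 + boxes[:, 3]
    order, keep_idx = scores.argsort()[::-1], []
    while order.size:                    # greedy NMS, highest score first
        i = order[0]
        keep_idx.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]]); yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]]); yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        union = ((x2[i] - x1[i]) * (y2[i] - y1[i])
                 + (x2[order[1:]] - x1[order[1:]]) * (y2[order[1:]] - y1[order[1:]])
                 - inter)
        order = order[1:][inter / np.maximum(union, 1e-9) <= iou]
    return [(boxes[i], int(classes[i]), float(scores[i])) for i in keep_idx]
```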
🌐

Remote Mode — OpenRouter API

Multimodal vision models · Async POST · Exponential back-off
Cloud
✔ Encode frame as JPEG → Base64
✔ Construct OpenAI Chat Completions JSON payload
✔ Structured detection prompt requesting bbox + label + confidence JSON
✔ Parse model JSON response → pixel-space bounding boxes
✔ 3-attempt retry with 1s/2s/4s exponential back-off
✔ API key retrieved exclusively from platform Credential Manager
# Python (QGIS) — async httpx client
import httpx

payload = {
  "model": "qwen/qwen3-vl",
  "messages": [{
    "role": "user",
    "content": [{
      "type": "image_url",
      "image_url": {"url": f"data:image/jpeg;base64,{b64}"}
    }, {
      "type": "text",
      "text": "Detect objects. Return JSON array: [{bbox:[x,y,w,h], label, confidence}]"
    }]
  }]
}
async with httpx.AsyncClient(timeout=30) as client:
    resp = await client.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
    )
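The 3-attempt retry with 1s/2s/4s back-off and the JSON parsing described in the checklist can be sketched as below. This is a simplified synchronous illustration: `post_fn` is a hypothetical stand-in for the HTTP call, and the response shape follows the OpenAI Chat Completions format the payload above targets.

```python
import json
import time

def request_with_retry(post_fn, payload, attempts=3):
    """POST with 1 s / 2 s / 4 s exponential back-off between attempts."""
    for i in range(attempts):
        try:
            return post_fn(payload)
        except Exception:
            if i == attempts - 1:
                raise               # out of attempts, surface the error
            time.sleep(2 ** i)      # 1, 2, 4 seconds

def parse_detections(response):
    """Chat Completions reply -> list of pixel-space detection dicts."""
    text = response["choices"][0]["message"]["content"]
    # The prompt asks the model for a bare JSON array of
    # {bbox: [x, y, w, h], label, confidence} objects
    return json.loads(text)
```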
Supported Models

Local and cloud inference options

Model | Mode | Provider | Best For | Air-Gap | Config
YOLOv11 (default) | Local | Ships with GeoFMV | Fast object detection: vehicles, personnel, infrastructure | ✔ | Loaded from local .onnx path
Custom ONNX model | Local | User-supplied | Domain-specific detection (aircraft, vessels, damage) | ✔ | Any YOLO-compatible ONNX
QWEN3-VL | Remote | Alibaba via OpenRouter | High-accuracy object detection, scene analysis, OCR | ✗ | qwen/qwen3-vl
Llama-3.2-Vision | Remote | Meta via OpenRouter | General vision, scene captioning, VQA | ✗ | meta-llama/llama-3.2-vision
ERNIE 4.5 | Remote | Baidu via OpenRouter | OCR, text spotting on signage and vehicle markings | ✗ | baidu/ernie-4.5
GPT-4.1-mini | Remote | OpenAI direct | Scene summarization, fallback detection | ✗ | openai/gpt-4.1-mini

Georeferencing Engine

Pixel → WGS 84 in two algorithms

Primary — <15 m CEP

Homography from KLV Corner Coordinates

When MISB ST 0601 Tags 82–89 (Corner Latitude/Longitude Points 1–4) are present, the georeferencing engine computes a perspective transform (homography matrix) from the four frame pixel corners to the four geographic corners. Any pixel coordinate — including AI bounding box centres — can then be projected to a WGS 84 Lat/Lon with sub-pixel precision.

# Compute 3×3 perspective transform matrix
import cv2
import numpy as np

src_pts = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=np.float32)
dst_pts = np.array([nw, ne, se, sw], dtype=np.float32)  # KLV Tags 82-89
M, _ = cv2.findHomography(src_pts, dst_pts)

# Transform bounding box centre pixel → Lat/Lon
geo = cv2.perspectiveTransform(
    np.array([[[px, py]]], dtype=np.float32), M
)[0][0]  # → [longitude, latitude]
Fallback

Ray-Casting from Sensor Pose

When Corner Coordinates are absent, the engine projects a unit look vector from the sensor's 3D position (Tags 13/14/15), rotated by the platform attitude (Tags 5/6/7) and adjusted for the pixel's position relative to the sensor FOV (Tags 16/17). The ray intersects the active DEM layer or the WGS 84 reference ellipsoid, and the intersection point is converted from ECEF to geodetic coordinates.

Inputs: Sensor Lat/Lon/Alt · Platform H/P/R · Sensor HFOV/VFOV · Slant Range
DEM intersection (if active) → terrain-corrected ground position
Flat-earth fallback if DEM intersection fails
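The flat-earth fallback can be sketched as below. This is a deliberately simplified illustration, not the engine's actual maths: it handles only the centre-pixel look vector (roll and per-pixel FOV offsets omitted), intersects a level ground plane rather than a DEM or the ellipsoid, and uses a spherical-earth metres-to-degrees conversion.

```python
import math

def flat_earth_raycast(sensor_lat, sensor_lon, sensor_alt_agl,
                       heading_deg, pitch_deg):
    """Intersect the sensor's centre-pixel look vector with a level ground
    plane sensor_alt_agl metres below. Pitch is negative when looking down."""
    # Look vector in local ENU (east, north, up) coordinates
    h, p = math.radians(heading_deg), math.radians(pitch_deg)
    east = math.cos(p) * math.sin(h)
    north = math.cos(p) * math.cos(h)
    up = math.sin(p)                      # negative when looking down
    if up >= 0:
        return None                       # ray never reaches the ground
    t = sensor_alt_agl / -up              # scale factor to the ground plane
    de, dn = east * t, north * t          # ground offset in metres
    # Convert metre offsets to degrees (spherical-earth approximation)
    lat = sensor_lat + dn / 111_320.0
    lon = sensor_lon + de / (111_320.0 * math.cos(math.radians(sensor_lat)))
    return lat, lon
```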

Accuracy Note

With homography (corner coordinates present) and sensor altitude of 100–5,000 m AGL: <15 m CEP.
Homography residual is stored on each detection for quality assessment. Residuals >50 m trigger a warning but the detection is not discarded.
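One way such a residual could be computed is sketched below: reproject the four pixel corners through the fitted matrix and take the worst disagreement with the KLV corner coordinates, converted from degrees to approximate metres. This is a hypothetical quality metric, not necessarily the one GeoFMV stores.

```python
import numpy as np

def homography_residual_m(M, src_px, dst_lonlat):
    """Worst corner reprojection error of homography M, in metres.

    src_px: 4x2 pixel corners; dst_lonlat: 4x2 (lon, lat) KLV corners.
    Uses an approximate degrees-to-metres conversion at each corner."""
    pts = np.hstack([src_px, np.ones((4, 1))])          # homogeneous pixels
    proj = (M @ pts.T).T
    proj = proj[:, :2] / proj[:, 2:3]                   # perspective divide
    d = proj - dst_lonlat                               # residual in degrees
    lat = np.radians(dst_lonlat[:, 1])
    dx = d[:, 0] * 111_320.0 * np.cos(lat)              # lon degrees -> metres
    dy = d[:, 1] * 111_320.0                            # lat degrees -> metres
    return float(np.max(np.hypot(dx, dy)))
```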


Frame Extraction

Three extraction strategies

⏱

Time-Based

Extract one frame per configurable interval. Default: 1 fps. Range: 0.1–30 fps. For a 10-minute video at 1 fps → 600 frames submitted for inference.

📡

Telemetry-Triggered

Extract a frame whenever the Sensor HFOV changes by more than a configurable threshold (default: 2°) between consecutive KLV packets. Captures scene changes due to zoom or attitude shifts.

🎯

Manual / Batch

Analyst triggers extraction of the current frame on demand, or initiates full-video batch processing from start to finish. Progress indicator shows percentage and ETA.
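The telemetry-triggered strategy described above reduces to a simple threshold check over consecutive KLV packets. A minimal sketch, assuming each packet is a dict with an `"hfov"` field (a hypothetical shape, not the plugin's actual packet type):

```python
def telemetry_trigger_indices(klv_packets, hfov_threshold_deg=2.0):
    """Indices of KLV packets whose Sensor HFOV differs from the previous
    packet by more than the threshold (default 2 degrees)."""
    hits = []
    for i in range(1, len(klv_packets)):
        prev = klv_packets[i - 1]["hfov"]
        curr = klv_packets[i]["hfov"]
        if abs(curr - prev) > hfov_threshold_deg:
            hits.append(i)          # extract the frame nearest this packet
    return hits
```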


Security (Requirement 20)

API key handling

QGIS β€” Auth Manager

The OpenRouter API key is stored in the QGIS Authentication Manager (QgsAuthManager) as an authcfg entry. It is never written to any config file, log file, or environment variable. The settings dialog allows entry, update, and deletion without displaying the key in plaintext.

ArcGIS Pro β€” Windows Credential Manager

On ArcGIS Pro, the key is stored in the Windows Credential Manager via Windows.Security.Credentials.PasswordVault (DPAPI). The key is retrieved at runtime from the vault — it never touches disk. When Local Mode is active, zero outbound AI network calls are made.