Switch between modes from the AI Analysis dock. The rest of the pipeline (frame extraction, KLV sync, georeferencing, and VMTI encoding) is identical regardless of mode.
The .onnx model file is loaded from the local filesystem; inference produces a [1, 84, 8400] output tensor → NMS → detections.

# Python (QGIS)
import onnxruntime as ort

session = ort.InferenceSession(
    "yolov11.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# preprocessed_tensor: NCHW float32 frame prepared upstream
outputs = session.run(None, {"images": preprocessed_tensor})
# Apply NMS and confidence filter
detections = postprocess(outputs, conf=0.5, iou=0.45)
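The `postprocess` step is not shown above. A minimal sketch of what it might do, assuming the common YOLO export layout of 4 box values (cx, cy, w, h) plus 80 class scores per candidate in the `[1, 84, 8400]` tensor, with a greedy class-agnostic NMS (the helper name and exact thresholds mirror the call above; the internals are illustrative):

```python
import numpy as np

def postprocess(outputs, conf=0.5, iou=0.45):
    """Decode a YOLO [1, 84, 8400] output and apply confidence filter + NMS.
    Sketch only: assumes rows are [cx, cy, w, h, 80 class scores]."""
    preds = outputs[0][0].T                  # (8400, 84)
    boxes = preds[:, :4]                     # cx, cy, w, h in pixels
    scores = preds[:, 4:]                    # per-class confidences
    cls = scores.argmax(axis=1)
    best = scores.max(axis=1)
    keep = best >= conf                      # confidence filter
    boxes, cls, best = boxes[keep], cls[keep], best[keep]
    # Convert centre format to corner format (x1, y1, x2, y2)
    xyxy = np.empty_like(boxes)
    xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
    xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
    xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
    xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
    # Greedy NMS: keep highest-scoring box, drop overlapping neighbours
    order = best.argsort()[::-1]
    kept = []
    while order.size:
        i = order[0]
        kept.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        xx1 = np.maximum(xyxy[i, 0], xyxy[rest, 0])
        yy1 = np.maximum(xyxy[i, 1], xyxy[rest, 1])
        xx2 = np.minimum(xyxy[i, 2], xyxy[rest, 2])
        yy2 = np.minimum(xyxy[i, 3], xyxy[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (xyxy[i, 2] - xyxy[i, 0]) * (xyxy[i, 3] - xyxy[i, 1])
        area_r = (xyxy[rest, 2] - xyxy[rest, 0]) * (xyxy[rest, 3] - xyxy[rest, 1])
        iou_v = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou_v < iou]
    return [(xyxy[i], int(cls[i]), float(best[i])) for i in kept]
```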
# Python (QGIS): async httpx client
payload = {
    "model": "qwen/qwen3-vl",
    "messages": [{
        "role": "user",
        "content": [{
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"}
        }, {
            "type": "text",
            "text": "Detect objects. Return JSON array: [{bbox:[x,y,w,h], label, confidence}]"
        }]
    }]
}
| Model | Mode | Provider | Best For | Air-Gap | Config |
|---|---|---|---|---|---|
| YOLOv11 (default) | Local | Ships with GeoFMV | Fast object detection, vehicles, personnel, infrastructure | ✓ | Loaded from local .onnx path |
| Custom ONNX model | Local | User-supplied | Domain-specific detection (aircraft, vessels, damage) | ✓ | Any YOLO-compatible ONNX |
| QWEN3-VL | Remote | Alibaba via OpenRouter | High-accuracy object detection, scene analysis, OCR | ✗ | qwen/qwen3-vl |
| Llama-3.2-Vision | Remote | Meta via OpenRouter | General vision, scene captioning, VQA | ✗ | meta-llama/llama-3.2-vision |
| ERNIE 4.5 | Remote | Baidu via OpenRouter | OCR, text spotting on signage and vehicle markings | ✗ | baidu/ernie-4.5 |
| GPT-4.1-mini | Remote | OpenAI direct | Scene summarization, fallback detection | ✗ | openai/gpt-4.1-mini |
When MISB ST 0601 Tags 82–89 (Corner Latitude/Longitude Points 1–4) are present, the georeferencing engine computes a perspective transform (homography matrix) from the four frame pixel corners to the four geographic corners. Any pixel coordinate, including AI bounding box centres, can then be projected to a WGS 84 Lat/Lon with sub-pixel accuracy.
import cv2
import numpy as np

# Compute 3×3 perspective transform matrix
src_pts = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=np.float32)
dst_pts = np.array([nw, ne, se, sw], dtype=np.float32)  # KLV Tags 82-89
M, _ = cv2.findHomography(src_pts, dst_pts)
# Transform bounding box centre pixel → Lat/Lon
geo = cv2.perspectiveTransform(
    np.array([[[px, py]]], dtype=np.float32), M
)[0][0]  # → [longitude, latitude]
When Corner Coordinates are absent, the engine projects a unit look vector from the sensor's 3D position (Tags 13/14/15), rotated by the platform attitude (Tags 5/6/7) and adjusted for the pixel's position relative to the sensor FOV (Tags 16/17). The ray intersects the active DEM layer or the WGS 84 reference ellipsoid, and the intersection point is converted from ECEF to geodetic coordinates.
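A simplified sketch of the ellipsoid branch of this fallback (no DEM), assuming the look vector has already been rotated into ECEF coordinates; the function names and structure are illustrative, not the engine's actual implementation:

```python
import numpy as np

# WGS 84 ellipsoid constants
A = 6378137.0        # semi-major axis (m)
B = 6356752.314245   # semi-minor axis (m)

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert geodetic coordinates (degrees, metres) to ECEF metres."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    e2 = 1.0 - (B / A) ** 2
    n = A / np.sqrt(1.0 - e2 * np.sin(lat) ** 2)   # prime vertical radius
    return np.array([
        (n + h) * np.cos(lat) * np.cos(lon),
        (n + h) * np.cos(lat) * np.sin(lon),
        (n * (1.0 - e2) + h) * np.sin(lat),
    ])

def ray_ellipsoid_intersection(origin, direction):
    """First intersection of a ray with the WGS 84 ellipsoid, in ECEF.
    Solves the quadratic |diag(1/A, 1/A, 1/B) · (o + t·d)|² = 1 for t."""
    scale = np.array([1.0 / A, 1.0 / A, 1.0 / B])
    o, d = origin * scale, direction * scale
    a = d @ d
    b = 2.0 * (o @ d)
    c = o @ o - 1.0
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None                      # ray misses the ellipsoid
    t = (-b - np.sqrt(disc)) / (2.0 * a)
    if t < 0:
        return None                      # intersection behind the sensor
    return origin + t * direction
```

The real engine would then convert the resulting ECEF point back to geodetic Lat/Lon, and would intersect the DEM instead of the ellipsoid when one is loaded.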
With homography (corner coordinates present) and a sensor altitude of 100–5,000 m AGL: <15 m CEP.
Homography residual is stored on each detection for quality assessment.
Residuals >50 m trigger a warning but the detection is not discarded.
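One way such a residual could be computed (a sketch of the idea, not necessarily the engine's method): reproject the frame corners through the homography and measure the mean distance to the surveyed KLV corners, with a constant degrees-to-metres factor as a simplifying assumption; the helper name is hypothetical:

```python
import numpy as np

def homography_residual_m(M, src_px, dst_lonlat, metres_per_degree=111320.0):
    """Mean corner reprojection error of homography M, in metres.
    src_px: (N, 2) pixel corners; dst_lonlat: (N, 2) KLV geographic corners."""
    src_h = np.hstack([src_px, np.ones((len(src_px), 1))])  # homogeneous coords
    proj = (M @ src_h.T).T
    proj = proj[:, :2] / proj[:, 2:3]                       # perspective divide
    err_deg = np.linalg.norm(proj - dst_lonlat, axis=1)
    return float(err_deg.mean() * metres_per_degree)
```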
Extract one frame per configurable interval. Default: 1 fps. Range: 0.1–30 fps. A 10-minute video at 1 fps yields 600 frames submitted for inference.
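The interval-mode arithmetic can be sketched as follows (helper name assumed for illustration):

```python
def frame_indices(duration_s: float, video_fps: float, extract_fps: float = 1.0) -> list:
    """Indices of source frames to extract at a fixed sampling rate.
    A 600 s video at 30 fps sampled at 1 fps yields 600 indices."""
    step = video_fps / extract_fps          # source frames between samples
    count = int(duration_s * extract_fps)   # total frames to extract
    return [round(i * step) for i in range(count)]
```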
Extract a frame whenever the Sensor HFOV changes by more than a configurable threshold (default: 2°) between consecutive KLV packets. Captures scene changes due to zoom or attitude shifts.
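The scene-change trigger can be sketched as a pass over decoded KLV packets; the function name and the packet representation (dicts with an `hfov` key) are assumptions for illustration:

```python
def scene_change_indices(klv_packets, threshold_deg: float = 2.0):
    """Yield indices of KLV packets whose Sensor HFOV (ST 0601 Tag 16)
    differs from the previous packet's by more than the threshold."""
    prev = None
    for i, pkt in enumerate(klv_packets):
        hfov = pkt["hfov"]
        if prev is not None and abs(hfov - prev) > threshold_deg:
            yield i   # zoom or attitude shift: extract this frame
        prev = hfov
```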
Analyst triggers extraction of the current frame on demand, or initiates full-video batch processing from start to finish. Progress indicator shows percentage and ETA.
The OpenRouter API key is stored in the QGIS Authentication Manager
(QgsAuthManager) as an authcfg entry.
It is never written to any config file, log file, or environment variable.
The settings dialog allows entry, update, and deletion without displaying
the key in plaintext.
On ArcGIS Pro, the key is stored in the Windows Credential Manager via
Windows.Security.Credentials.PasswordVault (DPAPI).
The key is retrieved at runtime from the vault; it never touches disk.
When Local Mode is active, zero outbound AI network calls are made.