Autonomy Research · USRCAP Summer 2025 · LION Lab · University of Toledo

AeroSynapse:
Graph-Based RL for UAV Autonomy

An edge-first UAV autonomy framework combining graph-based reinforcement learning, meta-learning, PINN-based sensor fusion, and human-in-the-loop control for navigation in GPS-denied, map-free environments.

73%

Zero-shot navigation success

89%

Success after 5 adaptation episodes

<20ms

Control-loop latency (HIL)

<500ms

Voice-to-action latency

1000+

Randomized simulated environments

95%

Estimated operational cost reduction vs. cloud alternatives


Overview

AeroSynapse investigates a hard autonomy problem: how can a UAV operate in a new environment without GPS, pre-mapping, or cloud inference? The project treats UAV autonomy as a layered system rather than a single model — perception, graph construction, control, learning, safety, and human supervision are separated into coordinated modules.

The core idea is to represent the UAV's local environment as a dynamic graph. Obstacles, targets, waypoints, free-space regions, and semantic objects become nodes; spatial and risk relationships become edges. A graph-based RL policy then reasons over this structure to make navigation decisions.
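As a rough illustration of this representation, the sketch below builds a small scene graph in plain Python. The `Node` and `SceneGraph` names, the connection radius, and the linear risk formula are all assumptions made for the example; they are not details taken from AeroSynapse.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str        # e.g. "obstacle", "goal", "waypoint", "free_space"
    position: tuple  # (x, y, z) in metres, UAV-local frame


@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)  # (id_a, id_b) -> attributes

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def rebuild_edges(self, radius: float = 5.0, safe_distance: float = 2.0) -> None:
        """Reconnect all node pairs that lie within `radius` of each other."""
        self.edges.clear()
        ids = sorted(self.nodes)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                na, nb = self.nodes[a], self.nodes[b]
                distance = math.dist(na.position, nb.position)
                if distance > radius:
                    continue
                # Assumed risk model: only edges touching an obstacle carry
                # risk, and it grows linearly as the distance shrinks.
                near_obstacle = "obstacle" in (na.kind, nb.kind)
                risk = max(0.0, 1.0 - distance / safe_distance) if near_obstacle else 0.0
                self.edges[(a, b)] = {"distance": distance, "risk": risk}


graph = SceneGraph()
graph.add_node(Node("wp", "waypoint", (0.0, 0.0, 0.0)))
graph.add_node(Node("obs", "obstacle", (1.0, 0.0, 0.0)))
graph.add_node(Node("goal", "goal", (4.0, 0.0, 0.0)))
graph.rebuild_edges()

riskiest = max(graph.edges, key=lambda e: graph.edges[e]["risk"])
print(riskiest, graph.edges[riskiest])
```

Calling `rebuild_edges()` again after each perception update would keep the graph dynamic; a real pipeline would likely update edges incrementally instead of rebuilding all pairs.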

The system prioritizes onboard edge compute over cloud dependency, reducing latency and preserving autonomy in degraded or denied communication environments.

Project: AeroSynapse (Graph-Based RL for UAV Autonomy)

Program: USRCAP Summer 2025 · $3,000 Fellowship

Institution: LION Lab, University of Toledo

Mentor: Dr. Liang Cheng

Role: AI & Autonomy Architecture Lead

Hardware: Jetson Orin NX · Cube Orange+ · OAK-D Pro

Research Question

“Can a UAV enter an unfamiliar environment and make useful navigation decisions without being pre-trained on that exact space?”

Problem

Most autonomous UAV systems are fragile outside controlled environments.

They depend on GPS, pre-built maps, stable internet, or carefully controlled conditions. In disaster response, indoor inspection, warehouse automation, and GPS-denied defense environments, these assumptions break down.

GPS / GNSS Dependency

Conventional systems lose localization immediately when GPS is unavailable — a common condition indoors, underground, or in contested environments.

Pre-Built Map Requirement

Many map-based navigation stacks, including SLAM deployments that localize against a saved map, depend on a prior mapping run. In emergency scenarios or degraded environments, no prior map is available.

Cloud Inference Latency

Cloud-dependent AI adds hundreds of milliseconds of latency and becomes unavailable when communication is denied.

Environment-Specific Training

Models trained on one environment fail to generalize. Re-training for each new deployment is impractical at operational scale.

Sim-to-Reality Gap

Deep RL policies can achieve near-perfect simulation scores while failing entirely when facing real-world sensor noise and dynamics.

No Safe Failure Mode

When constraints are violated, most autonomy stacks have no principled fallback — leading to unpredictable or catastrophic failure.

Design Principle

Autonomy should not collapse when the environment is unknown. AeroSynapse was designed around this exact failure case.

Architecture

A layered autonomy stack of coordinated modules, with no single monolithic model.

Rather than one large model, AeroSynapse decomposes autonomy into coordinated layers: sensing, state estimation, graph representation, decision-making, adaptation, safety, and human interaction.

01

Sensor Input

Stereo / Depth Camera · LiDAR · IMU · Environmental Sensors · Voice Input (optional)

02

Perception & State Estimation

Visual Perception · Obstacle Detection · PINN-Based State Estimation · Physically Consistent Prediction

03

Dynamic Graph Construction

Obstacle Nodes · Free-Space Regions · Waypoint / Goal Nodes · Semantic Objects · Risk & Visibility Edges · Real-Time Updates

04

Graph-Based Reinforcement Learning

GNN Policy (Graph Attention) · Q-Prop Actor-Critic · Reward: Safety, Progress, Stability, Mission · Relational Spatial Reasoning

05

Meta-Learning & Adaptation

MAML-Style Training · 1000+ Randomized Envs · Zero-Shot Deployment · Few-Shot Adaptation (5 episodes)

06

Safety & Runtime Assurance

Runtime State Monitor · Constraint Checking · Control-Barrier Boundaries · Baseline Controller Fallback · Human Override

07

Human-in-the-Loop Interface

Direct Control Mode · Supervised Autonomy · Natural-Language Commands · Intervention & Recovery

08

Edge Deployment

Jetson Orin NX (onboard) · ArduPilot / Cube Orange+ · ROS 2 Modular Coordination · Local Inference (no cloud)

Validation method: hardware-in-the-loop simulation, with edge-deployment assumptions based on Jetson Orin NX-class compute. A progressive testing protocol with domain randomization is used to evaluate sim-to-reality transfer.

Technical Stack

Core Autonomy

Graph-Based RL · Graph Neural Networks · Graph Attention Networks · Q-Prop Actor-Critic · MAML Meta-Learning · Physics-Informed Neural Networks · Model Predictive Control · Control Barrier Functions · Runtime Assurance

Perception & AI

Stereo / Depth Vision · LiDAR Obstacle Detection · Vision-Language Models · SAM-Style Segmentation · Local Language-Action Model · Voice Command Pipeline · SHAP / LIME Explainability

Robotics & Hardware

ROS 2 · ArduPilot · MAVLink / MAVROS · Jetson Orin NX · CubePilot Cube Orange+ · OAK-D Pro FF · SLAMTEC C1 LiDAR · AirSim (simulation)

Research Methods

Domain Randomization · Sim-to-Reality Transfer · Zero-Shot Evaluation · Few-Shot Evaluation · Hardware-in-the-Loop Testing · Safety & Failure-Mode Analysis · Progressive Testing Protocol

Key Results

Validation outcomes from hardware-in-the-loop testing.

Results support the core research direction: graph-structured learning and meta-learning produce UAV autonomy that is more adaptable than systems trained for one fixed environment.

73%

Zero-shot navigation success in unseen environments

89%

Navigation success after five adaptation episodes

<20ms

Control-loop latency in hardware-in-the-loop validation

<500ms

Voice-to-action interaction latency

1000+

Randomized environments used for meta-learning training

13%

Simulation-to-reality transfer gap in reported validation

Safety

100%

Collision-free testing across all evaluated safety scenarios in the reported validation.

Infrastructure Cost

95% ↓

Estimated operational cost reduction by replacing cloud-dependent inference with edge-first architecture.

Key Insights

Why Graph-Based RL

Navigation is a relational problem. A flat perception model sees objects, whereas a graph captures the relationships between obstacles, goals, risk, and motion constraints. The questions a UAV must answer are relational: Which gap is safest? Which path preserves battery? Which region should be avoided? A graph makes these relationships explicit, which gives the graph-based RL policy a stronger representation for decisions in cluttered and unfamiliar environments.
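To make the relational idea concrete, here is a minimal sketch of softmax attention over one node's neighbours, the aggregation step at the heart of attention-based graph networks. It is a simplification: a full graph attention layer also learns projection and scoring weights, which are omitted here. The two-dimensional features and the node names are hypothetical.

```python
import math


def attend(query, neighbor_feats):
    """Weight each neighbour by softmax(dot(query, feature)) and average."""
    scores = [sum(q * k for q, k in zip(query, feat)) for feat in neighbor_feats]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    aggregated = [
        sum(w * feat[i] for w, feat in zip(weights, neighbor_feats))
        for i in range(len(query))
    ]
    return weights, aggregated


# Hypothetical features: [progress toward goal, clearance], both in [0, 1].
uav = [1.0, 1.0]
neighbors = {"wide_gap": [0.8, 0.9], "narrow_gap": [0.9, 0.2]}

weights, summary = attend(uav, list(neighbors.values()))
for name, weight in zip(neighbors, weights):
    print(f"{name}: attention weight {weight:.2f}")
```

In this toy setup the wide gap receives the larger weight because it scores well on both progress and clearance, which is the kind of trade-off a learned policy would be expected to capture from data.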

Generalization over Peak Performance

A policy that scores 99% in one simulation environment but fails immediately in any other is not useful for real deployment. Meta-learning shifts the objective from memorization to adaptation.
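The shift from memorization to adaptation can be shown with a toy MAML-style loop. This is a sketch under strong assumptions, not the project's training code: the "policy" is a single scalar `w`, each task is a target value, and the loss is a simple squared error, so the gradient through the inner step can be written by hand.

```python
def loss(w, target):
    return (w - target) ** 2


def grad(w, target):
    return 2.0 * (w - target)


def adapt(w, target, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one task."""
    for _ in range(steps):
        w = w - inner_lr * grad(w, target)
    return w


def maml_train(targets, inner_lr=0.1, outer_lr=0.1, iterations=200):
    """Outer loop: minimise the loss measured after one inner step."""
    w = 0.0
    for _ in range(iterations):
        meta_grad = 0.0
        for target in targets:
            w_adapted = adapt(w, target, inner_lr)
            # Chain rule through the inner step for this quadratic loss:
            # d(w_adapted)/dw = 1 - 2 * inner_lr
            meta_grad += grad(w_adapted, target) * (1.0 - 2.0 * inner_lr)
        w -= outer_lr * meta_grad / len(targets)
    return w


train_tasks = [1.0, 2.0, 3.0]
w_meta = maml_train(train_tasks)

unseen_task = 2.5
w_adapted = adapt(w_meta, unseen_task, steps=5)
print("loss before adaptation:", loss(w_meta, unseen_task))
print("loss after 5 steps:   ", loss(w_adapted, unseen_task))
```

The meta-trained initialization is not the best answer for any single task; it is a starting point from which a handful of gradient steps reaches a good answer on a task it has not seen. The five steps here are only loosely analogous to adaptation episodes on a UAV.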

Edge Deployment Changes Every Design Decision

Inference time, model size, power consumption, and memory constraints are hard limits on a UAV. Every AI component must be designed with those constraints from the beginning.

Safety Must Be Designed Around the Policy

A safety layer added after training is already too late. Runtime assurance, control barriers, and fallback controllers must be designed as first-class components of the autonomy stack.
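One way to picture a control-barrier boundary is a one-dimensional safety filter placed between the policy and the actuators. The sketch below assumes a single obstacle straight ahead, a commanded closing speed `v_cmd`, and the barrier function h = distance - d_min; the parameter values are illustrative and not taken from the project.

```python
def cbf_filter(v_cmd, distance, d_min=1.0, alpha=1.0):
    """Clamp the closing speed so that dh/dt + alpha * h >= 0.

    With h = distance - d_min and dh/dt = -v, the condition
    becomes v <= alpha * (distance - d_min).
    """
    h = distance - d_min
    return min(v_cmd, alpha * h)


def simulate(distance=5.0, v_cmd=2.0, dt=0.1, steps=200):
    """Fly toward the obstacle with a policy that never slows down."""
    closest = distance
    for _ in range(steps):
        v_safe = cbf_filter(v_cmd, distance)
        distance -= v_safe * dt
        closest = min(closest, distance)
    return closest


print("filtered command far away:", cbf_filter(2.0, 10.0))
print("filtered command up close:", cbf_filter(2.0, 1.5))
print("closest approach in simulation:", simulate())
```

Even though the commanded speed never drops, the filtered speed shrinks as the margin shrinks, so the simulated vehicle approaches `d_min` without crossing it. In this discrete-time toy the guarantee holds as long as `alpha * dt` is at most 1; a real system would also need to account for actuator limits and sensing delay.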

Graph Representations Are Interpretable

The most valuable thing about the graph representation isn't performance — it's interpretability. An operator can understand a graph of obstacles and goals in a way they cannot understand an opaque neural network state.

Human Oversight Is a Control Layer, Not a Weakness

Full autonomy is not always the right answer. The human-in-the-loop modes exist because some decisions require human judgment, and a system that removes that option is more dangerous, not more capable.

Credible Research Requires Measurable Claims

The most important discipline in this project was separating what the architecture is designed to achieve from what has actually been validated. Strong framing with honest limitations is what makes research credible.

Human-in-the-Loop Design

The goal is not to remove the human — it's to put the human at the right level of control.

AeroSynapse uses a tiered human-in-the-loop model. Each mode reflects a different operating condition and trust level in the autonomous system.

Direct Control

The human operator controls the UAV manually. The autonomy stack is passive. Used during high-risk phases or when the environment is completely novel.

Supervised Autonomy

The UAV proposes or executes navigation actions while the human monitors. The operator can intervene at any point without taking full manual control.

Intervention Mode

The system detects uncertainty, safety constraint proximity, or mission ambiguity and requests human input before continuing autonomous operation.

Natural-Language Command Mode

The operator gives high-level mission instructions. The voice-to-action pipeline translates these into safe flight objectives within the current graph representation.
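The tiering above can be sketched as a small arbitration function. The signal names (`policy_uncertainty`, `constraint_margin_m`) and both thresholds are hypothetical, and natural-language commands are left out because they act as an input channel to the autonomy stack more than as a separate trust level.

```python
from enum import Enum


class Mode(Enum):
    DIRECT_CONTROL = "direct_control"
    SUPERVISED_AUTONOMY = "supervised_autonomy"
    INTERVENTION = "intervention"


def select_mode(operator_override, policy_uncertainty, constraint_margin_m,
                uncertainty_limit=0.5, margin_limit_m=1.0):
    """Pick the operating mode; the human override always wins."""
    if operator_override:
        return Mode.DIRECT_CONTROL
    # Assumed triggers: high policy uncertainty or a thin safety margin
    # pauses autonomy and asks the operator for input.
    if policy_uncertainty > uncertainty_limit or constraint_margin_m < margin_limit_m:
        return Mode.INTERVENTION
    return Mode.SUPERVISED_AUTONOMY


print(select_mode(False, policy_uncertainty=0.1, constraint_margin_m=3.0))
print(select_mode(False, policy_uncertainty=0.8, constraint_margin_m=3.0))
print(select_mode(True, policy_uncertainty=0.1, constraint_margin_m=3.0))
```

Checking the override first keeps the human at the top of the priority order, which matches the framing of oversight as a control layer.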

Safety Mechanisms

Runtime State Monitoring · Collision-Distance Constraints · Altitude Limits · Battery Safety Bounds · Emergency Fallback · Human Override · Baseline Controller Takeover · Decision Logging

Future Work

Turning the research architecture into a stronger experimental platform.

01

Real-Time Graph Pipeline

Implement a minimal real-time graph-construction pipeline on the Jetson Orin NX and measure latency against the <20ms control-loop target.
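A latency measurement of this kind could start from a harness like the one below. It is a sketch: `build_graph_stub` is a hypothetical stand-in for the real graph-construction step, and the nearest-rank percentile and the choice of p50/p99 are assumptions about what would be reported.

```python
import math
import time


def measure(fn, runs=100):
    """Return per-call wall-clock latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples, budget_ms=20.0):
    return {
        "p50_ms": percentile(samples, 50),
        "p99_ms": percentile(samples, 99),
        "max_ms": max(samples),
        "within_budget": sum(s < budget_ms for s in samples) / len(samples),
    }


def build_graph_stub():
    # Hypothetical placeholder workload, not the real pipeline.
    return sorted(range(1000), reverse=True)


report = summarize(measure(build_graph_stub))
print(report)
```

Reporting tail latency (p99 and max) alongside the median matters for a control loop, because a single slow iteration can violate the budget even when the average looks comfortable.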

02

GNN Policy Benchmark

Train and benchmark a small GNN policy against PPO/SAC baselines on the same randomized environment suite to quantify the graph representation advantage.

03

Indoor Test Course

Build a simplified indoor test course with repeatable obstacle layouts to ground the simulation metrics in real flight data.

04

Sim-to-Real Gap Analysis

Collect real flight logs and compare them against simulation performance, refining the currently reported 13% sim-to-reality gap with an empirical estimate from flight data.

05

Safety-Gated ArduPilot Integration

Integrate safety-gated autonomy with ArduPilot flight modes so the constraint layer can trigger failsafe behaviors through the standard flight controller interface.

06

Multi-Agent Extension

Extend the single-UAV graph-based RL framework to cooperative multi-agent UAV navigation for search, coverage, and formation tasks.

AeroSynapse

Edge AI for the environments that matter most.

Graph learning, meta-learning, safety assurance, and human-AI teaming — composed into a practical autonomy stack for GPS-denied, map-free UAV navigation.
