Overview
Camera-only localization of small floating buoys using deep reinforcement learning and YOLOv11.
This project presents a simulation-to-reality pipeline for training an autonomous surface vehicle (ASV) to visually locate small monitoring buoys under realistic marine conditions.
The approach combines synthetic data generation in Unity HDRP, YOLOv11 detection, and PPO-based reinforcement learning with a curiosity bonus and LSTM memory to learn efficient search behavior.
Results
Key metrics
Detection quality
YOLOv11, trained on 2,000 synthetic images, reached 99.5% mAP@50 and 99.7% recall.
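For readers unfamiliar with the metric: mAP@50 scores a detection as correct when its bounding box overlaps the ground-truth box with Intersection-over-Union of at least 0.5. A minimal sketch of that IoU computation (box format `(x1, y1, x2, y2)` is an assumption; it is not taken from this project's code):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# The "@50" threshold: this pair overlaps with IoU = 1/3, so it would
# NOT count as a true positive at the 0.5 cutoff.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.333...
```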
Search performance
The learned search policy achieved a success rate comparable to a tuned Archimedean spiral baseline while detecting the buoy 22.7% faster.
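An Archimedean spiral baseline covers the search area with evenly spaced passes by growing the radius linearly with angle, r = (pitch / 2π)·θ. A sketch of waypoint generation under that rule (the pitch, step, and radius values are illustrative assumptions, not the tuned parameters used in the evaluation):

```python
import math

def spiral_waypoints(pitch=5.0, step=2.0, max_radius=50.0):
    """Waypoints on an Archimedean spiral r = (pitch / 2*pi) * theta,
    spaced roughly `step` metres apart along the arc."""
    b = pitch / (2 * math.pi)   # radial growth per radian
    points, theta = [], 0.0
    while b * theta <= max_radius:
        r = b * theta
        points.append((r * math.cos(theta), r * math.sin(theta)))
        # arc-length increment ds ~ r * dtheta; guard against r near 0
        theta += step / max(r, step)
    return points

pts = spiral_waypoints()  # outward path from the last-known buoy position
```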
Engineering contribution
Validated automatic synthetic dataset generation, exploration-driven reward shaping, and systematic learned-vs-classical evaluation.
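The exploration-driven reward term itself is not spelled out here; as a minimal stand-in, a count-based novelty bonus captures the same idea of paying the agent for visiting unfamiliar regions. The grid cell size and bonus scale below are hypothetical, and this replaces, rather than reproduces, the project's curiosity module:

```python
import math
from collections import defaultdict

class NoveltyBonus:
    """Count-based stand-in for a curiosity reward: discretise the ASV
    position onto a grid and pay scale / sqrt(visits) per visited cell.
    Cell size and scale are illustrative, not the project's values."""
    def __init__(self, cell=5.0, scale=0.1):
        self.cell = cell
        self.scale = scale
        self.visits = defaultdict(int)

    def __call__(self, x, y):
        key = (int(x // self.cell), int(y // self.cell))
        self.visits[key] += 1
        return self.scale / math.sqrt(self.visits[key])

bonus = NoveltyBonus()
print(bonus(0.0, 0.0))  # first visit to a cell: full bonus 0.1
print(bonus(1.0, 1.0))  # same 5 m cell: bonus decays to 0.1 / sqrt(2)
```

Added to the sparse detection reward, a decaying bonus like this discourages revisiting already-searched water, which is the behavior the spiral baseline achieves by construction.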
Scope
What this framework enables
Autonomous marine monitoring
Supports fast localization of floating monitoring equipment when GPS cues are coarse or intermittent.
Reusable methodology
The same pipeline can be adapted to related search scenarios with different targets and sensing constraints.