Geospatial Vision-Language Models for Earth Observation and Remote Sensing Analysis
GeoVLMs is our research initiative to develop vision-language models for geospatial and remote sensing applications. The project explores approaches that let AI systems understand, reason about, and interact with Earth observation data through natural language interfaces and multimodal understanding.
Our work connects computer vision, natural language processing, and geospatial intelligence to build more intuitive tools for analyzing satellite imagery, understanding environmental change, and supporting decision-making in Earth observation tasks.
EarthDial turns multi-sensory Earth observations into interactive dialogues, letting users query, analyze, and understand satellite imagery and other remote sensing data through a conversational, natural language interface.
GeoVLM-R1 is a reinforcement learning framework that strengthens the reasoning capabilities of vision-language models on Earth observation tasks. It is designed for flexibility, scalability, and ease of experimentation across diverse remote sensing scenarios.