Active: 2025 - Present
Summary
Following up on work with NASA and IBM on the Prithvi Geospatial Foundation Model, this project aims to explore what geospatial foundation models learn through interactive exploration of the embedding space.
Abstract
Geospatial Foundation Models (GFMs) have proliferated in the few years since their introduction, leveraging petabyte-scale Earth Observation (EO) datasets to encode complex spectral, temporal, and spatial relationships. On existing benchmarks, GFMs approach or surpass the state-of-the-art performance of traditional, purpose-built models. However, assessing pretrained models on existing performance benchmarks requires fine-tuning a purpose-built decoder and head to extract predictions from the latent representational space, decomposing high-information embeddings and increasing computational demands. Additionally, the ground reference data (labels) in benchmark datasets can be noisy and prone to human error, and can oversimplify real-world complexities such as uncertainty or intra-class differentiation. A new type of benchmark dataset is needed to explore, explain, and compare models’ latent spaces and to assess how embeddings can be used for lightweight model deployment. This study introduces Geospatial Exploration of Latent Observation Space (GELOS), a multi-modal, multi-temporal benchmark dataset for GFMs. The dataset comprises multiple observations per year of multi-sensor multi-spectral imagery, Synthetic Aperture Radar (SAR), and Digital Elevation Model (DEM) data, formatted for maximum interoperability with existing GFMs. Representative samples are chosen for typical downstream tasks such as land cover classification and change detection. Rather than serving as targets for model outputs, as in traditional benchmark datasets, these categories are used to explore how GFMs encode and cluster samples in the latent space. Using this approach, model performance on downstream tasks can be predicted, explained, and compared, expanding our understanding of which patterns in EO data GFMs are able to capture.
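One way to make the idea of exploring how samples cluster in the latent space concrete is a class separability score. The following is a minimal sketch, not the GELOS implementation: it computes the mean silhouette coefficient over labeled embeddings. The function name `mean_silhouette`, the toy 2-D vectors, and the class names are assumptions for illustration; real GFM embeddings would have far more dimensions.

```python
import math
from collections import defaultdict


def mean_silhouette(embeddings, labels):
    """Mean silhouette coefficient over all samples (range -1 to 1).

    Values near 1 mean each sample sits much closer to its own class
    than to the nearest other class; values below 0 mean classes overlap.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    if len(by_class) < 2:
        raise ValueError("need at least two classes")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in by_class[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for single-sample classes
            continue
        # a: mean distance to the other members of the sample's own class
        a = sum(math.dist(embeddings[i], embeddings[j]) for j in own) / len(own)
        # b: mean distance to the closest other class
        b = min(
            sum(math.dist(embeddings[i], embeddings[j]) for j in members) / len(members)
            for other, members in by_class.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy embeddings (hypothetical): two tight groups far apart.
toy_embeddings = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
separated = mean_silhouette(toy_embeddings, ["forest", "forest", "water", "water"])
mixed = mean_silhouette(toy_embeddings, ["forest", "water", "forest", "water"])
print(f"labels aligned with clusters: {separated:.2f}")
print(f"labels cutting across clusters: {mixed:.2f}")
```

When the labels match the geometric clusters the score is high, and when the labels cut across the clusters it turns negative. The same contrast could indicate whether a model's embeddings separate land cover classes without any fine-tuning.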
Goals
- Generate a benchmark multi-sensor, multitemporal dataset of pure land cover scenes to explore class separability and inter-class differences between models.
- Create a reusable workflow for transforming, analyzing, and visualizing embedding outputs of a range of geospatial foundation models.
- Develop measures for embedding comparison across the latent space, enabling comparative analysis of distinct models and inputs.
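As an illustration of what a measure for embedding comparison could look like, the sketch below compares two models by the overlap of each sample's k nearest neighbors. This is one possible measure chosen for illustration, not necessarily one used in GELOS. It works when the two models produce embeddings of different dimensionality, since only neighbor identities are compared. The function names and toy vectors are hypothetical.

```python
import math


def knn_indices(embeddings, k):
    """For each sample, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, anchor in enumerate(embeddings):
        dists = sorted(
            (math.dist(anchor, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_overlap(emb_a, emb_b, k=2):
    """Mean Jaccard overlap of k-NN sets for the same samples in two spaces.

    Returns 1.0 when both models agree on every sample's neighbors and
    approaches 0.0 when they organize the samples differently.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("both models must embed the same samples")
    if not 1 <= k < len(emb_a):
        raise ValueError("k must be between 1 and n_samples - 1")
    nn_a = knn_indices(emb_a, k)
    nn_b = knn_indices(emb_b, k)
    jaccards = [len(a & b) / len(a | b) for a, b in zip(nn_a, nn_b)]
    return sum(jaccards) / len(jaccards)


# Hypothetical embeddings of the same five samples from three models.
model_a = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [7.0, 0.0], [20.0, 0.0]]
# model_b: same layout, rescaled and lifted into 3-D.
model_b = [[2.0 * x, 2.0 * y, 3.0] for x, y in model_a]
# model_c: same points assigned to the samples in reverse order.
model_c = list(reversed(model_a))

same_structure = neighborhood_overlap(model_a, model_b, k=2)
different_structure = neighborhood_overlap(model_a, model_c, k=2)
print(same_structure, different_structure)
```

Because the measure depends only on neighbor rankings, it is unaffected by rescaling or by adding constant dimensions, which makes it a reasonable first check before heavier comparisons.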
Outcomes
- Created a benchmark dataset for embedding comparison, distinguished by land cover class.
- Created reusable GELOS package, applicable to custom datasets and models.
Code Repositories
This repository contains the core analysis pipeline for GELOS, which can be applied to multiple datasets and models. It is config-driven for repeatable experiments that run embedding generation → transformation → analysis → modeling → visualization.
Repository for generating the benchmark multi-sensor, multitemporal dataset of pure land cover classes. Config-driven with notebooks for demonstration and debugging. Leverages pystac to query STAC catalogues and filter for chips with all required data.
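The step of filtering for chips with all required data can be sketched in plain Python. The STAC query itself is not shown. This sketch assumes the search results have already been summarized into a mapping from chip ID to the acquisition dates found per modality. The modality names, the minimum observation counts, and the function names are all hypothetical and are not taken from the repository.

```python
# Hypothetical requirement: minimum number of observations per modality.
# A DEM is static, so a single layer is assumed to be enough.
REQUIRED_COUNTS = {"multispectral": 4, "sar": 4, "dem": 1}


def is_complete(observations, required=REQUIRED_COUNTS):
    """True if every required modality has at least the required observations."""
    return all(
        len(observations.get(modality, [])) >= minimum
        for modality, minimum in required.items()
    )


def filter_complete_chips(chips, required=REQUIRED_COUNTS):
    """Return the IDs of chips that have all required data, in input order."""
    return [
        chip_id
        for chip_id, observations in chips.items()
        if is_complete(observations, required)
    ]


# Toy summary of search results (hypothetical chip IDs and dates).
quarterly = ["2023-01-15", "2023-04-15", "2023-07-15", "2023-10-15"]
chips = {
    "chip_001": {"multispectral": quarterly, "sar": quarterly, "dem": ["static"]},
    "chip_002": {"multispectral": quarterly, "sar": quarterly[:2], "dem": ["static"]},
    "chip_003": {"multispectral": quarterly, "sar": quarterly},  # DEM missing
}
complete = filter_complete_chips(chips)
print(complete)
```

Keeping the requirement in a single dictionary mirrors the config-driven approach: changing which sensors or how many dates are needed does not require touching the filtering logic.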
Example application of the core GELOS analysis package to the benchmark land cover dataset. Generates embeddings for multiple models, transforms, plots, and uses embeddings for simple downstream classification.
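A simple downstream classifier on frozen embeddings can be as small as a nearest-centroid rule. The sketch below is an assumption about what "simple downstream classification" might look like, not the classifier used in the repository. The function names, class names, and 2-D toy vectors are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per class."""
    sums = {}
    counts = defaultdict(int)
    for vector, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vector):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Toy training embeddings (hypothetical), already produced by some model.
train_embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
train_labels = ["forest", "forest", "water", "water"]

centroids = fit_centroids(train_embeddings, train_labels)
predictions = [predict(centroids, v) for v in ([1.0, 1.0], [9.0, 0.0])]
print(predictions)
```

No decoder or head is trained here, so the accuracy of such a rule depends entirely on how well the embeddings already separate the classes.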
Online Stories
Sprinting to space: Goddard, NASA, and Clark’s pathbreaking work in geospatial analytics
Presented on Geospatial Exploration of Latent Observation Space (GELOS) at the Cloud Native Geospatial sprint and collaborated on defining conventions for geospatial embeddings: https://geoembeddings.org/
Shaping the Future of Earth Science: Clark CGA at AGU 2025
Presenting Geospatial Exploration of Latent Observation Space (GELOS) at AGU 2025.
Conference Presentations
- Godwin, D., Khallagi, S., Balogun, R., Yao, Y., Roy, S., Ramachandran, R., Alemohammad, H. (2025, December). GELOS: A Benchmark Dataset for Geospatial Exploration of Latent Observation Space [Poster presentation]. AGU Annual Meeting 2025, New Orleans, LA, USA.