GSoC 2026 Proposal: Spatio-Temporal ML Bridge for Multi-Sensor Fusion

Who I Am and Why I’m Here

Hello Grass Team,

I’m Selma ^^, a final-year computer vision Master’s student. I’m working specifically on open-vocabulary satellite segmentation with vision-language models: teaching a model to segment imagery from free-form text rather than a fixed label set.

While developing my thesis, I discovered GRASS GIS and I realized that its TGIS framework is exactly the temporal infrastructure my workflow had been missing.

During my Master’s project I found that clouds are the main bottleneck in many time-series sequences: 40% or more of optical imagery can be unusable.

The usual workarounds break temporal continuity.

SAR imagery solves this, but integrating SAR and optical data in a GIS framework has been painful: there’s no clean path from a GRASS temporal dataset to a PyTorch DataLoader without exporting to disk, losing metadata, and writing brittle glue code.

This project addresses that gap: not the model itself, but the plumbing between GRASS’s temporal infrastructure and the deep learning stack I use every day.


The Actual Problem

GRASS has impressive temporal infrastructure: the TGIS framework handles metadata, topology, and time-series queries in ways most GIS tools don’t attempt.

But that infrastructure stops at the edge of the NumPy/SciPy stack. There’s no native bridge from a SpaceTimeRasterDataset query to a 4D PyTorch tensor.

There’s no built-in way to align datasets across sensors, for example pairing a SAR acquisition from one STRDS (a GRASS space-time raster dataset) with the nearest clear-sky optical map from another.
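To make the pairing idea concrete, here is a minimal pure-Python sketch of the nearest-timestamp logic (no GRASS required). The `pair_nearest` helper and the `(map_name, timestamp)` tuples are hypothetical stand-ins for what a per-STRDS query such as get_registered_maps_as_objects() would provide; a real implementation would lean on the TGIS temporal topology engine rather than a linear scan.

```python
from datetime import datetime

def pair_nearest(sar_maps, optical_maps):
    """Pair each SAR acquisition with the temporally nearest optical map.

    Both arguments are lists of (map_name, timestamp) tuples, standing in
    for the map objects a TGIS query would return per STRDS.
    """
    pairs = []
    for sar_name, sar_time in sar_maps:
        # Nearest-neighbour search over the optical timestamps.
        best = min(optical_maps, key=lambda m: abs(m[1] - sar_time))
        pairs.append((sar_name, best[0]))
    return pairs

sar = [("s1_0103", datetime(2024, 1, 3)), ("s1_0115", datetime(2024, 1, 15))]
opt = [("s2_0101", datetime(2024, 1, 1)), ("s2_0114", datetime(2024, 1, 14))]
print(pair_nearest(sar, opt))
# [('s1_0103', 's2_0101'), ('s1_0115', 's2_0114')]
```

In practice the alignment step would also honour maximum-gap thresholds and interval (not just instant) timestamps, which is exactly what the TGIS topology primitives exist for.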

Yes, we can do it manually, but that usually means rewriting the same data-handling scripts every time: the kind of logic that ends up copy-pasted between experiments. At that point, it’s clear this logic belongs in the library, not scattered across research notebooks.

This is also a different problem from what r.learn.ml already solves: that tool is designed for classical ML (scikit-learn), not deep learning, and it works on single rasters, not temporal datasets.

What I’m proposing is different because it:

  • works with time-series data

  • combines multiple sensors (SAR + optical)

  • feeds the data directly into PyTorch models

  • keeps everything inside the GRASS spatial system.

What I Found in the Codebase

While exploring whether RasterRowIO buffers could be piped directly into tensors (avoiding intermediate disk exports), I noticed that PyGRASS already exposes row-level raster access, but those buffers don’t easily become PyTorch tensors without a lot of manual casting.

So the ‘bones’ are there, but the bridge is missing.
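To show what that manual casting looks like today, here is a minimal pure-Python sketch (no GRASS required). It unpacks a raw little-endian float32 buffer, standing in for one FCELL row as RasterRowIO would hand it over, into Python floats. The `row_buffer_to_floats` helper is mine, not an existing API; the proposed bridge would replace this per-row copy with a zero-copy NumPy/torch view over the same memory.

```python
import struct

def row_buffer_to_floats(buf, nbytes_per_cell=4):
    """Unpack a raw float32 row buffer (GRASS FCELL rows are 4-byte floats)
    into a list of Python floats -- the kind of manual, copying cast that
    today's glue code performs for every row of every map."""
    ncells = len(buf) // nbytes_per_cell
    return list(struct.unpack(f"<{ncells}f", buf))

# A fake 4-cell row, standing in for a buffer read via RasterRowIO.
raw = struct.pack("<4f", 1.0, 2.5, -3.0, 0.0)
print(row_buffer_to_floats(raw))
# [1.0, 2.5, -3.0, 0.0]
```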

Then reading through abstract_space_time_dataset.py, I found that get_registered_maps_as_objects() on SpaceTimeRasterDataset is powerful for querying within a single STRDS, but it doesn’t provide cross-dataset temporal alignment.

In other words, pairing a SAR map from one STRDS with the nearest optical map from another isn’t built in yet.

The TGIS temporal topology engine already has the primitives for this. They just aren’t assembled into a cross-sensor workflow.

One thing I want to be upfront about is memory alignment when moving raster data from the C-based GRASS internals into Python tensors.

Depending on the computational region and raster layout, the converted data can differ subtly once it crosses that boundary. Because of this, I plan to develop the gunittest suite alongside the tiling engine, not after it. The goal is to catch edge cases early and make sure the data loader behaves consistently across different regions and datasets.

For the addon interface, I’ll follow the g.parser pattern used in modules such as r.slope.aspect, which provides a clear structure for building standard GRASS modules with CLI and GUI compatibility.

What I’m Actually Building

The minimum viable deliverable is two components: the grass.ml.temporal data loader API and the cross-STRDS alignment utility.

If those two components land cleanly and pass the gunittest suite, the project is already successful regardless of stretch goals.

If the core infrastructure stabilizes early, I plan to extend it with:

  1. ML–temporal bridge: streams 512×512 model-ready tensors from PyGRASS while respecting g.region. Unlike r.tile, this is a streaming DataLoader interface.

  2. Cross-sensor alignment module: pairs SAR maps from one STRDS with the nearest optical maps from another using the TGIS temporal topology engine.

  3. Benchmarking suite: tests the pipeline on a Sentinel-1 / Sentinel-2 flood dataset using SSIM, MAE, and cloud-mask reconstruction metrics.

  4. i.fusion.temporal addon: exposes the pipeline as a standard GRASS module via g.parser. A lightweight U-Net will serve as a baseline, but the framework remains model-agnostic.

  5. gunittest suite: validates the data loader and alignment modules to ensure stable behavior across regions and datasets.


You May Ask: Why GRASS and Not a Standalone Library?

I considered building this as a standalone PyTorch dataset class on top of GDAL. That would work technically, but it would lose several things GRASS already solves well: TGIS temporal metadata, g.region CRS management, and compatibility with existing GRASS workflows.

So, the main reason for implementing this inside GRASS is that it complements and extends the existing ecosystem.

This ensures the reconstructed pseudo-optical layer works natively with GRASS tools like t.rast.algebra, r.series, and other temporal modules, instead of being stuck inside a standalone training script.

Coming from a segmentation background, that composability is exactly what I would want as a user of this tool.


Honest Assessment of Scope

350 hours is tight for everything listed here, and I know that. The reason I think it’s manageable is that the most critical parts, the data loader and alignment utility, are relatively self-contained.

If they take longer than expected, the stretch goals can scale down cleanly. The U-Net baseline becomes a minimal proof-of-concept, the benchmarking suite covers fewer scenarios, and the addon interface remains simpler, but the core infrastructure still delivers value to the GRASS ecosystem.

I came to this proposal through my own research, not just GSoC, which makes it genuinely exciting for me. If we land on this solution together, I’m motivated to explore further improvements and extensions even after the program ends, especially as new temporal satellite workflows evolve within GRASS.

P.S. I’ll write the full proposal with a detailed timeline, deliverable milestones, and revised scope once I hear back from you on whether the project direction works.

Do you plan on having a requirement on PyTorch, either for installation or in CI? It is known to be on the heavier side and, from my past experience, sometimes a pain to install. It might have come a long way since, though.


I think we could have the conversation about the merits of the proposal but without any record of previous contribution to GRASS, it’s unlikely this would get selected.


No, PyTorch won’t be forced on anyone; it would be an optional dependency.

The grass.ml.temporal module will only import it when the DataLoader interface is explicitly used, so users who don’t need the ML bridge won’t be affected, something like pip install grass[ml] rather than a hard requirement at install time.

For CI, it would run as a separate optional job, not the main test suite, to avoid the heavy installation.

Noted, and I appreciate the directness.
I’m already working through the codebase and will have something up before the deadline.

Hi Selma, and thanks for your first PRs. In order for them to work as a test of skills with regard to GSoC, it would be an advantage if you tackle a real issue; see e.g. the issue tracker on GitHub. You may also look for other topics relevant to your proposal, like imagery or good_first_issue…

It is not entirely clear to me what you plan as an end-user-facing tool and what as a developer-facing API. Existing functions/methods like sample_by_dataset or end-user tools like t.sample already offer alignment solutions. What is it you are missing, and what do you suggest adding?

Since I work with similar issues in my daily work, this may be of inspiration: GitHub - NVE/actinia_modules_nve: Repository to collect actinia modules / endpoints for use at NVE. We will be working on multi-sensor projects this / next year and currently use PyTorch for cloud and snow cover detection in Sentinel-3, as well as avalanche detection from Sentinel-1…

As for loading data, did you notice: GitHub - lrntct/xarray-grass: An xarray backend for grass raster?


Thank you for the detailed feedback. It is extremely helpful for sharpening my proposal.

So here are my findings:

1. Tool vs. API

Yes, you are right that the distinction between deliverables was not clear. I will separate them explicitly:

  • API (developer-facing): grass.ml.temporal is a Python package that developers can import to stream GRASS temporal datasets directly into PyTorch DataLoaders without writing brittle glue code.

  • Tool (end-user facing): i.fusion.temporal is a standard GRASS module (via g.parser) that command-line or GUI users can run to perform tasks like multi-sensor fusion or cloud reconstruction.

2. The precise gap is a cross-sensor streaming pipeline

t.sample and sample_by_dataset handle temporal alignment, yes, but they output metadata and map names, not pixel values. They tell you which maps overlap, but they do not read data into tensors. My project introduces a dedicated bridge for cross-sensor streaming that provides:

  • Disk-less streaming: Moves RasterRowIO buffers directly into tensors, avoiding intermediate disk exports.

  • Tiled 4D tensors: Streams model-ready tiles (for example, 512 by 512 pixels) while respecting g.region. This is distinct from r.tile, which only handles spatial tiling without temporal or multi-sensor dimensions.

  • Topological alignment: Leverages the TGIS temporal topology engine to automatically pair disparate sensors (SAR with optical) based on temporal overlap, then streams aligned 4D tensors.

This step, turning TGIS metadata queries into PyTorch tensors with native tiling and multi-sensor pairing, does not exist anywhere currently to the best of my knowledge.
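To make the tiled-streaming item concrete, here is a sketch of the window arithmetic such a loader might use. The `tile_windows` helper is hypothetical; in a real loader, `rows` and `cols` would come from the active g.region, and each window would index into RasterRowIO reads before the tile is stacked into a 4D tensor.

```python
def tile_windows(rows, cols, tile=512):
    """Yield (row_off, col_off, height, width) windows covering the
    current region; edge tiles are clipped rather than padded."""
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            yield (r, c, min(tile, rows - r), min(tile, cols - c))

# A 700x600 region produces four clipped windows.
print(list(tile_windows(700, 600, tile=512)))
# [(0, 0, 512, 512), (0, 512, 512, 88), (512, 0, 188, 512), (512, 512, 188, 88)]
```

Clipping at the region edge (instead of padding) keeps the windows honest with respect to g.region; whether edge tiles are then padded for the model is a separate, model-side decision.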

3. Validation with NVE workflows

The Sentinel-1 and Sentinel-3 examples you mentioned, specifically for avalanche and cloud or snow detection, are ideal real-world targets!
I will use these multi-sensor fusion scenarios to benchmark the pipeline’s ability to handle disparate temporal resolutions and SAR to optical alignment.

4. Relation to xarray-grass

Regarding xarray-grass, it is a solid backend for opening single rasters, but it does not handle cross-STRDS multi-sensor pairing.

Furthermore, it can suffer from a fork-safety deadlock when used with a PyTorch DataLoader with num_workers greater than zero. This is a known issue (see neural-lam/issues/198) where Dask thread pools initialized in the main process cause forked workers to hang on inherited locks. By building a native GRASS-to-tensor pipeline, we can bypass these third-party abstractions and provide a more stable multi-worker data loader for the GRASS ecosystem.

I will revise the proposal to make these distinctions and the technical gap clearer.

Thanks again for pushing me to sharpen this ^^

Hi,

And thanks for that additional info.

While that sounds interesting, I have to admit that I still struggle a bit with what you are aiming at. For example, what is the output of i.fusion.temporal supposed to be, and what is the input? What should the tool do?

Also, extending the API would be more justified if there is a path to benefit for the end users. The simplest would probably be a Jupyter notebook example; a CLI tool for model training would be more elaborate (if I understand the purpose of DataLoaders correctly).

It would be helpful to know which industry you have in mind and what they get out of it…

Having had a brief look at the PyTorch DataLoader documentation, I would like to encourage you to check out the concepts of imagery groups and semantic labels in GRASS and how they can be used to feed data to machine-learning models. See e.g.: [Feat] r.learn.ml2: Add support for semantic label · Issue #1259 · OSGeo/grass-addons. Maybe imagery groups (or a temporal list of imagery groups) could be translated to Datasets in PyTorch?

Note that also “static” auxiliary data may be used in machine learning together with reference data of entirely different granularity (average backscatter over several overpasses) or previous observations of the same sensor… Therefore, I landed on compiling groups in a first step before feeding the data to models…


Thank you for the push; it has clarified the pipeline well.

So the core aim for i.fusion.temporal is to act as a “temporal gap-filler”: it takes two native STRDS, a cloud-masked optical dataset like Sentinel-2 and a clear-sky SAR dataset like Sentinel-1, and generates a reconstructed, cloud-free STRDS.

The output is immediately composable with t.rast.algebra and r.series, which I think is the whole point of staying inside GRASS.

For the industry focus: Environmental Monitoring and Disaster Response.

NVE’s Sentinel-1/3 workflows for avalanche and snow cover detection feel like the natural benchmark here, since the 40% cloud unusability problem is exactly what this pipeline targets.

On imagery groups and semantic labels: I’m adopting the two-step approach you landed on in production.

Users compile groups upstream with i.group, and grass.ml.temporal consumes those clean groups. Semantic labels like “Red” and “VV” define tensor channel order automatically (as you suggested in Issue #1259), which makes models portable across datasets without the brittle manual reconfiguration currently required.

This also handles mixed granularity by broadcasting static layers like a DEM across every temporal step in the 4D tensor. The TGIS topology engine lets the pipeline handle arbitrary-length sequences too, which I think directly solves the bottleneck you mentioned with time series beyond two points in time.
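A minimal pure-Python sketch of both ideas: channel ordering from semantic labels and broadcasting a static layer across time steps. Both helpers (`order_channels`, `broadcast_static`) and the toy nested-list data are hypothetical; a real implementation would read the labels from the imagery group and operate on NumPy arrays or tensors rather than lists.

```python
def order_channels(group_maps, label_order):
    """Sort a group's maps into a fixed channel order by semantic label.

    group_maps: {semantic_label: map_name}. label_order defines the
    tensor channel layout, so models stay portable across datasets.
    """
    return [group_maps[label] for label in label_order]

def broadcast_static(temporal_stack, static_layer):
    """Append a static layer (e.g. a DEM) as an extra channel at every
    time step: (T, C, H, W) -> (T, C+1, H, W), shown with nested lists."""
    return [step + [static_layer] for step in temporal_stack]

maps = {"Red": "s2_red@PERMANENT", "VV": "s1_vv@PERMANENT"}
print(order_channels(maps, ["VV", "Red"]))

# Two time steps, one channel each, 2x2 pixels; add a static DEM channel.
stack = [[[[1, 1], [1, 1]]], [[[2, 2], [2, 2]]]]
dem = [[9, 9], [9, 9]]
print([len(step) for step in broadcast_static(stack, dem)])  # 2 channels per step
```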

To make the end-user path visible, I’m adding a Jupyter Notebook as a primary deliverable:

  • opening STRDS,

  • assembling semantic groups,

  • cross-sensor alignment,

  • streaming 4D tensors into PyTorch, the whole flow end to end.

I’ll share the revised proposal this afternoon.

Thanks Selma for the image. Graphics usually help. For me this one unfortunately does not add very much clarity or information beyond what already was said.

Is phase 2 aimed at model training? If a PyTorch model is the final output, that is what I would expect. And in phase 1 I get confused: are Sentinel-1 and -2 data just examples, and if so, examples for what? A tool that would generate imagery groups from an STDS could be useful and something I can in my head connect with phase 2, but a “reconstructed cloud-free STRDS” is something I do struggle a bit with…

Cheers

Stefan


@annakrat @sbl @echoix @Stefan

I have completed the full proposal draft based on the architectural feedback from this discussion. I would appreciate any final review before the submission deadline ^^

Hi Stefan,

The diagram caused confusion because it shows the phases as sequential when they’re actually layered. grass.ml.temporal is the core engine for both training and inference. i.fusion.temporal is a frontend module built on top of it.

(The arrow between phases represents using gap-filled output as training data, which is a valid but secondary workflow.)

Sentinel-1 and Sentinel-2 are examples only. Both tools are sensor-agnostic and work with any two complementary STRDS.

Your suggestion about automating imagery group generation from an STRDS directly is a good one and I will add it to the scope.

The full proposal with explicit I/O definitions is linked above. I would really value your feedback on it before the submission deadline.

Hello @sbl

I just opened a PR (here) for the temporal aggregation issue (Resolves #3042).

Please check it out when you have a second. If there are other interesting TGIS issues I should look into to get deeper into the internals and keep contributing, let me know ^^