Following up on my earlier post about parallelizing r.proj. I have a working proof-of-concept and a draft PR ready.
What I built:
The readcell tile cache is not thread-safe. I first prototyped wrapping it in mutexes, but that just serializes every read and gives no speedup. Instead, I pre-load the input raster into a flat, contiguous RAM buffer before the parallel loop. Concurrent reads from an array that no thread writes to are safe, so no locks are needed during projection.
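To illustrate the idea, here is a minimal, self-contained sketch of the pattern. It is not the code in the PR: `reproject_flat` and `toy_inverse` are hypothetical names, and the toy mapping stands in for the real coordinate transform. It only shows why a read-only flat buffer needs no locks when each thread writes its own output rows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the inverse projection: maps an output cell
 * to the input cell it should sample. Here it is a simple 180-degree
 * flip so the sketch stays self-contained; real code would run the
 * coordinate transform instead. */
static void toy_inverse(size_t orow, size_t ocol, size_t rows, size_t cols,
                        size_t *irow, size_t *icol)
{
    *irow = rows - 1 - orow;
    *icol = cols - 1 - ocol;
}

/* Fill `out` by sampling the flat, row-major input buffer `in`.
 * Threads only read `in` and each thread writes disjoint rows of `out`,
 * so the loop body needs no mutex. Without OpenMP enabled the pragma is
 * ignored and the loop simply runs serially. */
void reproject_flat(const float *in, float *out, size_t rows, size_t cols)
{
    assert(in != NULL && out != NULL);

    #pragma omp parallel for schedule(static)
    for (long r = 0; r < (long)rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            size_t ir, ic;

            toy_inverse((size_t)r, c, rows, cols, &ir, &ic);
            out[(size_t)r * cols + c] = in[ir * cols + ic];
        }
    }
}
```

Compile with `-fopenmp` (GCC/Clang) to enable the parallel loop. The trade-off is that the whole input has to fit in memory before projection starts.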
Benchmarks (10,000 × 10,000 raster, Apple M4, 8 cores):
- Serial: ~20.3s, 83% CPU
- Parallel (UTM → WebMercator): ~8.1s, 515% CPU
- Parallel (UTM → Lat/Lon): 9.1s, 572% CPU
That is roughly a 2.2x to 2.5x wall-time speedup across two independent projection pipelines.
I’ve added a memory safety check that warns users if the buffer would exceed 4 GB and points them to g.region to reduce the buffer size.
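For reference, a size check of this kind could look roughly like the sketch below. This is an illustration under assumptions, not the PR code: `buffer_exceeds_limit` and `WARN_BYTES` are hypothetical names, and I am treating "4 GB" as 4 GiB here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed threshold: 4 GiB. The exact constant is an assumption. */
#define WARN_BYTES ((uint64_t)4 * 1024 * 1024 * 1024)

/* Returns 1 if rows * cols * cell_size exceeds WARN_BYTES or would
 * overflow 64 bits, 0 otherwise. When the size fits in 64 bits it is
 * stored in *bytes; on overflow *bytes is set to UINT64_MAX. */
int buffer_exceeds_limit(uint64_t rows, uint64_t cols, uint64_t cell_size,
                         uint64_t *bytes)
{
    uint64_t cells;

    assert(bytes != NULL);

    if (rows == 0 || cols == 0 || cell_size == 0) {
        *bytes = 0;
        return 0;
    }
    /* Check each multiplication before doing it to avoid wraparound. */
    if (rows > UINT64_MAX / cols) {
        *bytes = UINT64_MAX;
        return 1;
    }
    cells = rows * cols;
    if (cells > UINT64_MAX / cell_size) {
        *bytes = UINT64_MAX;
        return 1;
    }
    *bytes = cells * cell_size;
    return *bytes > WARN_BYTES;
}
```

With these assumptions, a 10,000 × 10,000 raster at 4 bytes per cell comes to 400,000,000 bytes, well under the threshold, while 40,000 × 40,000 at the same cell size would trigger the warning.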
Draft PR: OSGeo/grass pull request #7185, "[GSoC 2026 Draft POC] r.proj: OpenMP parallelization via RAM-resident buffer"
Two questions for mentors:
- Is the RAM-resident approach acceptable long-term, or are thread-local caches strongly preferred for the final implementation?
- Are there thread-safety concerns in GPJ_transform beyond PJ_CONTEXT that I should audit before the proposal deadline?
Thanks,
Kaushik Raja
GitHub: krcoder123