My name is Vinay, a student at Scaler School of Technology. I am interested in applying for GSoC 2026.
I noticed that r.resamp.stats is currently single-threaded, which creates a bottleneck for large datasets.
As a proof of concept, I implemented an OpenMP-based parallel version locally. By changing the memory allocation strategy (avoiding G_malloc locking inside parallel regions), I achieved a roughly 3x speedup with 12 threads on a median benchmark (30,000 x 30,000 pixel map).
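To make the idea concrete, here is a minimal sketch of the general technique, not the code from the prototype. It assumes the raster is already in memory, the downsampling factor is an integer, and there are no null cells, none of which holds for the real r.resamp.stats. The names block_median and median_inplace are hypothetical. The point is only that the scratch memory is allocated once, before the parallel loop, so no allocator is called inside it.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* qsort comparator for doubles */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Median of n values; sorts buf in place. */
static double median_inplace(double *buf, size_t n)
{
    qsort(buf, n, sizeof(double), cmp_double);
    return (n % 2) ? buf[n / 2] : 0.5 * (buf[n / 2 - 1] + buf[n / 2]);
}

/* Hypothetical helper: downsample src (rows x cols) by an integer
 * factor f using the median of each f x f block. dst must hold
 * (rows / f) x (cols / f) values.
 * Returns 0 on success, -1 on bad input or allocation failure. */
int block_median(const double *src, int rows, int cols, int f, double *dst)
{
    if (f <= 0 || rows < f || cols < f)
        return -1;

    int out_rows = rows / f, out_cols = cols / f;
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();
#endif
    size_t win = (size_t)f * f;

    /* one scratch window per thread, allocated once outside the loop */
    double *scratch = malloc((size_t)nthreads * win * sizeof(double));

    if (!scratch)
        return -1;

    #pragma omp parallel for schedule(static)
    for (int r = 0; r < out_rows; r++) {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        double *buf = scratch + (size_t)tid * win; /* this thread's slice */

        for (int c = 0; c < out_cols; c++) {
            size_t k = 0;

            for (int i = 0; i < f; i++)
                for (int j = 0; j < f; j++)
                    buf[k++] = src[(size_t)(r * f + i) * cols + (c * f + j)];
            dst[(size_t)r * out_cols + c] = median_inplace(buf, win);
        }
    }

    free(scratch);
    return 0;
}
```

Compiled without OpenMP support, the pragma is ignored and the same code runs serially, which makes it easy to compare the two results.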
Benchmark Results (Median Method):
Serial: 47.49s
Parallel (12 Threads): 15.64s
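To put these two timings in context, the usual derived metrics can be computed as in the sketch below. The function names are hypothetical, and the Karp-Flatt serial fraction is only a rough estimate when taken from a single pair of measurements.

```c
/* Speedup: serial time divided by parallel time. */
double speedup(double t_serial, double t_parallel)
{
    return t_serial / t_parallel;
}

/* Parallel efficiency: speedup divided by the number of threads. */
double efficiency(double s, int threads)
{
    return s / threads;
}

/* Karp-Flatt metric: experimentally determined serial fraction
 * for speedup s measured on the given number of threads (> 1). */
double karp_flatt(double s, int threads)
{
    double inv_p = 1.0 / threads;

    return (1.0 / s - inv_p) / (1.0 - inv_p);
}
```

For the numbers above (47.49 s serial, 15.64 s on 12 threads) this gives a speedup of about 3.04, an efficiency of about 25%, and an estimated serial fraction of about 0.27. The timings alone do not show where that serial share comes from, so a run across several thread counts would be needed to say more.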
I would like to propose a GSoC project to fully implement, test, and merge this optimization for all aggregation methods in r.resamp.stats.
Would this be a suitable project for this year’s roadmap? I would love your feedback.
Thanks! Could you make a PR? You can open it as a draft, but we need to see your implementation; just describe any limitations of your solution. There is a benchmarking library you can use to show the speedup for different numbers of cores; see the example in the r.mapcalc tool.
For GSoC, I think we would like to see more than one tool parallelized, though of course that depends on the complexity.
I will clean up the code and open a Draft PR shortly. This will allow you to review the specific changes I made to the memory allocation strategy to bypass the locks.
I will look for the r.mapcalc benchmarking examples in the codebase and use that format to validate the scaling.
That makes perfect sense. I view r.resamp.stats as the “proof of concept” for my proposal. I plan to identify 2-3 additional modules to parallelize for the final 3-month timeline. I noticed you mentioned r.neighbors in the other thread. I will investigate that and r.univar as likely candidates to include in the full proposal.
Limitations: As noted in the PR description, this prototype currently implements only the average and median methods. I also replaced G_malloc with standard malloc in the parallel regions to avoid locking, which will need a safety review.
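One point for that safety review: plain malloc returns NULL on failure, whereas G_malloc, as far as I understand, reports failure through G_fatal_error, so every malloc call inside a parallel region needs its own check. The sketch below shows one possible pattern, assuming a shared flag that is examined after the region ends. It is not the code from the PR, and row_medians is a hypothetical helper used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles */
static int cmp_dbl(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Hypothetical helper: writes the median of each row of src
 * (rows x cols) to out[]. Each thread gets its own scratch row from
 * plain malloc inside the parallel region.
 * Returns 0 on success, -1 on bad input or if any allocation failed
 * (in which case out[] is incomplete). */
int row_medians(const double *src, int rows, int cols, double *out)
{
    if (rows <= 0 || cols <= 0)
        return -1;

    int failed = 0; /* shared flag, only ever set to 1 */

    #pragma omp parallel
    {
        double *scratch = malloc((size_t)cols * sizeof(double));

        if (!scratch) {
            #pragma omp atomic write
            failed = 1;
        }

        /* every thread must reach the loop, even if its malloc failed */
        #pragma omp for schedule(static)
        for (int r = 0; r < rows; r++) {
            if (!scratch)
                continue; /* skip work; the caller sees -1 afterwards */
            memcpy(scratch, src + (size_t)r * cols,
                   (size_t)cols * sizeof(double));
            qsort(scratch, (size_t)cols, sizeof(double), cmp_dbl);
            out[r] = (cols % 2)
                         ? scratch[cols / 2]
                         : 0.5 * (scratch[cols / 2 - 1] + scratch[cols / 2]);
        }

        free(scratch); /* free(NULL) is a no-op */
    }

    /* report the error once, from serial code, after the region ends */
    return failed ? -1 : 0;
}
```

The caller can then raise the error from the main thread, outside any parallel region, instead of from a worker thread.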
Benchmarks: For this initial prototype, I used manual timings (screenshots attached in the PR). Next, I will look into the benchmarking library used in the r.mapcalc example and update the PR with proper automated benchmarks, as you suggested.
I also agree with expanding the GSoC scope. I will start investigating r.neighbors as the next candidate for parallelization to include in my full proposal.