[GRASS-dev] Debugging, parallelism, etc.

Hello,

I'm working on adding parallelism to modules, but debugging is turning out to be a logistical nightmare:

Why do I not get any reports from the GCC options '-fsanitize=address' or '-fsanitize=thread'?

I am also having trouble getting the profiler to work properly inside GRASS (I assume due to shell?). The gmon.out file produced has no usable data.

OpenMP is extremely poorly supported by most tools. valgrind with helgrind reports a lot of nonsense. I can't seem to get the Intel Linux tools to work properly, either.

BTW, we are supporting both pthreads and OpenMP. While this isn't an issue in most cases, there can be races and deadlocks if not handled properly. Pthreads aren't entirely portable. OpenMP is. However, pthreads give us more control. May I suggest using OpenMP for most modules and reserving pthreads for libraries, etc.? Or should we start moving away from pthreads?

Any suggestions would be greatly appreciated!

--
Best Regards,
-Brad

You can still run GRASS modules outside the GRASS shell by setting all of the environment variables appropriately ...

OpenMP just works by "unrolling" all of the determinate loops, i.e., the ones that iterate a fixed number of times. No speedups to anything else.
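For readers following along, here is a minimal sketch of the "determinate loop" case, assuming a C99 compiler with OpenMP enabled (for GCC, -fopenmp). The function name scale_row is hypothetical and not part of GRASS. Strictly speaking the directive divides the iterations among threads rather than unrolling the loop; without the OpenMP flag the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Hypothetical helper: multiply every cell of a raster row by a factor.
 * The trip count (ncols) is known before the loop starts, which is what
 * lets OpenMP split the iterations among threads. */
void scale_row(double *row, size_t ncols, double factor)
{
    /* Each iteration touches a distinct element, so no locking is needed. */
#pragma omp parallel for
    for (size_t i = 0; i < ncols; i++) {
        row[i] *= factor;
    }
}
```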

Speedup from OpenMP will be limited, depending on the number of determinate loops present, and how much of the load they represent.
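The limit described here is usually stated as Amdahl's law. A small sketch, assuming the parallel part scales perfectly with the thread count (real modules rarely do); the function name is made up for illustration.

```c
#include <assert.h>

/* Amdahl's law: upper bound on the speedup when a fraction
 * `parallel_fraction` of the run time is spread over `threads` threads
 * and the remainder stays serial. */
double amdahl_speedup(double parallel_fraction, int threads)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(threads >= 1);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads);
}
```

For example, if the parallelizable loops account for 80% of the run time, 8 threads give at most 1 / (0.2 + 0.1), i.e. roughly a 3.3x speedup, no matter how well the loops themselves scale.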

pthreads are totally flexible, but the programmer has to specify everything, very carefully ...

But pthreads can speed up lots of stuff outside of determinate loops ...
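To illustrate what "the programmer has to specify everything" means in practice, here is a minimal pthreads sketch that sums an array with a fixed number of threads. The names (struct slice, parallel_sum, NTHREADS) are hypothetical, and the thread count is hard-coded only to keep the example short. With GCC it would typically be built with -pthread.

```c
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4

/* Work description and result slot for one thread. */
struct slice {
    const double *data;
    size_t begin, end;   /* half-open range [begin, end) */
    double sum;
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;

    s->sum = 0.0;
    for (size_t i = s->begin; i < s->end; i++)
        s->sum += s->data[i];
    return NULL;
}

double parallel_sum(const double *data, size_t n)
{
    pthread_t tid[NTHREADS];
    struct slice part[NTHREADS];
    int started[NTHREADS];
    double total = 0.0;

    /* Partitioning, thread creation and error handling are all manual. */
    for (int t = 0; t < NTHREADS; t++) {
        part[t].data = data;
        part[t].begin = n * t / NTHREADS;
        part[t].end = n * (t + 1) / NTHREADS;
        part[t].sum = 0.0;
        started[t] = pthread_create(&tid[t], NULL, sum_slice, &part[t]) == 0;
        if (!started[t])
            sum_slice(&part[t]);   /* fall back to running the slice inline */
    }

    /* Joining doubles as the synchronization point before reading results. */
    for (int t = 0; t < NTHREADS; t++) {
        if (started[t])
            pthread_join(tid[t], NULL);
        total += part[t].sum;
    }
    return total;
}
```

Each thread writes only to its own struct slice, so no mutex is needed here; anything shared between threads would need explicit locking.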

HTH,

Bill H.

--
William W. Hargrove
Eastern Forest Environmental Threat Assessment Center
USDA Forest Service
Southern Research Station
200 WT Weaver Boulevard
Asheville, NC 28804-3454

(828) 257-4846
(865) 235-4753 (cell)
(828) 257-4313 (fax)
hnw@geobabble.org
william.w.hargrove@usda.gov
http://www.geobabble.org/~hnw

Those variables would be...? Is this documented somewhere? If not, can we get it documented?

OpenMP can do far more than just loop "unrolling" these days. Setting up tasks or sections to run concurrently is also quite trivial. It can also offload to GPU, etc. Check out the v4.5+ spec. It's pretty impressive. I believe it can do most of what pthreads does, but you certainly lose control of implementation details. Some compilers have an omp library while others convert to pthreads.
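As a small illustration of the non-loop side of OpenMP, here is a sketch using the sections construct to run two unrelated passes over the same data concurrently. The struct and function names are hypothetical, and the caller is assumed to pass at least one element. Compiled without OpenMP, the pragmas are ignored and the two passes run one after the other.

```c
#include <stddef.h>

struct stats {
    double sum;
    double max;
};

/* Computes the sum and the maximum of v[0..n-1]; requires n >= 1. */
struct stats sum_and_max(const double *v, size_t n)
{
    struct stats out;
    double sum = 0.0;
    double max = v[0];

    /* Each section is handed to one thread of the team. The two sections
     * write to different variables, so they do not race with each other. */
#pragma omp parallel sections
    {
#pragma omp section
        {
            for (size_t i = 0; i < n; i++)
                sum += v[i];
        }
#pragma omp section
        {
            for (size_t i = 1; i < n; i++)
                if (v[i] > max)
                    max = v[i];
        }
    }

    out.sum = sum;
    out.max = max;
    return out;
}
```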

I do find myself rewriting algorithms so that OpenMP can handle them. It doesn't seem to handle nested loops with breaks very well and I'm not entirely sure why.
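One likely reason, taken from the OpenMP specification rather than from this thread: the loop associated with a worksharing for construct may not be exited with break, because every iteration has to be assignable to a thread up front. A common workaround is a flag that turns the remaining iterations into no-ops. The sketch below assumes a row-major grid and uses hypothetical names; breaking out of the inner, non-parallelized loop is still allowed.

```c
#include <stddef.h>

/* Returns 1 if any cell of a rows x cols row-major grid is negative. */
int grid_has_negative(const double *cells, int rows, int cols)
{
    int found = 0;

    /* With reduction(||), every thread gets a private `found` starting at 0;
     * the private copies are OR-ed together when the loop ends. */
#pragma omp parallel for reduction(|| : found)
    for (int r = 0; r < rows; r++) {
        if (found)
            continue;   /* this thread already has a hit: skip its other rows */
        for (int c = 0; c < cols; c++) {
            if (cells[(size_t)r * cols + c] < 0.0) {
                found = 1;
                break;  /* leaving the inner serial loop is permitted */
            }
        }
    }
    return found;
}
```

Other threads keep scanning their own rows until they finish or find a hit themselves, so this trades a little wasted work for simplicity. OpenMP 4.0 and later also define a cancel construct for this purpose, which has to be enabled at run time.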

--
Best Regards,
-Brad

There is no issue with supporting both OpenMP and pthreads, as most of
the libraries use neither of them. There are a few modules with some
parallelism implemented, and in each such case they use only one of the
options, thus bypassing any compatibility issues per se.

As for valgrind noise: it comes from design decisions made decades ago.
Each module is a short-running independent program, and thus it is
left to the OS to reclaim memory at exit. Analysis tools sometimes also
report potential uninitialized use, but in cases that cannot be
reached during a normal GRASS module run. Unfortunately, improving
GRASS quite often is like restoring an ancient artefact, where it is
hard to tell bugs and features apart.

Māris.

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-dev

On 10/9/2022 11:25 PM, Maris Nartiss wrote:

There is no issue with supporting both OpenMP and pthreads as most of
libraries use neither of them. There are a few modules with some
parallelism implemented and in such case they use only one of options
thus bypassing any compatibility issues per se.

This is true today. Will it be true tomorrow? Some guidelines are in order.

As for valgrind noise – it comes from design decisions made decades
ago – each module is a short running independent program and thus it is
left to OS to reclaim memory at exit. Analysis tools sometimes also
report potential uninitialized use but in cases that can not be
reached during a normal GRASS module run. Unfortunately improving
GRASS quite often is like restoring an ancient artefact where it is
hard to tell bugs from features apart.

Valgrind's issue is that it doesn't support OpenMP without tricking GCC into converting to pthreads, which completely defeats the point.

--

Best Regards,
-Brad

On 09.10.2022 at 20:45, Brad ReDacted wrote:

Those variables would be...? Is this documented somewhere? If not, can
we get it documented?

https://grasswiki.osgeo.org/wiki/Working_with_GRASS_without_starting_it_explicitly
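For orientation, here is a rough sketch of the environment such a setup needs, written in C to match the rest of this thread; the wiki page above remains the authoritative reference. It assumes a POSIX system. GISBASE (the install directory) and GISRC (the path to an rc file that names GISDBASE, LOCATION_NAME and MAPSET) are the GRASS variables involved; the function names and buffer sizes are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Prepend `prefix` to the colon-separated environment variable `name`. */
static int prepend_env(const char *name, const char *prefix)
{
    char buf[8192];
    const char *old = getenv(name);
    int n;

    if (old != NULL && old[0] != '\0')
        n = snprintf(buf, sizeof buf, "%s:%s", prefix, old);
    else
        n = snprintf(buf, sizeof buf, "%s", prefix);
    if (n < 0 || (size_t)n >= sizeof buf)
        return -1;   /* value would not fit in the buffer */
    return setenv(name, buf, 1);
}

/* Hypothetical helper: prepare this process (and anything it later execs,
 * such as a module run under a profiler) to find GRASS without the shell.
 * Returns 0 on success and -1 on failure. */
int set_grass_env(const char *gisbase, const char *gisrc)
{
    char dirs[4096];
    int n;

    if (strlen(gisbase) == 0 || strlen(gisrc) == 0)
        return -1;
    if (setenv("GISBASE", gisbase, 1) != 0 || setenv("GISRC", gisrc, 1) != 0)
        return -1;

    /* Modules and scripts live under $GISBASE/bin and $GISBASE/scripts. */
    n = snprintf(dirs, sizeof dirs, "%s/bin:%s/scripts", gisbase, gisbase);
    if (n < 0 || (size_t)n >= sizeof dirs || prepend_env("PATH", dirs) != 0)
        return -1;

    /* The shared libraries live under $GISBASE/lib. */
    n = snprintf(dirs, sizeof dirs, "%s/lib", gisbase);
    if (n < 0 || (size_t)n >= sizeof dirs ||
        prepend_env("LD_LIBRARY_PATH", dirs) != 0)
        return -1;
    return 0;
}
```

After a call such as set_grass_env("/usr/local/grass", "/home/user/.grassrc") (both paths are placeholders), a module started from the same process inherits the environment, which should let it run under a profiler or sanitizer without the GRASS shell in between.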

Hi,

I just wanted to point out that if you are interested in a “framework” for submitting jobs to a thread pool, I can point you to

https://github.com/uclouvain/openjpeg/blob/master/src/lib/openjp2/thread.h

https://github.com/uclouvain/openjpeg/blob/master/src/lib/openjp2/thread.c

which is a C port I did of the equivalent C++ code in GDAL (https://github.com/OSGeo/gdal/blob/master/port/cpl_worker_thread_pool.h).

It has a pthread and a Win32 implementation. It could be easily extracted from libopenjp2 (pending an opj_ → grass_ renaming to avoid conflicts if both are combined).

The high level API is the opj_thread_pool_* one.

Probably not super fancy, but it serves my needs well. The user is responsible for selecting the number of threads and splitting the workload into jobs that are queued to the thread pool and consumed by the threads as soon as they are no longer busy.
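To make the job-queue idea concrete, here is a much smaller pthread-only sketch of the same pattern. It is not the opj_thread_pool_* API: every name below is hypothetical, the workers live only for the duration of one call, and the cap of 16 threads is arbitrary.

```c
#include <pthread.h>
#include <stddef.h>

typedef void (*job_fn)(void *user_data);

struct job {
    job_fn fn;
    void *user_data;
};

struct job_queue {
    struct job *jobs;
    size_t count;
    size_t next;            /* index of the next job nobody has claimed */
    pthread_mutex_t lock;
};

/* Each worker claims the next unclaimed job until none are left. */
static void *worker(void *arg)
{
    struct job_queue *q = arg;

    for (;;) {
        size_t i;

        pthread_mutex_lock(&q->lock);
        i = q->next;
        if (i < q->count)
            q->next++;
        pthread_mutex_unlock(&q->lock);

        if (i >= q->count)
            break;
        q->jobs[i].fn(q->jobs[i].user_data);
    }
    return NULL;
}

/* Runs all jobs on up to `nthreads` workers and returns when they are done. */
void run_jobs(struct job *jobs, size_t count, int nthreads)
{
    struct job_queue q;
    pthread_t tid[16];
    int started = 0;

    q.jobs = jobs;
    q.count = count;
    q.next = 0;
    pthread_mutex_init(&q.lock, NULL);

    if (nthreads > 16)
        nthreads = 16;
    for (int t = 0; t < nthreads; t++)
        if (pthread_create(&tid[started], NULL, worker, &q) == 0)
            started++;
    if (started == 0)
        worker(&q);         /* no thread could be created: run inline */
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);

    pthread_mutex_destroy(&q.lock);
}

/* Example job: square an int in place. */
void square_job(void *user_data)
{
    int *value = user_data;

    *value *= *value;
}
```

As in the description above, the caller picks the thread count and decides how to cut the work into jobs; a thread that finishes a short job immediately picks up the next one, which balances uneven job sizes.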

Even

On 14/10/2022 at 11:02, Moritz Lennert wrote:

On 09.10.2022 at 20:45, Brad ReDacted wrote:

Those variables would be…? Is this documented somewhere? If not, can
we get it documented?

https://grasswiki.osgeo.org/wiki/Working_with_GRASS_without_starting_it_explicitly

-- 
[http://www.spatialys.com](http://www.spatialys.com)
My software is free, but my time generally not.

Thank you for the pointers and links. I will definitely take a look at it and see what I can incorporate. Between this and the various drivers, I think I have more than enough to work on for a while.

Thank you all for indulging and tolerating my questions.

-- 
Best Regards,
-Brad