[GRASS-user] r.fillnulls hang on one computer, not on another

Hello,

I'm trying to run 'r.fillnulls' and it hangs on one computer but not on another.

On my laptop running Ubuntu 20.04 it runs fine in GRASS 7.8.6. It also runs in a docker container running GRASS 7.8.4.

On our server it just hangs. The server runs Ubuntu 18.04 and GRASS 7.8.2. It also hangs when running *the same docker image* (Ubuntu 20.04, GRASS 7.8.4).

Do you have any suggestions on how to debug this? Examining it with top shows it is churning away at 100% CPU usage. It takes ~5 minutes on my laptop and then completes. It never completes on the server (which is much more powerful, so it should complete faster).

Thank you,

  -k.

Hi Ken,

On Tue, Jan 18, 2022 at 9:23 PM Ken Mankoff <mankoff@gmail.com> wrote:

> Hello,
>
> I'm trying to run 'r.fillnulls' and it hangs on one computer but not on another.
>
> On my laptop running Ubuntu 20.04 it runs fine in GRASS 7.8.6. It also runs in a docker container running GRASS 7.8.4.
>
> On our server it just hangs. This is running Ubuntu 18.04 and grass 7.8.2. It also hangs running *the same docker image* Ubuntu 20.04 GRASS 7.8.4
>
> Do you have any suggestions how to debug this?

You can find the process ID with ps. The output columns are
USER PID %CPU %MEM ...
so the PID is the number after the user name:

ps aux | grep r.fillnulls

and then run strace on it

strace -p <pid>

The output may be hard to read, but it can show whether the process is
waiting for something (e.g., a device) on that machine.
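
The two steps above could be combined roughly as follows. This is only a sketch, assuming a POSIX shell with awk; the helper name pids_from_ps is hypothetical, and it does a plain substring match on each ps line, so check the PID it prints before attaching to it.

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux` output on stdin and print the PID
# (2nd column) of every line containing the given string.
# The header line and the awk process doing the filtering are skipped.
pids_from_ps() {
    awk -v needle="$1" 'NR > 1 && index($0, needle) && $11 != "awk" { print $2 }'
}

# Usage (not run here):
#   ps aux | pids_from_ps r.fillnulls
#   strace -p <pid>     # attaching may require sudo, depending on ptrace settings
```

On systems that restrict ptrace, strace -p fails with "Operation not permitted" unless it is run as root.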

Markus

_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Hi Markus,

On 2022-01-18 at 12:28 -08, Markus Neteler <neteler@osgeo.org> wrote:

> On Tue, Jan 18, 2022 at 9:23 PM Ken Mankoff <mankoff@gmail.com> wrote:
>
>> I'm trying to run 'r.fillnulls' and it hangs on one computer but not on another.
>
> and then run strace on it
>
> strace -p <pid>

strace prints out things like this (at about 5 Hz):

futex(0x55591e4fad24, FUTEX_WAIT_PRIVATE, 4832, NULL) = 0
futex(0x55591f6fcde4, FUTEX_WAKE_PRIVATE, 2147483647) = 16
futex(0x55591f6fcde4, FUTEX_WAIT_PRIVATE, 4928, NULL) = 0
futex(0x55591e4fad24, FUTEX_WAKE_PRIVATE, 2147483647) = 27
futex(0x55591f6fcde4, FUTEX_WAIT_PRIVATE, 4936, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x55591f6fcde4, FUTEX_WAIT_PRIVATE, 4952, NULL) = 0
futex(0x55591e4fad24, FUTEX_WAKE_PRIVATE, 2147483647) = 35
futex(0x55591f6fcde4, FUTEX_WAIT_PRIVATE, 4960, NULL) = 0

I'm out of my depth here. A similar report elsewhere (https://meenakshi02.wordpress.com/2011/02/02/strace-hanging-at-futex/) suggests this may point to SSL library issues, or that this is a parent thread waiting on another thread. According to 'ps -efL | grep filln', there is only one other PID. If I strace that, all I see is

read(3,

And nothing else...
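
When the live output scrolls past too quickly to read, one option is to capture it to a file and tally the syscall names. The sketch below assumes strace output in its usual "name(args) = result" form, optionally prefixed by a PID; the helper name syscall_tally and the log path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count how often each syscall name appears in
# strace output read from stdin, most frequent first.
syscall_tally() {
    awk '{
        # The syscall name is the first lowercase identifier directly followed by "(".
        if (match($0, /[a-z_0-9]+\(/)) {
            name = substr($0, RSTART, RLENGTH - 1)
            count[name]++
        }
    }
    END { for (name in count) print count[name], name }' | sort -rn
}

# Usage (not run here); -f follows all threads, -o writes the trace to a file:
#   strace -f -p <pid> -o /tmp/fillnulls.strace    # stop with Ctrl-C after a while
#   syscall_tally < /tmp/fillnulls.strace
```

strace also has a built-in summary mode, -c, which prints per-syscall counts and times when it detaches.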

  -k.

Hi Markus,

> I'm out of my depth here. It seems like a similar issue elsewhere may
> point to SSL library issues or that this is a parent thread waiting on
> another thread. From
> https://meenakshi02.wordpress.com/2011/02/02/strace-hanging-at-futex/
> and 'ps -efL | grep filln', there is only one other PID. If I strace
> that, all I see is
>
> read(3,

It grows!!

After a while I'm now seeing

read(3, " 10%\10\10\10\10\10", 6410) = 10
read(3, " 20%\10\10\10\10\10", 6400) = 10
read(3, " 30%\10\10\10\10\10", 6390) = 10
read(3, " 40%\10\10\10\10\10", 6380) = 10

So maybe it'll finish... eventually?
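
In the read() lines above, "\10" is strace's octal notation for a backspace character (ASCII 8), so the text arriving on file descriptor 3 looks like a percent counter that overwrites itself. To watch only the percentages, the traced lines can be filtered as in this sketch; the helper name progress_from_strace is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the percentage from strace read() lines
# such as:  read(3, " 10%\10\10\10\10\10", 6410) = 10
progress_from_strace() {
    sed -n 's/^.*read([0-9]*, " *\([0-9][0-9]*\)%.*$/\1/p'
}

# Usage (not run here); strace writes its trace to stderr, hence 2>&1:
#   strace -p <pid> 2>&1 | progress_from_strace
```

Noting the wall-clock time between two printed values gives a rough estimate of how long the remaining steps will take.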

  -k.

Behind the scenes, r.fillnulls calls other interpolation modules. Some
of them can run in parallel, but they choose the number of threads
implicitly by detecting the number of CPUs. My guess: for some reason
this magic code gets the CPU count wrong and thus does not use all of
the available CPUs.
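
One way to test this guess is to compare what each machine (and the container on each machine) reports as its CPU count. The sketch below prints a few common sources. The function name cpu_report is hypothetical, and whether the interpolation modules actually honour OMP_NUM_THREADS is an assumption that this thread does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: print the CPU counts a threaded program is likely to see.
cpu_report() {
    # CPUs available to this process (respects CPU affinity)
    echo "nproc: $(nproc 2>/dev/null || echo unavailable)"
    # CPUs currently online according to the C library
    echo "online CPUs (getconf): $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unavailable)"
    # Standard OpenMP thread-count override; only matters if the module
    # uses OpenMP (an assumption here)
    echo "OMP_NUM_THREADS: ${OMP_NUM_THREADS:-unset}"
}

cpu_report
```

Running it on the laptop, on the server, and inside the Docker container on both shows whether the counts differ where the hang occurs.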

Māris.

On Tue, Jan 18, 2022 at 23:20, Ken Mankoff
(<mankoff@gmail.com>) wrote:

Hi Markus,

> I'm out of my depth here. It seems like a similar issue elsewhere may
> point to SSL library issues or that this is a parent thread waiting on
> another thread. From
> https://meenakshi02.wordpress.com/2011/02/02/strace-hanging-at-futex/
> and 'ps -efL | grep filln', there is only one other PID. If I strace
> that, all I see is
>
> read(3,
