[GRASS-dev] [GRASS GIS] #1628: segfault in r.walk

#1628: segfault in r.walk
----------------------+-----------------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.3
Component: Raster | Version: 6.4.2
Keywords: | Platform: Linux
      Cpu: x86-64 |
----------------------+-----------------------------------------------------
After calling r.walk like
{{{
  r.walk elevation=dem start_points=stream_points stop_points=stop_points
walk_coeff=1,10000,0,0 lambda=1.0 output=walk friction=friction
max_cost=1600000
}}}
while finding cost path, there is a segfault. Here it is with the
backtrace

{{{
(gdb) run
Starting program: /localdata/local/grass-6.5.svn/bin/r.walk elevation=dem
start_points=stream_points stop_points=stop_points walk_coeff=1,10000,0,0
lambda=1.0 output=walk friction=friction max_cost=1600000
[Thread debugging using libthread_db enabled]
Walking costs are a=1.000000 b=10000.000000 c=0.000000 d=0.000000
Lambda is 1.000000
Slope_factor is -0.212500
Nseg is 4
Null cells excluded from cost evaluation.
Dev note: Adapted sites library used for vector points. (module should be
updated to GRASS 6 vector library)
Detaching after fork from child process 27060.
Dev note: Adapted sites library used for vector points. (module should be
updated to GRASS 6 vector library)
Detaching after fork from child process 27061.
DTM_Source map is: Integer cell type
9194 rows, 8736 cols
COST_Source map is: Integer cell type
4536 rows, 2370 cols
Output map is: Integer cell type
4536 rows, 2370 cols
EW resolution 1000.00003956 (1000.000040)
NS resolution 999.99998622 (999.999986)
Creating some temporary files...
Reading dem...
Reading friction...
  100%
Initializing output
  100%
Finding cost path

Program received signal SIGSEGV, Segmentation fault.
get () at memory.c:111
111 if (first_free->lower == NULL) {
(gdb) bt
#0 get () at memory.c:111
#1 0x0000000000402724 in insert (min_cost=<value optimized out>, row=999,
col=1235)
     at btree.c:35
#2 0x00000000004054f1 in main (argc=347946718, argv=<value optimized
out>)
     at main.c:1452
}}}

It is not easy to reproduce. I posted a tarball to reproduce here:
https://poseidon.centre-cired.fr/pat/segfault_r_walk.tar.gz

To reproduce:
{{{
tar xzvf segfault_r_walk.tar.gz
cd segfault_r_walk
}}}
edit run.sh to have the right paths to grass and run
{{{
./run.sh.
}}}

I used v.in.ogr for the stop_points vector, as if I use what comes from
v.out.ascii (the corresponding line is commented out in run.sh) the bug
does not reproduce for me.

This bug is also in grass 6.4.2, with the exact same backtrace. It is
also present in 6.4.1.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/1628&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------+-----------------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.3
Component: Raster | Version: 6.4.2
Keywords: | Platform: Linux
      Cpu: x86-64 |
----------------------+-----------------------------------------------------

Comment(by mmetz):

I can reproduce the segfault with the test data. In GRASS 7, r.walk
finishes successfully.

Backport r.cost, r.walk to 6.x?

Markus M

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:1&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------
Changes (by neteler):

  * keywords: => r.cost, r.walk
  * version: 6.4.2 => svn-releasebranch64
  * milestone: 6.4.3 => 6.4.4

Comment:

Replying to [comment:1 mmetz]:
> I can reproduce the segfault with the test data. In GRASS 7, r.walk
finishes successfully.
>
> Backport r.cost, r.walk to 6.x?

Test data are still generating a crash in the test script with current
6.4.svn.

Which fix needs to be backported?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:2&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by mlennert):

I tried with the test script and can confirm the crash. Testing a bit
further the difference between the v.in.ascii and v.in.ogr imports, I saw
that the latter has many more attribute columns. When I reduce the number
of attribute columns, I get r.walk to work:

{{{
v.info -c stop_points_ogr
Displaying column types/names for database connection of layer 1:
INTEGER|cat
INTEGER|cat_point
INTEGER|cat_relocated
DOUBLE PRECISION|lon_relocated
DOUBLE PRECISION|lat_relocated
INTEGER|cat_
DOUBLE PRECISION|value
DOUBLE PRECISION|orig_lon
DOUBLE PRECISION|orig_lat
}}}

Running v.db.droptable before r.walk also makes r.walk work.

I do not have the time right now to check if this is linked to a specific
attribute or just the general number of attributes, but maybe someone else
can continue on that path ?

Backporting from grass7 seems quite invasive at this stage, unless the two
modules are self-contained.

Moritz

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:3&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by mlennert):

Additional info: AFAICT, the segfault occurs in a call to get() (memory.c)
within the insert() function (btree.c).

Moritz

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:4&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by glynn):

Replying to [comment:3 mlennert]:

> I do not have the time right now to check if this is linked to a
specific attribute or just the general number of attributes, but maybe
someone else can continue on that path ?

Could it be a problem with lib/sites?

The 6.x version of r.walk uses that library to obtain the start/stop
points from a vector map using the 5.x sites API. The 7.x version uses the
vector library directly.

In 7.x, the only clients of lib/sites are v.in.sites and v.vol.rst. The
former only uses the G_oldsite_* functions to access genuine 5.x "sites"
maps. The latter uses G_site_new_struct() to allocate a structure which is
never used.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:5&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by neteler):

Replying to [comment:5 glynn]:
...
> In 7.x, the only clients of lib/sites are v.in.sites and v.vol.rst. The
former only uses the G_oldsite_* functions to access genuine 5.x "sites"
maps. The latter uses G_site_new_struct() to allocate a structure which is
never used.

(Unrelated to this ticket but to above comment):
I have removed the lib/sites dependency of v.vol.rst in 7.x in r60272
(trunk) and r60273 (relbr7).

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:6&gt;
GRASS GIS <http://grass.osgeo.org>

GRASS GIS wrote:

I have removed the lib/sites dependency of v.vol.rst in 7.x in r60272
(trunk) and r60273 (relbr7).

That appears to leave v.in.sites as the only client of lib/sites. And
it only uses the "oldsite" functionality (i.e. reading GRASS 5.x
"sites" maps), as opposed to using the legacy sites API to access
vector maps):

grass=> SELECT i.symbol, i.object
grass-> FROM obj_exp e, obj_imp i
grass-> WHERE e.object LIKE 'lib/sites/%' AND e.symbol = i.symbol ;
        symbol | object
---------------------+-------------------------------------------------------
  G_site_new_struct | vector/v.in.sites/OBJ.x86_64-unknown-linux-gnu/main.o
  G_site_free_struct | vector/v.in.sites/OBJ.x86_64-unknown-linux-gnu/main.o
  G_oldsite_describe | vector/v.in.sites/OBJ.x86_64-unknown-linux-gnu/main.o
  G_oldsite_get | vector/v.in.sites/OBJ.x86_64-unknown-linux-gnu/main.o
  G_oldsites_open_old | vector/v.in.sites/OBJ.x86_64-unknown-linux-gnu/main.o
(5 rows)

Is there any reason to retain lib/sites as a separate library, rather
than simply merging it into v.in.sites? There isn't a ctypes wrapper
for it, so I'm reasonably sure it isn't used elsewhere in GRASS.

[Radim: does QGIS use it?]

--
Glynn Clements <glynn@gclements.plus.com>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by mlennert):

Replying to [comment:5 glynn]:
> Replying to [comment:3 mlennert]:
>
> > I do not have the time right now to check if this is linked to a
specific attribute or just the general number of attributes, but maybe
someone else can continue on that path ?
>
> Could it be a problem with lib/sites?

It would seem a likely candidate, but how would that influence what is
happening a long time after the closure of the input file ? The segfault
occurs in the routine finding the cost path in which, AFAICT, the
lib/sites doesn't play a role anymore.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:7&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by mmetz):

Replying to [comment:7 mlennert]:
> Replying to [comment:5 glynn]:
> > Replying to [comment:3 mlennert]:
> >
> > > I do not have the time right now to check if this is linked to a
specific attribute or just the general number of attributes, but maybe
someone else can continue on that path ?
> >
> > Could it be a problem with lib/sites?
>
> It would seem a likely candidate, but how would that influence what is
happening a long time after the closure of the input file ? The segfault
occurs in the routine finding the cost path in which, AFAICT, the
lib/sites doesn't play a role anymore.

This can happen if the sites lib corrupts some memory that is not used by
the sites lib itself. The segmentation fault can then appear anywhere else
later on. You need to use valgrind to check memory usage and search for
something like "invalid [read|write] ..." in the valgrind output, usually
rather at the beginning than at the end of the valgrind output.

Apart from that, the usage of the wrappers provided by the sites lib is
deprecated also in G6. I would suggest the replace the relevant code in
the affected modules rather than fixing the sites lib.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:8&gt;
GRASS GIS <http://grass.osgeo.org>

#1628: segfault in r.walk
----------------------------+-----------------------------------------------
Reporter: pertusus | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.4
Component: Raster | Version: svn-releasebranch64
Keywords: r.cost, r.walk | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------

Comment(by mlennert):

Replying to [comment:8 mmetz]:
> Replying to [comment:7 mlennert]:
> > Replying to [comment:5 glynn]:
> > > Replying to [comment:3 mlennert]:
> > >
> > > > I do not have the time right now to check if this is linked to a
specific attribute or just the general number of attributes, but maybe
someone else can continue on that path ?
> > >
> > > Could it be a problem with lib/sites?
> >
> > It would seem a likely candidate, but how would that influence what is
happening a long time after the closure of the input file ? The segfault
occurs in the routine finding the cost path in which, AFAICT, the
lib/sites doesn't play a role anymore.
>
> This can happen if the sites lib corrupts some memory that is not used
by the sites lib itself. The segmentation fault can then appear anywhere
else later on. You need to use valgrind to check memory usage and search
for something like "invalid [read|write] ..." in the valgrind output,
usually rather at the beginning than at the end of the valgrind output.

I won't be able to work on this before next week, so if someone else wants
to try...

>
> Apart from that, the usage of the wrappers provided by the sites lib is
deprecated also in G6. I would suggest the replace the relevant code in
the affected modules rather than fixing the sites lib.

This sounds fairly invasive. Is this still feasible for 6.4.4 ?

Moritz

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/1628#comment:9&gt;
GRASS GIS <http://grass.osgeo.org>

Sorry for not responding earlier, I missed somehow this mail.

On Sat, May 17, 2014 at 7:33 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Is there any reason to retain lib/sites as a separate library, rather
than simply merging it into v.in.sites? There isn't a ctypes wrapper
for it, so I'm reasonably sure it isn't used elsewhere in GRASS.

[Radim: does QGIS use it?]

No, QGIS does not use lib/sites.

I appreciate a lot that you asked me and I'll always do in such cases.
I probably missed the email because I was 'cc' only and not 'to'.

Thanks
Radim