[GRASS-dev] d.vect kills d.mon

Hi,

mysterious day: I used v.clean on a map to break/snap/rmdupl
and a topologically correct map is generated. But looking at
it, d.vect kills x0:

GRASS 6.3.cvs (spearfish60):~ > v.build myroads_net
Building topology ...
948 primitives registered
Building areas: 100%
0 areas built
0 isles built
Attaching islands:
Attaching centroids: 100%
Topology was built.
Number of nodes : 683
Number of primitives: 948
Number of points : 2
Number of lines : 946
Number of boundaries: 0
Number of centroids : 0
Number of areas : 0
Number of isles : 0

GRASS 6.3.cvs (spearfish60):~ > d.mon x0
using default visual which is TrueColor
ncolors: 16777216
Graphics driver [x0] started

GRASS 6.3.cvs (spearfish60):~ > d.vect myroads_net col=red
ERROR eof from graphics driver.

I did "make distclean; ..." before.

strace d.vect myroads_net col=red
...
_llseek(6, 147456, [147456], SEEK_SET) = 0
_llseek(6, 147456, [147456], SEEK_SET) = 0
_llseek(6, 147456, [147456], SEEK_SET) = 0
_llseek(6, 147456, [147456], SEEK_SET) = 0
read(6, "\252\327\353\307\353>\"A\274\222\250\200\301\311RA\v\2"..., 4096) = 139
_llseek(6, 147595, [147595], SEEK_SET) = 0
_llseek(6, 147595, [147595], SEEK_SET) = 0
_llseek(6, 147595, [147595], SEEK_SET) = 0
rt_sigaction(SIGINT, {SIG_IGN}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_DFL}, 8) = 0
write(4, "\0\0\177\6\377\0\0\177\6\377\0\0\177\6\377\0\0\177\6\377"..., 1484) = 1484
read(5, "", 1) = 0
write(2, "ERROR eof from graphics driver.\n", 32ERROR eof from graphics driver.
) = 32
exit_group(1) = ?
Process 27067 detached

Not sure where "ERROR eof from graphics driver" originates from, probably
lib/raster/rem_io.c?

Markus

Markus Neteler wrote:

> mysterious day: I used v.clean on a map to break/snap/rmdupl
> and a topologically correct map is generated. But looking at
> it, d.vect kills x0:

Hamish made some changes to icon plotting in d.vect yesterday, so
those would be the prime suspect:

  RCS file: /grassrepository/grass6/display/d.vect/plot1.c,v
  Working file: plot1.c
  head: 1.32
  branch:
  locks: strict
  access list:
  keyword substitution: kv
  total revisions: 33; selected revisions: 3
  description:
  ----------------------------
  revision 1.32
  date: 2007/05/02 10:42:34; author: hamish; state: Exp; lines: +2 -1
  centroids always use default color to stand out from underlying area
  ----------------------------
  revision 1.31
  date: 2007/05/02 10:04:39; author: hamish; state: Exp; lines: +68 -125
  bugfix: wasn't calculating new x,y for icon if colors were off (potentially nasty)
  simplification: use D_symbol() to plot symbols
  speed: only plot symbols which are in the graphics frame (massive speedup for LIDAR)
  readability: change "rgb" variable name and set it using boolean values
  ----------------------------
  revision 1.20.4.1
  date: 2007/05/02 10:13:51; author: hamish; state: Exp; lines: +4 -3
  bugfix: wasn't calculating new x,y for icon if colors were offr
     (potentially nasty, probably never triggered)

> strace d.vect myroads_net col=red
> ...
> _llseek(6, 147456, [147456], SEEK_SET) = 0
> _llseek(6, 147456, [147456], SEEK_SET) = 0
> _llseek(6, 147456, [147456], SEEK_SET) = 0
> _llseek(6, 147456, [147456], SEEK_SET) = 0
> read(6, "\252\327\353\307\353>\"A\274\222\250\200\301\311RA\v\2"..., 4096) = 139
> _llseek(6, 147595, [147595], SEEK_SET) = 0
> _llseek(6, 147595, [147595], SEEK_SET) = 0
> _llseek(6, 147595, [147595], SEEK_SET) = 0
> rt_sigaction(SIGINT, {SIG_IGN}, {SIG_DFL}, 8) = 0
> rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_DFL}, 8) = 0
> write(4, "\0\0\177\6\377\0\0\177\6\377\0\0\177\6\377\0\0\177\6\377"..., 1484) = 1484
> read(5, "", 1) = 0
> write(2, "ERROR eof from graphics driver.\n", 32ERROR eof from graphics driver.
> ) = 32
> exit_group(1) = ?
> Process 27067 detached
>
> Not sure where "ERROR eof from graphics driver" originates from, probably
> lib/raster/rem_io.c?

Yes.
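
For illustration only, here is a minimal C sketch of how a client ends up
printing this message. It is not the code in lib/raster/rem_io.c; the names
read_reply_byte and simulate_driver are hypothetical. It only shows the
mechanism visible in the strace output: read() on the driver pipe returns
0 bytes because the other end has gone away, and the client reports EOF.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read one reply byte from the driver's pipe.
 * Returns 0 on success, -1 on EOF or on a read error. */
static int read_reply_byte(int fd, unsigned char *out)
{
    ssize_t n;

    assert(out != NULL);
    n = read(fd, out, 1);
    if (n == 1)
        return 0;
    if (n == 0)     /* the writer closed its end: the driver is gone */
        fprintf(stderr, "ERROR eof from graphics driver.\n");
    else
        perror("read");
    return -1;
}

/* Hypothetical stand-in for a driver: it either sends one byte and
 * exits, or exits without replying (as a crashed driver would).
 * Returns what the client sees, or -2 if the pipe could not be set up. */
static int simulate_driver(int driver_replies)
{
    int fds[2];
    unsigned char byte = 0;
    int rc;

    if (pipe(fds) != 0)
        return -2;
    if (driver_replies) {
        unsigned char ok = 1;

        if (write(fds[1], &ok, 1) != 1) {
            close(fds[0]);
            close(fds[1]);
            return -2;
        }
    }
    close(fds[1]);                      /* the "driver" terminates */
    rc = read_reply_byte(fds[0], &byte);
    close(fds[0]);
    return rc;
}
```

Calling simulate_driver(0) from a main() follows the same path as the
strace above: the read returns 0 and the error is printed.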

Debugging display commands is easier using direct rendering, as the
task isn't split between multiple processes (the client and the
display driver). E.g.:

  $ GRASS_RENDER_IMMEDIATE=TRUE gdb d.vect
  > run myroads_net col=red

--
Glynn Clements <glynn@gclements.plus.com>

On Thu, May 03, 2007 at 01:22:01AM +0100, Glynn Clements wrote:

> Markus Neteler wrote:
>
> > mysterious day: I used v.clean on a map to break/snap/rmdupl
> > and a topologically correct map is generated. But looking at
> > it, d.vect kills x0:
>
> Hamish made some changes to icon plotting in d.vect yesterday, so
> those would be the prime suspect:
>
> ...
>
> > Not sure where "ERROR eof from graphics driver" originates from, probably
> > lib/raster/rem_io.c?
>
> Yes.
>
> Debugging display commands is easier using direct rendering, as the
> task isn't split between multiple processes (the client and the
> display driver). E.g.:
>
>   $ GRASS_RENDER_IMMEDIATE=TRUE gdb d.vect
>   > run myroads_net col=red

Ah, nice. So:

GRASS 6.3.cvs (spearfish60):~/grass63/scripts > d.mon x0

GRASS 6.3.cvs (spearfish60):~/grass63/scripts > GRASS_RENDER_IMMEDIATE=TRUE gdb d.vect
GNU gdb 6.3-8mdv2007.0 (Mandriva Linux release 2007.0)
...
(gdb) run myroads_net col=red
Starting program: /home/neteler/soft/63grass_cvsexp/dist.i686-pc-linux-gnu/bin/d.vect myroads_net col=red
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xbfffe000
[Thread debugging using libthread_db enabled]
[New Thread -1225832752 (LWP 14496)]
PNG: GRASS_TRUECOLOR status: FALSE

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1225832752 (LWP 14496)]
0xb77e42d8 in fputs () from /lib/i686/libc.so.6
(gdb) bt
#0 0xb77e42d8 in fputs () from /lib/i686/libc.so.6
#1 0xb7ec37a0 in print_error (
    msg=0xbfde35a0 "PNG: collecting to map.png,\n GRASS_WIDTH=640, GRASS_HEIGHT=480", type=0)
    at error.c:276
#2 0xb7ec3ca5 in G_message (
    msg=0xb7e637c8 "PNG: collecting to %s,\n GRASS_WIDTH=%d, GRASS_HEIGHT=%d") at error.c:112
#3 0xb7e61c62 in PNG_Graph_set (argc=0, argv=0x0) at Graph_set.c:133
#4 0xb7e57351 in COM_Graph_set (argc=0, argv=0x0) at Graph.c:7
#5 0xb7e5869e in LIB_init (drv=0xb7e64f40, argc=0, argv=0x0) at init.c:79
#6 0xb7e6ad92 in LOC_open_driver () at loc_io.c:67
#7 0xb7e6a014 in R_open_driver () at com_io.c:180
#8 0x0804d7f5 in main (argc=3, argv=0xbfde6774) at main.c:375

(gdb) bt full
#0 0xb77e42d8 in fputs () from /lib/i686/libc.so.6
No symbol table info available.
#1 0xb7ec37a0 in print_error (
    msg=0xbfde35a0 "PNG: collecting to map.png,\n GRASS_WIDTH=640, GRASS_HEIGHT=480", type=0)
    at error.c:276
        w = Variable "w" is not available.

Hopefully this indicates the problem,
Markus

Markus Neteler wrote:

> GRASS 6.3.cvs (spearfish60):~/grass63/scripts > GRASS_RENDER_IMMEDIATE=TRUE gdb d.vect
> GNU gdb 6.3-8mdv2007.0 (Mandriva Linux release 2007.0)
> ...
> (gdb) run myroads_net col=red
> Starting program: /home/neteler/soft/63grass_cvsexp/dist.i686-pc-linux-gnu/bin/d.vect myroads_net col=red
> Reading symbols from shared object read from target memory...done.
> Loaded system supplied DSO at 0xbfffe000
> [Thread debugging using libthread_db enabled]
> [New Thread -1225832752 (LWP 14496)]
> PNG: GRASS_TRUECOLOR status: FALSE
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1225832752 (LWP 14496)]
> 0xb77e42d8 in fputs () from /lib/i686/libc.so.6
> (gdb) bt
> #0 0xb77e42d8 in fputs () from /lib/i686/libc.so.6
> #1 0xb7ec37a0 in print_error (
>     msg=0xbfde35a0 "PNG: collecting to map.png,\n GRASS_WIDTH=640, GRASS_HEIGHT=480", type=0)
>     at error.c:276
> #2 0xb7ec3ca5 in G_message (
>     msg=0xb7e637c8 "PNG: collecting to %s,\n GRASS_WIDTH=%d, GRASS_HEIGHT=%d") at error.c:112
> #3 0xb7e61c62 in PNG_Graph_set (argc=0, argv=0x0) at Graph_set.c:133
> #4 0xb7e57351 in COM_Graph_set (argc=0, argv=0x0) at Graph.c:7
> #5 0xb7e5869e in LIB_init (drv=0xb7e64f40, argc=0, argv=0x0) at init.c:79
> #6 0xb7e6ad92 in LOC_open_driver () at loc_io.c:67
> #7 0xb7e6a014 in R_open_driver () at com_io.c:180
> #8 0x0804d7f5 in main (argc=3, argv=0xbfde6774) at main.c:375

> #2 0xb7ec3ca5 in G_message (
>     msg=0xb7e637c8 "PNG: collecting to %s,\n GRASS_WIDTH=%d, GRASS_HEIGHT=%d") at error.c:112

Huh? Where did the "file:" go?

> #3 0xb7e61c62 in PNG_Graph_set (argc=0, argv=0x0) at Graph_set.c:133

133 G_message("PNG: collecting to file: %s,\n GRASS_WIDTH=%d, GRASS_HEIGHT=%d",
134 file_name, width, height);

It may be that this is just a problem with the way that you copied the
output into the message (in which case, use "script" and a text editor
and attach the output; if you're posting diagnostic output, it needs
to be *exact*).

Apart from that:

> #0 0xb77e42d8 in fputs () from /lib/i686/libc.so.6
> #1 0xb7ec37a0 in print_error (
>     msg=0xbfde35a0 "PNG: collecting to map.png,\n GRASS_WIDTH=640, GRASS_HEIGHT=480", type=0)
>     at error.c:276

276 fprintf(stderr,"%s", prefix_std[type] );

I can't think of anything except for memory corruption. But:

> #3 0xb7e61c62 in PNG_Graph_set (argc=0, argv=0x0) at Graph_set.c:133
> #4 0xb7e57351 in COM_Graph_set (argc=0, argv=0x0) at Graph.c:7
> #5 0xb7e5869e in LIB_init (drv=0xb7e64f40, argc=0, argv=0x0) at init.c:79
> #6 0xb7e6ad92 in LOC_open_driver () at loc_io.c:67
> #7 0xb7e6a014 in R_open_driver () at com_io.c:180
> #8 0x0804d7f5 in main (argc=3, argv=0xbfde6774) at main.c:375

375 if (R_open_driver() != 0)

This is still really early, not long after G_parser() has returned. I
can't see how anything in the actual d.vect code could cause this.
And, if it's a problem with the display architecture, I would expect
it to affect a lot more than just d.vect.

--
Glynn Clements <glynn@gclements.plus.com>

On Thu, May 03, 2007 at 09:09:03AM +0100, Glynn Clements wrote:

Markus Neteler wrote:

> GRASS 6.3.cvs (spearfish60):~/grass63/scripts > GRASS_RENDER_IMMEDIATE=TRUE gdb d.vect
> GNU gdb 6.3-8mdv2007.0 (Mandriva Linux release 2007.0)
> ...
> (gdb) run myroads_net col=red

Right, copy-paste fails here.
Using "script" now, see attached dvect_debug.txt.

Markus

(attachments)

dvect_debug.txt (2.83 KB)

Glynn Clements wrote:

> GRASS 6.3.cvs (spearfish60):~/grass63/scripts > GRASS_RENDER_IMMEDIATE=TRUE gdb d.vect

> PNG: GRASS_TRUECOLOR status: FALSE

I missed this at first. After setting GRASS_TRUECOLOR=FALSE, I managed
to reproduce this and find the problem.

> I can't think of anything except for memory corruption. But:

> #3 0xb7e61c62 in PNG_Graph_set (argc=0, argv=0x0) at Graph_set.c:133
> #4 0xb7e57351 in COM_Graph_set (argc=0, argv=0x0) at Graph.c:7
> #5 0xb7e5869e in LIB_init (drv=0xb7e64f40, argc=0, argv=0x0) at init.c:79
> #6 0xb7e6ad92 in LOC_open_driver () at loc_io.c:67
> #7 0xb7e6a014 in R_open_driver () at com_io.c:180
> #8 0x0804d7f5 in main (argc=3, argv=0xbfde6774) at main.c:375

> 375 if (R_open_driver() != 0)
>
> This is still really early, not long after G_parser() has returned. I
> can't see how anything in the actual d.vect code could cause this.
> And, if it's a problem with the display architecture, I would expect
> it to affect a lot more than just d.vect.

It's due to a combination of factors:

1. Both libpngdriver and d.vect define variables named "palette".
2. GRASS_TRUECOLOR=FALSE initialises the wrong "palette" variable
(libpngdriver's is 256*4 bytes, d.vect's is only 16*4 bytes).
3. stdio structures reside shortly after d.vect's palette variable,
so when libpngdriver overflows it, stdio variables get corrupted.

IOW, it's only luck that it hasn't happened before.
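
The three factors can be imitated in a single C file without any real
symbol clash. This is a rough simulation, not the libpngdriver or d.vect
code: one byte array stands in for the data segment, with a 16*4-byte table
at its start and unrelated state (think of the stdio structures) right
after it. library_init_palette is a hypothetical initialiser that assumes
it owns 256*4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    SMALL_ENTRIES = 16,   /* table size in the executable, per the list above */
    LARGE_ENTRIES = 256,  /* table size the library assumes */
    ENTRY_BYTES = 4
};

/* Hypothetical stand-in for the data segment: the small table occupies
 * the first 64 bytes; everything after it is "someone else's" data. */
struct fake_data_segment {
    unsigned char bytes[LARGE_ENTRIES * ENTRY_BYTES + 64];
};

/* Offset at which the neighbouring data begins. */
static size_t neighbour_offset(void)
{
    return (size_t)SMALL_ENTRIES * ENTRY_BYTES;
}

/* Hypothetical library initialiser: fills all 256*4 bytes at `table`. */
static void library_init_palette(unsigned char *table)
{
    memset(table, 0xFF, (size_t)LARGE_ENTRIES * ENTRY_BYTES);
}

/* Runs the initialiser against the small table and returns how many
 * bytes beyond that table were overwritten. */
static size_t simulate_clash(struct fake_data_segment *seg)
{
    size_t i, clobbered = 0;

    /* The backing array is large enough, so the simulation itself
     * never writes out of bounds. */
    assert(sizeof seg->bytes >= (size_t)LARGE_ENTRIES * ENTRY_BYTES);

    memset(seg->bytes, 0, sizeof seg->bytes);
    library_init_palette(seg->bytes);   /* resolved to the "wrong" table */

    for (i = neighbour_offset(); i < sizeof seg->bytes; i++)
        if (seg->bytes[i] != 0)
            clobbered++;
    return clobbered;
}
```

Here simulate_clash() reports 960 overwritten bytes: the 1024 written
minus the 64 that actually belong to the small table.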

The crash disappears if the library is built with -Wl,-Bsymbolic,
which causes libraries to bind to their own variable definitions
rather than allowing them to be overridden by the executable.

That should probably be the default. Relying upon libraries' global
variables not conflicting with those of an executable is bound to be
unreliable.

I can make that change to the Linux section of SC_CONFIG_CFLAGS in
aclocal.m4. It's already the default on Windows (you have to go to
some trouble to export symbols from the executable into a DLL). That
just leaves the sections for MacOSX, plus a couple of dozen platforms
which we pretend to support but don't really (i.e. all of the
commercial Unices).

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

> The crash disappears if the library is built with -Wl,-Bsymbolic,
> which causes libraries to bind to their own variable definitions
> rather than allowing them to be overridden by the executable.
>
> That should probably be the default. Relying upon libraries' global
> variables not conflicting with those of an executable is bound to be
> unreliable.

Unfortunately, that method appears to prevent executables from
referencing variables which are defined in a library. This is most
noticeable with XDRIVER, which uses several variables which are defined
in libdriver. If libdriver is built with -Bsymbolic, XDRIVER ends up
getting its own copies, which doesn't work.

For now, I'm just going to rename the "palette" variable in
libpngdriver, so that d.vect works.

Ultimately, we need to avoid relying upon variables exported from
libraries. Doing so is quite fragile, depending upon platform and
linker issues.
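
One possible shape for that, sketched under assumptions: keep the table
static inside the library and export only functions that read and write
it. The names demo_palette_set and demo_palette_get are made up for this
example and are not part of libpngdriver.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PALETTE_ENTRIES 256

/* File-scope static: internal linkage, so the table is never exported
 * and a same-named variable in an executable cannot take its place. */
static unsigned int demo_palette[DEMO_PALETTE_ENTRIES];

/* Hypothetical setter: returns 0 on success, -1 if index is out of range. */
int demo_palette_set(size_t index, unsigned int rgba)
{
    if (index >= DEMO_PALETTE_ENTRIES)
        return -1;
    demo_palette[index] = rgba;
    return 0;
}

/* Hypothetical getter: returns 0 on success, -1 if index is out of range. */
int demo_palette_get(size_t index, unsigned int *rgba)
{
    assert(rgba != NULL);   /* passing NULL is a caller bug */
    if (index >= DEMO_PALETTE_ENTRIES)
        return -1;
    *rgba = demo_palette[index];
    return 0;
}
```

Function names can still collide between a library and an executable, so
a library-specific prefix remains advisable, but the size of the table is
then known in one place only.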

--
Glynn Clements <glynn@gclements.plus.com>