[GRASS5] Tcl8.4 support?

> > The CVS HEAD version of NVIZ still doesn't work on Debian Testing
> > with tcl8.4.

...

> > (gets stuck on "exec")

...

> could you add some "strace" into etc/nviz2.2/scripts/nviz2.2_script
> to debug the binary problem and send the outcome (at least the
> crashing part)?
>
> Maybe we get an idea what's wrong.

With this build, instead of hanging at the first "exec" in nviz2.2_script,
it segfaults again.

The tcl8.3-dev & tk8.3-dev packages have been removed from the system;
GRASS was compiled with --with-tcltk-includes=/usr/include/tcl8.4

...
Hamish

without strace:
------------------
GRASS:~ > nviz i2

Version: @(#) 5.0.3 (August 2003)

Authors: Bill Brown, Terry Baker, Mark Astley, David Gerdes
  modifications: Jaro Hofierka, Bob Covill

Please cite one or more of the following references in publications
where the results of this program were used:
Brown, W.M., Astley, M., Baker, T., Mitasova, H. (1995).
GRASS as an Integrated GIS and Visualization System for
Spatio-Temporal Modeling, Proceedings of Auto Carto 12, Charlotte, N.C.

Mitasova, H., W.M. Brown, J. Hofierka, 1994, Multidimensional
dynamic cartography. Kartograficke listy, 2, p. 37-50.

Mitas L., Brown W. M., Mitasova H., 1997, Role of dynamic
cartography in simulations of landscape processes based on multi-variate
fields. Computers and Geosciences, Vol. 23, No. 4, pp. 437-446

http://www2.gis.uiuc.edu:2280/modviz/viz/nviz.html

The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
Update elev null mask
building color table
child killed: segmentation violation
    while executing
"exec /usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/etc/nviz2.2/NVWISH2.2 -f /usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/..."
    ("eval" body line 1)
    invoked from within
"eval exec $env(GISBASE)/etc/nviz2.2/NVWISH2.2 -f $env(GISBASE)/etc/nviz2.2/scripts/nviz2.2_script $argv -name NVIZ >&@stdout"
    invoked from within
"if {$argv == ""} {
#no arguments
eval exec $env(GISBASE)/etc/nviz2.2/NVWISH2.2 -f $env(GISBASE)/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ >&@stdo..."
    (file "/usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/bin/nviz" line 16)

with strace:
---------------------

-- Here is the end of the log, the full log is at
http://bambi.otago.ac.nz/hamish/grass/nviz_tcl8.4.txt

GRASS:~ > strace nviz i2 > nviz_tcl8.4_crash.txt 2>&1

[...]
Version: @(#) 5.0.3 (August 2003)

Authors: Bill Brown, Terry Baker, Mark Astley, David Gerdes
        modifications: Jaro Hofierka, Bob Covill

Please cite one or more of the following references in publications
where the results of this program were used:
Brown, W.M., Astley, M., Baker, T., Mitasova, H. (1995).
GRASS as an Integrated GIS and Visualization System for
Spatio-Temporal Modeling, Proceedings of Auto Carto 12, Charlotte, N.C.

Mitasova, H., W.M. Brown, J. Hofierka, 1994, Multidimensional
dynamic cartography. Kartograficke listy, 2, p. 37-50.

Mitas L., Brown W. M., Mitasova H., 1997, Role of dynamic
cartography in simulations of landscape processes based on multi-variate
fields. Computers and Geosciences, Vol. 23, No. 4, pp. 437-446

http://www2.gis.uiuc.edu:2280/modviz/viz/nviz.html

The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
Update elev null mask
building color table
[WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV], 0, NULL) = 5475
--- SIGCHLD (Child exited) @ 0 (0) ---
brk(0) = 0x8085000
brk(0x8089000) = 0x8089000
write(2, "child killed: segmentation viola"..., 658child killed: segmentation violation
    while executing
"exec /usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/etc/nviz2.2/NVWISH2.2 -f /usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/..."
    ("eval" body line 1)
    invoked from within
"eval exec $env(GISBASE)/etc/nviz2.2/NVWISH2.2 -f $env(GISBASE)/etc/nviz2.2/scripts/nviz2.2_script $argv -name NVIZ >&@stdout"
    invoked from within
"if {$argv == ""} {
#no arguments
eval exec $env(GISBASE)/etc/nviz2.2/NVWISH2.2 -f $env(GISBASE)/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ >&@stdo..."
    (file "/usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/bin/nviz" line 16)) = 658
write(2, "\n", 1
) = 1
fcntl64(2, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fcntl64(2, F_SETFL, O_WRONLY|O_LARGEFILE) = 0
fcntl64(2, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fcntl64(1, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fcntl64(1, F_SETFL, O_WRONLY|O_LARGEFILE) = 0
fcntl64(1, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fcntl64(0, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(0, F_SETFL, O_RDWR) = 0
fcntl64(0, F_GETFL) = 0x2 (flags O_RDWR)
write(6, "q", 1) = 1
close(6) = 0
kill(5471, SIGRTMIN) = 0
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\7\0\0\0)\346\f@@\362\377\277pu\f@`\"\f@\7\335"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\7\0\0\0)\346\f@@\362\377\277pu\f@`\"\f@\7\335"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\7\0\0\0)\346\f@@\362\377\277pu\f@`\"\f@\7\335"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\7\0\0\0)\346\f@@\362\377\277pu\f@`\"\f@\7\335"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\7\0\0\0)\346\f@@\362\377\277pu\f@`\"\f@\7\335"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\7\0\0\0)\346\f@@\362\377\277pu\f@`\"\f@\7\335"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
write(4, "\200 \r@\2\0\0\0\1\0\0\0j\352\25@8\362\377\277\340\222"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend( <unfinished ...>
--- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
wait4(5470, NULL, __WCLONE, NULL) = 5470
exit_group(1) = ?

Hamish wrote:

> > > The CVS HEAD version of NVIZ still doesn't work on Debian Testing
> > > with tcl8.4.
> ...
> > > (gets stuck on "exec")
> ...
> > could you add some "strace" into etc/nviz2.2/scripts/nviz2.2_script
> > to debug the binary problem and send the outcome (at least the
> > crashing part)?
> >
> > Maybe we get an idea what's wrong.
>
> This compile: instead of stopping on the first "exec" in nviz2.2_script,
> it segfaults again.
>
> with strace:

You would need to use "strace -f ..." to trace child processes. Only
tracing the top-level process won't tell you very much.
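
A minimal sketch of such an invocation, wrapped in a small helper. It
assumes a strace that supports -f (follow child processes) and -o (write
the trace to a file). The function name trace_with_children, the STRACE
override variable and the log file name are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: run a command under strace, following its children.
#   -f       also trace processes created by fork()/clone()
#   -o FILE  write the trace to FILE instead of stderr
# STRACE is an illustrative override for the strace binary; it is not a
# GRASS or strace convention.
trace_with_children() {
    log=$1
    shift
    "${STRACE:-strace}" -f -o "$log" "$@"
}

# Example, inside a GRASS session:
#   trace_with_children nviz_tcl8.4.strace nviz i2
```

Writing the trace to its own file keeps it separate from the program's
own stdout/stderr, which the "> file 2>&1" redirection mixes together.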

--
Glynn Clements <glynn.clements@virgin.net>

> > > > The CVS HEAD version of NVIZ still doesn't work on Debian Testing
> > > > with tcl8.4.
> > ...
> > > > (gets stuck on "exec")
> > ...
> > > could you add some "strace" into etc/nviz2.2/scripts/nviz2.2_script
> > > to debug the binary problem and send the outcome (at least the
> > > crashing part)?
>
> ...
>
> You would need to use "strace -f ..." to trace child processes. Only
> tracing the top-level process won't tell you very much.

Ok, here is the full log:
http://bambi.otago.ac.nz/hamish/grass/strace-f_nviz_tcl8.4.txt.gz

Here's the bit with the SegFault: old_mmap() / munmap() ?
(...)
[pid 9394] fcntl64(10, F_GETFL) = 0 (flags O_RDONLY)
[pid 9394] fstat64(10, {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
[pid 9394] old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000
[pid 9394] _llseek(10, 0, [0], SEEK_CUR) = 0
[pid 9394] read(10, "11 12\n", 4096) = 6
[pid 9394] close(10) = 0
[pid 9394] munmap(0x40015000, 4096) = 0
[pid 9394] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 9388 resumed
Process 9394 detached
[pid 9388] <... wait4 resumed> [WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV], 0, NULL) = 9394
[pid 9388] --- SIGCHLD (Child exited) @ 0 (0) ---
[pid 9388] brk(0) = 0x8085000
[pid 9388] brk(0x8089000) = 0x8089000
[pid 9388] write(2, "child killed: segmentation viola"..., 658child killed: segmentation violation
    while executing
"exec /usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/etc/nviz2.2/NVWISH2.2 -f /usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/..."
    ("eval" body line 1)
    invoked from within
"eval exec $env(GISBASE)/etc/nviz2.2/NVWISH2.2 -f $env(GISBASE)/etc/nviz2.2/scripts/nviz2.2_script $argv -name NVIZ >&@stdout"
    invoked from within
"if {$argv == ""} {
#no arguments
eval exec $env(GISBASE)/etc/nviz2.2/NVWISH2.2 -f $env(GISBASE)/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ >&@stdo..."
    (file "/usr/src/grass5source/grass-5.0.3rc2/dist.i686-pc-linux-gnu/bin/nviz" line 16)) = 658
[pid 9388] write(2, "\n", 1
) = 1
[pid 9388] fcntl64(2, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
[pid 9388] fcntl64(2, F_SETFL, O_WRONLY|O_LARGEFILE) = 0
[pid 9388] fcntl64(2, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
[pid 9388] fcntl64(1, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
[pid 9388] fcntl64(1, F_SETFL, O_WRONLY|O_LARGEFILE) = 0
[pid 9388] fcntl64(1, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
[pid 9388] fcntl64(0, F_GETFL) = 0x2 (flags O_RDWR)
[pid 9388] fcntl64(0, F_SETFL, O_RDWR) = 0
[pid 9388] fcntl64(0, F_GETFL) = 0x2 (flags O_RDWR)
[pid 9388] write(6, "q", 1 <unfinished ...>
[pid 9390] <... select resumed> ) = 1 (in [5])
[pid 9390] rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
[pid 9390] rt_sigsuspend( <unfinished ...>
[pid 9388] <... write resumed> ) = 1
[pid 9388] close(6) = 0
[pid 9388] kill(9390, SIGRTMIN <unfinished ...>
[pid 9390] --- SIGRTMIN (Real-time signal 0) @ 0 (0) ---
(...)

Hamish wrote:

> > > > > The CVS HEAD version of NVIZ still doesn't work on Debian Testing
> > > > > with tcl8.4.
> > > ...
> > > > > (gets stuck on "exec")
> > > ...
> > > > could you add some "strace" into etc/nviz2.2/scripts/nviz2.2_script
> > > > to debug the binary problem and send the outcome (at least the
> > > > crashing part)?
> ...
> > You would need to use "strace -f ..." to trace child processes. Only
> > tracing the top-level process won't tell you very much.
>
> Ok, here is the full log:
> http://bambi.otago.ac.nz/hamish/grass/strace-f_nviz_tcl8.4.txt.gz

In retrospect, even tracing the child process won't normally tell you
much about what caused a segfault. Sometimes it may provide a clue,
e.g. if a system call failed immediately before the segfault, that's
often an indication that the code doesn't allow for the call failing.
But in most cases, the segfault is unrelated to system calls.
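
One way to check a trace for that kind of clue is sketched below. This
is only an illustration: the helper name failed_calls_before_segv and
the default of 5 context lines are arbitrary choices, and it assumes a
grep that supports -B (lines of leading context), as GNU grep does.

```sh
#!/bin/sh
# Hypothetical helper: list system calls that failed (returned -1) within
# the few trace lines leading up to each SIGSEGV in a strace log.
# $1 = strace log file, $2 = number of context lines (default 5).
failed_calls_before_segv() {
    log=$1
    context=${2:-5}
    # First grep keeps each SIGSEGV line plus the lines just before it;
    # second grep keeps only the calls whose return value was -1.
    grep -B "$context" -e '--- SIGSEGV' "$log" | grep -e ' = -1 '
}

# Example:
#   failed_calls_before_segv strace-f_nviz_tcl8.4.txt
```

In a "strace -f" log the lines of several processes are interleaved, so
the context may include calls from other pids. No output (and a
non-zero exit status) just means no failed call was found nearby.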

If the process generated a core file, use a debugger to examine it;
otherwise, you would have to run NVIZ under a debugger.
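
For the core-file case, a rough sketch follows. It assumes a gdb recent
enough to accept the -batch and -ex options (older releases may need an
interactive session instead). The helper name backtrace_from_core, the
GDB override variable and the core file name "core" are illustrative
assumptions.

```sh
#!/bin/sh
# Hypothetical helper: print a backtrace from a core file without an
# interactive session.
# GDB is an illustrative override for the debugger binary.
backtrace_from_core() {
    binary=$1   # the executable that crashed, e.g. .../etc/nviz2.2/NVWISH2.2
    core=$2     # the core file it left behind (name and location vary)
    "${GDB:-gdb}" -batch -ex bt "$binary" "$core"
}

# Example, assuming core dumps were enabled beforehand with
# "ulimit -c unlimited" and the crash left a file named "core" in the
# current directory:
#   backtrace_from_core "$GISBASE/etc/nviz2.2/NVWISH2.2" core
```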

AFAICT from the error message, it's the core NVIZ process which
crashed; i.e. not the "nviz" script, but the NVWISH2.2 binary
interpreting nviz2.2_script. To debug that, you would need to emulate
the "nviz" script by setting up the various environment variables,
then running NVWISH2.2 under gdb, e.g.:

  export GISDBASE=`g.gisenv get=GISDBASE`
  export LOCATION_NAME=`g.gisenv get=LOCATION_NAME`
  export MAPSET=`g.gisenv get=MAPSET`
  gdb /usr/local/grass5/etc/nviz2.2/NVWISH2.2
  (gdb) run -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ

> Here's the bit with the SegFault: old_mmap() / munmap() ?

old_mmap(..., MAP_PRIVATE|MAP_ANONYMOUS, ...) is one of the ways in
which malloc() gets more memory from the system. A call to munmap()
with the address returned from old_mmap() corresponds to that memory
having been freed. I doubt that this has anything to do with the
segfault; more likely, the munmap() call just happened to be the last
thing before the crash.

--
Glynn Clements <glynn.clements@virgin.net>

> > > > > > The CVS HEAD version of NVIZ still doesn't work on Debian
> > > > > > Testing with tcl8.4.
> > > > ...
> > > > > > (gets stuck on "exec")
> > > > ...
> > > > > could you add some "strace" into
> > > > > etc/nviz2.2/scripts/nviz2.2_script to debug the binary problem
> > > > > and send the outcome (at least the crashing part)?
> > ...
> > > You would need to use "strace -f ..." to trace child processes.
> > > Only tracing the top-level process won't tell you very much.
> >
> > Ok, here is the full log:
> > http://bambi.otago.ac.nz/hamish/grass/strace-f_nviz_tcl8.4.txt.gz
>
> In retrospect, even tracing the child process won't normally tell you
> much about what caused a segfault. Sometimes it may provide a clue,
> e.g. if a system call failed immediately before the segfault, that's
> often an indication that the code doesn't allow for the call failing.
> But in most cases, the segfault is unrelated to system calls.
>
> If the process generated a core file, use a debugger to examine it;
> otherwise, you would have to run NVIZ under a debugger.
>
> AFAICT from the error message, it's the core NVIZ process which
> crashed; i.e. not the "nviz" script, but the NVWISH2.2 binary
> interpreting nviz2.2_script. To debug that, you would need to emulate
> the "nviz" script by setting up the various environment variables,
> then running NVWISH2.2 under gdb, e.g.:
>
>   export GISDBASE=`g.gisenv get=GISDBASE`
>   export LOCATION_NAME=`g.gisenv get=LOCATION_NAME`
>   export MAPSET=`g.gisenv get=MAPSET`
>   gdb /usr/local/grass5/etc/nviz2.2/NVWISH2.2
>   > run -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script
>   > -name NVIZ

Ok, copy that.

I'm new to gdb, bear with me.
   -How to turn on debugging symbols?
   -How to set up symbol tables info? [(gdb) bt f]
      "No symbol table info available."

Further instructions?

Seeing the libc.so.6 makes me ask: can anyone out there running
Debian 3.0(Woody) compile 5.0.3rc with tcl8.4-dev & tk8.4-dev
and see if NVIZ works? Is this a Debian bug?
(I'm running Debian/Testing(Sarge), grass-5.0.3rc2)

note NVIZ is printing "Version: @(#) 5.0.2 (April 2003)" before
the copyright notice..?

Hamish

---------------------------------------------------------------------

GRASS:~ > g.version
GRASS 5.0.3 (August 2003)

GRASS:~ > export GISDBASE=`g.gisenv get=GISDBASE`
GRASS:~ > export LOCATION_NAME=`g.gisenv get=LOCATION_NAME`
GRASS:~ > export MAPSET=`g.gisenv get=MAPSET`
GRASS:~ > gdb /usr/local/grass5/etc/nviz2.2/NVWISH2.2
GNU gdb 5.3-debian
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux"...(no debugging symbols found)...
(gdb) run -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ
Starting program: /usr/local/grass5/etc/nviz2.2/NVWISH2.2 -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...[New Thread 16384 (LWP 5725)]
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...[New Thread 32769 (LWP 5726)]
[New Thread 16386 (LWP 5727)]
(no debugging symbols found)...

OPTION: Raster file(s) for Elevation
     key: elevation
required: NO
multiple: YES

Enter the name of an existing Raster file
Enter 'list' for a list of existing Raster files
Hit RETURN to cancel request

i2

<i2>

Enter the name of an existing Raster file
[...]
Version: @(#) 5.0.2 (April 2003)

Authors: Bill Brown, Terry Baker, Mark Astley, David Gerdes
  modifications: Jaro Hofierka, Bob Covill

Please cite one or more of the following references in publications
where the results of this program were used:
Brown, W.M., Astley, M., Baker, T., Mitasova, H. (1995).
GRASS as an Integrated GIS and Visualization System for
Spatio-Temporal Modeling, Proceedings of Auto Carto 12, Charlotte, N.C.

Mitasova, H., W.M. Brown, J. Hofierka, 1994, Multidimensional
dynamic cartography. Kartograficke listy, 2, p. 37-50.

Mitas L., Brown W. M., Mitasova H., 1997, Role of dynamic
cartography in simulations of landscape processes based on multi-variate
fields. Computers and Geosciences, Vol. 23, No. 4, pp. 437-446

http://www2.gis.uiuc.edu:2280/modviz/viz/nviz.html

The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
  45%
Update elev null mask
building color table
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 5725)]
0x4042452d in vfprintf () from /lib/libc.so.6
(gdb) bt
#0 0x4042452d in vfprintf () from /lib/libc.so.6
#1 0x4043d386 in vsprintf () from /lib/libc.so.6
#2 0x4042c21a in sprintf () from /lib/libc.so.6
#3 0x0808edd1 in strcpy ()
#4 0x0805c2f8 in strcpy ()
#5 0x08052424 in strcpy ()
#6 0x080506a1 in strcpy ()
#7 0x401357ff in TclInvokeStringCommand () from /usr/lib/libtcl8.4.so.0
#8 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
#9 0x4013707c in Tcl_EvalEx () from /usr/lib/libtcl8.4.so.0
#10 0x40137495 in Tcl_Eval () from /usr/lib/libtcl8.4.so.0
#11 0x401387d3 in Tcl_VarEvalVA () from /usr/lib/libtcl8.4.so.0
#12 0x40138807 in Tcl_VarEval () from /usr/lib/libtcl8.4.so.0
#13 0x080536d7 in strcpy ()
#14 0x08053bc2 in strcpy ()
#15 0x0804c207 in strcpy ()
#16 0x40082452 in Tk_MainEx () from /usr/lib/libtk8.4.so.0
#17 0x0805749c in strcpy ()
#18 0x403f6a51 in __libc_start_main () from /lib/libc.so.6
(gdb) info f
Stack level 0, frame at 0xbfffe868:
eip = 0x4042452d in vfprintf; saved eip 0x4043d386
called by frame at 0xbfffe938
Arglist at 0xbfffe868, args:
Locals at 0xbfffe868, Previous frame's sp in esp
Saved registers:
  ebx at 0xbfffe85c, ebp at 0xbfffe868, esi at 0xbfffe860, edi at 0xbfffe864, eip at 0xbfffe86c
(gdb) info args
No symbol table info available.
(gdb) info locals
No symbol table info available.
(gdb) quit
A debugging session is active.
Do you still want to close the debugger?(y or n) y
GRASS:~ >

Hamish wrote:

> > AFAICT from the error message, it's the core NVIZ process which
> > crashed; i.e. not the "nviz" script, but the NVWISH2.2 binary
> > interpreting nviz2.2_script. To debug that, you would need to emulate
> > the "nviz" script by setting up the various environment variables,
> > then running NVWISH2.2 under gdb, e.g.:
> >
> > export GISDBASE=`g.gisenv get=GISDBASE`
> > export LOCATION_NAME=`g.gisenv get=LOCATION_NAME`
> > export MAPSET=`g.gisenv get=MAPSET`
> > gdb /usr/local/grass5/etc/nviz2.2/NVWISH2.2
> > > run -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script
> > > -name NVIZ
>
> Ok, copy that.
>
> I'm new to gdb, bear with me.
>    -How to turn on debugging symbols?
>    -How to set up symbol tables info? [(gdb) bt f]
>       "No symbol table info available."

Odd. Has the NVWISH2.2 binary been stripped?

> Further instructions?
>
> Seeing the libc.so.6 makes me ask: can anyone out there running
> Debian 3.0(Woody) compile 5.0.3rc with tcl8.4-dev & tk8.4-dev
> and see if NVIZ works? Is this a Debian bug?
> (I'm running Debian/Testing(Sarge), grass-5.0.3rc2)
>
> note NVIZ is printing "Version: @(#) 5.0.2 (April 2003)" before
> the copyright notice..?

That suggests that either configure hasn't been re-run since the last
update, or NVIZ hasn't been re-built (forgot to run "make distclean"
first), or that NVIZ failed to compile and the old version is still in
place (forgot to "rm -rf /usr/local/grass5" before "make install").

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 16384 (LWP 5725)]
> 0x4042452d in vfprintf () from /lib/libc.so.6
> (gdb) bt
> #0 0x4042452d in vfprintf () from /lib/libc.so.6
> #1 0x4043d386 in vsprintf () from /lib/libc.so.6
> #2 0x4042c21a in sprintf () from /lib/libc.so.6
> #3 0x0808edd1 in strcpy ()
> #4 0x0805c2f8 in strcpy ()
> #5 0x08052424 in strcpy ()
> #6 0x080506a1 in strcpy ()
> #7 0x401357ff in TclInvokeStringCommand () from /usr/lib/libtcl8.4.so.0
> #8 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
> #9 0x4013707c in Tcl_EvalEx () from /usr/lib/libtcl8.4.so.0
> #10 0x40137495 in Tcl_Eval () from /usr/lib/libtcl8.4.so.0
> #11 0x401387d3 in Tcl_VarEvalVA () from /usr/lib/libtcl8.4.so.0
> #12 0x40138807 in Tcl_VarEval () from /usr/lib/libtcl8.4.so.0
> #13 0x080536d7 in strcpy ()
> #14 0x08053bc2 in strcpy ()
> #15 0x0804c207 in strcpy ()
> #16 0x40082452 in Tk_MainEx () from /usr/lib/libtk8.4.so.0
> #17 0x0805749c in strcpy ()
> #18 0x403f6a51 in __libc_start_main () from /lib/libc.so.6

If tcltkgrass is crashing inside "pure" Tcl/Tk code, you won't be able
to find out much unless you have a version of libtcl which has debug
info.

--
Glynn Clements <glynn.clements@virgin.net>

Two steps forward, another back. A couple of mistakes by me & bugs
getting mixed up here .. and still no joy.

Turns out the raster I was testing NVIZ out with is a CELL map.

Tried to run nviz on an integer based raster on a RedHat install & it
gave the same SegFault.

So NVIZ doesn't like integer based maps.
It also fails with pages of errors if there is no data in the
current region (d.zoom to an all null part of the map).

If these will always fail, checks early on & a graceful exit with
useful error messages would be better..

So that explains the SegFault anyway.

-----

But back to Debian with TclTk 8.4 and the bug as reported in
http://article.gmane.org/gmane.comp.gis.grass.devel/1883

"it does freeze up (0% cpu) at the first instance of "exec"
in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels section)
    set panels [exec cat $index]

(I tried placing some exec's further up in the script & they did the
same lock-up)"

sample run:
GRASS:~ > nviz raster_dem
[...]
The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
Loading Data
translating colors from fp
Adding panels from /usr/src/grass5source/grass-5.0.3rc3/dist.i686-pc-linux-gnu/etc/nviz2.2/scripts

and then it just sits there using 0% cpu with the "Please Wait.." window up.

strace -f ends with looping getppid() every 2 seconds or so:
[...]
Process 7403 detached
[pid 7402] close(14) = 0
[pid 7402] read(13, <unfinished ...>
[pid 7397] <... poll resumed> [{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[etc...]
until I kill it.

full log here:
  http://bambi.otago.ac.nz/hamish/grass/strace-f_nviz.log.gz

Running gdb, then ^Z when it gets stuck:

GRASS:./grass-5.0.3rc3 > gdb dist.i686-pc-linux-gnu/etc/nviz2.2/NVWISH2.2
GNU gdb 5.3-debian
Copyright 2002 Free Software Foundation, Inc.
[...]
(gdb) run -f dist.i686-pc-linux-gnu/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ
[...]
Enter the name of an existing Raster file
Enter 'list' for a list of existing Raster files
Hit RETURN to cancel request

raster_dem

[...]
The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
Loading Data
translating colors from fp
Adding panels from /usr/src/grass5source/grass-5.0.3rc3/dist.i686-pc-linux-gnu/etc/nviz2.2/scripts

[<wait> Ctrl-Z]
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGTSTP, Stopped (user).
[Switching to Thread 16384 (LWP 7482)]
0x4049ba54 in read () from /lib/libc.so.6
(gdb) bt
#0 0x4049ba54 in read () from /lib/libc.so.6
#1 0x401b3004 in _LIB_VERSION () from /usr/lib/libtcl8.4.so.0
#2 0x4018193f in TclCreatePipeline () from /usr/lib/libtcl8.4.so.0
#3 0x40182195 in Tcl_OpenCommandChannel () from /usr/lib/libtcl8.4.so.0
#4 0x4016ee7a in Tcl_ExecObjCmd () from /usr/lib/libtcl8.4.so.0
#5 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
#6 0x401594a2 in TclCompEvalObj () from /usr/lib/libtcl8.4.so.0
#7 0x40158adc in TclCompEvalObj () from /usr/lib/libtcl8.4.so.0
#8 0x4013764d in Tcl_EvalObjEx () from /usr/lib/libtcl8.4.so.0
#9 0x4013ce8c in Tcl_ForeachObjCmd () from /usr/lib/libtcl8.4.so.0
#10 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
#11 0x4013707c in Tcl_EvalEx () from /usr/lib/libtcl8.4.so.0
#12 0x40171aa6 in Tcl_FSEvalFile () from /usr/lib/libtcl8.4.so.0
#13 0x40170c2a in Tcl_EvalFile () from /usr/lib/libtcl8.4.so.0
#14 0x400824a9 in Tk_MainEx () from /usr/lib/libtk8.4.so.0
#15 0x080573bc in strcpy ()
#16 0x403f9a51 in __libc_start_main () from /lib/libc.so.6
(gdb) info f
Stack level 0, frame at 0xbfffea98:
eip = 0x4049ba54 in read; saved eip 0x4018193f
(FRAMELESS), called by frame at 0xbfffea98
Arglist at 0xbfffea98, args:
Locals at 0xbfffea98, Previous frame's sp in esp
Saved registers:
  ebp at 0xbfffea98, eip at 0xbfffea9c
(gdb) l
No symbol table is loaded. Use the "file" command.
(gdb) info args
No symbol table info available.
(gdb) info locals
No symbol table info available.
(gdb) quit
A debugging session is active.
Do you still want to close the debugger?(y or n) y
GRASS:/usr/src/grass5source/grass-5.0.3rc3 >

I'll (re)confirm Tcl/Tk 8.3 doesn't show this behavior tomorrow.

still confused, but slightly less so,
Hamish

-------------------------------------------
old business:

> > > AFAICT from the error message, it's the core NVIZ process which
> > > crashed; i.e. not the "nviz" script, but the NVWISH2.2 binary
> > > interpreting nviz2.2_script. To debug that, you would need to
> > > emulate the "nviz" script by setting up the various environment
> > > variables, then running NVWISH2.2 under gdb, e.g.:
> > >
> > > export GISDBASE=`g.gisenv get=GISDBASE`
> > > export LOCATION_NAME=`g.gisenv get=LOCATION_NAME`
> > > export MAPSET=`g.gisenv get=MAPSET`
> > > gdb /usr/local/grass5/etc/nviz2.2/NVWISH2.2
> > > > run -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script
> > > > -name NVIZ

...

> > I'm new to gdb, bear with me.
> > -How to turn on debugging symbols?
> > -How to set up symbol tables info? [(gdb) bt f]
> > "No symbol table info available."
>
> Odd. Has the NVWISH2.2 binary been stripped?

Nope. At least not by me at the command line.

> > note NVIZ is printing "Version: @(#) 5.0.2 (April 2003)" before
> > the copyright notice..?
>
> That suggests that either configure hasn't been re-run since the last
> update, or NVIZ hasn't been re-built (forgot to run "make distclean"
> first), or that NVIZ failed to compile and the old version is still in
> place (forgot to "rm -rf /usr/local/grass5" before "make install").

That was a mistake on my part. I hadn't installed with 'make install', I
was running GRASS from bin.i686-pc-linux-gnu/grass5. I then told gdb to
run NVIZ from /usr/local/grass5, which was 5.0.2.. A 'make uninstall'
later and a fresh compile of 5.0.3rc3 directly from the tarball gives
the same SegFault & gdb session though [turns out that was the CELL map].

Where do you change the Makefiles to have GRASS compile with "gcc-2.95"
instead of "gcc" (i.e. gcc-3.3)?

> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 16384 (LWP 5725)]
> > 0x4042452d in vfprintf () from /lib/libc.so.6
> > (gdb) bt
> > #0 0x4042452d in vfprintf () from /lib/libc.so.6
> > #1 0x4043d386 in vsprintf () from /lib/libc.so.6
> > #2 0x4042c21a in sprintf () from /lib/libc.so.6
> > #3 0x0808edd1 in strcpy ()
> > #4 0x0805c2f8 in strcpy ()
> > #5 0x08052424 in strcpy ()
> > #6 0x080506a1 in strcpy ()
> > #7 0x401357ff in TclInvokeStringCommand () from /usr/lib/libtcl8.4.so.0
> > #8 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
> > #9 0x4013707c in Tcl_EvalEx () from /usr/lib/libtcl8.4.so.0
> > #10 0x40137495 in Tcl_Eval () from /usr/lib/libtcl8.4.so.0
> > #11 0x401387d3 in Tcl_VarEvalVA () from /usr/lib/libtcl8.4.so.0
> > #12 0x40138807 in Tcl_VarEval () from /usr/lib/libtcl8.4.so.0
> > #13 0x080536d7 in strcpy ()
> > #14 0x08053bc2 in strcpy ()
> > #15 0x0804c207 in strcpy ()
> > #16 0x40082452 in Tk_MainEx () from /usr/lib/libtk8.4.so.0
> > #17 0x0805749c in strcpy ()
> > #18 0x403f6a51 in __libc_start_main () from /lib/libc.so.6
>
> If tcltkgrass is crashing inside "pure" Tcl/Tk code, you won't be able
> to find out much unless you have a version of libtcl which has debug
> info.

I think (could be wrong) Debian binaries are stripped by default.

Hamish wrote:

> "it does freeze up (0% cpu) at the first instance of "exec"
> in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels section)
>     set panels [exec cat $index]

One thing which might make a difference: try adding "-lpthread" to the
link command. I've occasionally encountered all sorts of odd behaviour
with programs which are linked against libraries which are linked
against libpthread (typically libGL), but where the binary itself
isn't linked directly against libpthread.

Other than that, given that the program is hanging in "pure" Tcl code
(no NVIZ-specific functions in the backtrace), I can only assume that
something in the NVIZ initialisation is breaking Tcl.

--
Glynn Clements <glynn.clements@virgin.net>

Hamish wrote:

> old business:

In my previous message, I hadn't noticed that you continued after the
signature.

> Where do you change the Makefiles to have GRASS compile with "gcc-2.95"
> instead of "gcc" (i.e. gcc-3.3)?

To select a different compiler, set the CC environment variable prior
to running configure e.g.

  CC=gcc-2.95 ./configure ...

Similarly for CFLAGS and LDFLAGS, e.g.

  CFLAGS='-g -Wall' ./configure ...

You can also override make variables with e.g.

  make CC=gcc-2.95

However, if you do this, bear in mind that the settings in head.<arch>
and config.h may not be appropriate.

> > If tcltkgrass is crashing inside "pure" Tcl/Tk code, you won't be able
> > to find out much unless you have a version of libtcl which has debug
> > info.
>
> I think (could be wrong) Debian binaries are stripped by default.

Most distributions strip the dynamic libraries; sometimes the -devel
package includes static libraries which aren't stripped, but mostly
you have to build your own libraries if you want debug info.
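
A quick way to check a given binary or library is the "file" utility,
whose description of an ELF object normally ends in "stripped" or "not
stripped". The sketch below only classifies that text; the helper name
symbol_status is made up for illustration, and the library path in the
example is the one from the backtrace. Note that "not stripped" only
means the symbol table is present; full debug info additionally
requires that the code was compiled with -g.

```sh
#!/bin/sh
# Hypothetical helper: read the output of file(1) on stdin and print
# "stripped", "not stripped" or "unknown".
symbol_status() {
    case $(cat) in
        *"not stripped"*) echo "not stripped" ;;   # must be tested first
        *stripped*)       echo "stripped" ;;
        *)                echo "unknown" ;;
    esac
}

# Example (-L makes file follow the .so symlink):
#   file -L /usr/lib/libtcl8.4.so.0 | symbol_status
```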

In any case, I don't think that a debugger will help with the hanging
"exec" problem.

--
Glynn Clements <glynn.clements@virgin.net>

Hamish wrote:

Turns out the raster I was testing NVIZ out with is a CELL map.
Tried to run nviz on an integer based raster on a RedHat install & it
gave the same SegFault.
So NVIZ doesn't like integer based maps.

Hamish I have been using nviz with CELL maps with no problems (e.g. figs 5.1b,6.8a,7.2a in the book) and I tried it with the recent nviz version from CVS right now and it runs OK too.

It also fails with pages of errors if there is no data in the
current region (d.zoom to an all null part of the map).

yes, it crashes if one tries to display a map that is outside the given
region - it obviously tries to compute some of the initial variables
that are undefined - I got this error (it looks like something for the
cutting plane panel):

Error in startup script: integer value too large to represent
     while executing
"expr int([lindex $range 1])"
     (procedure "mkcutplanePanel" line 55)
     invoked from within
"mk$name\Panel $path"
     (procedure "Nv_force_panel" line 10)
     invoked from within

Helena
....

If these will always fail, checks early on & a graceful exit with
useful error messages would be better..

So that explains the SegFault anyway.

-----

But back to Debian with TclTk 8.4 and the bug as reported in
http://article.gmane.org/gmane.comp.gis.grass.devel/1883

"it does freeze up (0% cpu) at the first instance of "exec"
in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels section)
    set panels [exec cat $index]

(I tried placing some exec's further up in the script & they did the
same lock-up)"

sample run:
GRASS:~ > nviz raster_dem
[...]
The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
translating colors from fp
Adding panels from /usr/src/grass5source/grass-5.0.3rc3/dist.i686-pc-linux-gnu/etc/nviz2.2/scripts

and then it just sits there using 0% cpu with the "Please Wait.." window up.

strace -f ends with looping getppid() every 2 seconds or so:
[...]
Process 7403 detached
[pid 7402] close(14) = 0
[pid 7402] read(13, <unfinished ...>
[pid 7397] <... poll resumed> [{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[pid 7397] getppid() = 7396
[pid 7397] poll([{fd=3, events=POLLIN}], 1, 2000) = 0
[etc...]
until I kill it.

full log here:
  http://bambi.otago.ac.nz/hamish/grass/strace-f_nviz.log.gz

Running gdb, then ^Z when it gets stuck:

GRASS:./grass-5.0.3rc3 > gdb dist.i686-pc-linux-gnu/etc/nviz2.2/NVWISH2.2
GNU gdb 5.3-debian
Copyright 2002 Free Software Foundation, Inc.
[...]
(gdb) run -f dist.i686-pc-linux-gnu/etc/nviz2.2/scripts/nviz2.2_script -name NVIZ
[...]
Enter the name of an existing Raster file
Enter 'list' for a list of existing Raster files
Hit RETURN to cancel request

raster_dem

[...]
The papers are available at
http://www2.gis.uiuc.edu:2280/modviz/
Loading Data
translating colors from fp
Adding panels from /usr/src/grass5source/grass-5.0.3rc3/dist.i686-pc-linux-gnu/etc/nviz2.2/scripts

[<wait> Ctrl-Z]
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGTSTP, Stopped (user).
[Switching to Thread 16384 (LWP 7482)]
0x4049ba54 in read () from /lib/libc.so.6
(gdb) bt
#0 0x4049ba54 in read () from /lib/libc.so.6
#1 0x401b3004 in _LIB_VERSION () from /usr/lib/libtcl8.4.so.0
#2 0x4018193f in TclCreatePipeline () from /usr/lib/libtcl8.4.so.0
#3 0x40182195 in Tcl_OpenCommandChannel () from /usr/lib/libtcl8.4.so.0
#4 0x4016ee7a in Tcl_ExecObjCmd () from /usr/lib/libtcl8.4.so.0
#5 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
#6 0x401594a2 in TclCompEvalObj () from /usr/lib/libtcl8.4.so.0
#7 0x40158adc in TclCompEvalObj () from /usr/lib/libtcl8.4.so.0
#8 0x4013764d in Tcl_EvalObjEx () from /usr/lib/libtcl8.4.so.0
#9 0x4013ce8c in Tcl_ForeachObjCmd () from /usr/lib/libtcl8.4.so.0
#10 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
#11 0x4013707c in Tcl_EvalEx () from /usr/lib/libtcl8.4.so.0
#12 0x40171aa6 in Tcl_FSEvalFile () from /usr/lib/libtcl8.4.so.0
#13 0x40170c2a in Tcl_EvalFile () from /usr/lib/libtcl8.4.so.0
#14 0x400824a9 in Tk_MainEx () from /usr/lib/libtk8.4.so.0
#15 0x080573bc in strcpy ()
#16 0x403f9a51 in __libc_start_main () from /lib/libc.so.6
(gdb) info f
Stack level 0, frame at 0xbfffea98:
eip = 0x4049ba54 in read; saved eip 0x4018193f
(FRAMELESS), called by frame at 0xbfffea98
Arglist at 0xbfffea98, args: Locals at 0xbfffea98, Previous frame's sp in esp
Saved registers:
  ebp at 0xbfffea98, eip at 0xbfffea9c
(gdb) l
No symbol table is loaded. Use the "file" command.
(gdb) info args
No symbol table info available.
(gdb) info locals
No symbol table info available.
(gdb) quit
A debugging session is active.
Do you still want to close the debugger?(y or n) y
GRASS:/usr/src/grass5source/grass-5.0.3rc3 >
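
[Illustration: the backtrace above shows the process blocked in read()
underneath TclCreatePipeline(), which fits the 0% cpu symptom. As a rough
sketch of the general fork/pipe pattern behind an "exec"-style call (this
is NOT Tcl's actual code, and run_and_capture() is a made-up helper name),
the parent reads the child's output until end-of-file, so it stays blocked
for as long as any write end of the pipe remains open:]

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv, capture up to
 * bufsize-1 bytes of its stdout into buf, and return its exit status
 * (or -1 on error or abnormal termination). */
int run_and_capture(char *const argv[], char *buf, size_t bufsize)
{
    int fds[2];
    pid_t pid;
    size_t used = 0;
    ssize_t n;
    int status;

    if (bufsize == 0 || pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: route stdout into the pipe, then exec the command. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    /* Parent: close our own write end first, otherwise the read()
     * below never sees end-of-file and blocks forever. */
    close(fds[1]);

    while (used < bufsize - 1 &&
           (n = read(fds[0], buf + used, bufsize - 1 - used)) > 0)
        used += (size_t) n;
    buf[used] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

[If fork(), close() or waitpid() misbehave in the parent, this kind of
loop is exactly where a program would appear to hang without using any
cpu.]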

I'll (re)confirm Tcl/Tk 8.3 doesn't show this behavior tomorrow.

still confused, but slightly less so,
Hamish

-------------------------------------------
old business:

AFAICT from the error message, it's the core NVIZ process which
crashed; i.e. not the "nviz" script, but the NVWISH2.2 binary
interpreting nviz2.2_script. To debug that, you would need to
emulate the "nviz" script by setting up the various environment
variables, then running NVWISH2.2 under gdb, e.g.:

export GISDBASE=`g.gisenv get=GISDBASE`
export LOCATION_NAME=`g.gisenv get=LOCATION_NAME`
export MAPSET=`g.gisenv get=MAPSET`
gdb /usr/local/grass5/etc/nviz2.2/NVWISH2.2
> run -f /usr/local/grass5/etc/nviz2.2/scripts/nviz2.2_script
> -name NVIZ

...

I'm new to gdb, bear with me. -How to turn on debugging symbols?
  -How to set up symbol tables info? [(gdb) bt f]
     "No symbol table info available."

Odd. Has the NVWISH2.2 binary been stripped?

Nope. At least not by me at the command line.

note NVIZ is printing "Version: @(#) 5.0.2 (April 2003)" before the copyright notice..?

That suggests that either configure hasn't been re-run since the last
update, or NVIZ hasn't been re-built (forgot to run "make distclean"
first), or that NVIZ failed to compile and the old version is still in
place (forgot to "rm -rf /usr/local/grass5" before "make install").

That was a mistake on my part. I hadn't installed with 'make install', I
was running GRASS from bin.i686-pc-linux-gnu/grass5. I then told gdb to
run NVIZ from /usr/local/grass5, which was 5.0.2.. A 'make uninstall'
later and a fresh compile of 5.0.3rc3 directly from the tarball gives
the same SegFault & gdb session though [turns out that was the CELL map].

Where do you change the Makefiles to have GRASS compile with "gcc-2.95"
instead of "gcc" (i.e. gcc-3.3)?

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 5725)]
0x4042452d in vfprintf () from /lib/libc.so.6
(gdb) bt
#0 0x4042452d in vfprintf () from /lib/libc.so.6
#1 0x4043d386 in vsprintf () from /lib/libc.so.6
#2 0x4042c21a in sprintf () from /lib/libc.so.6
#3 0x0808edd1 in strcpy ()
#4 0x0805c2f8 in strcpy ()
#5 0x08052424 in strcpy ()
#6 0x080506a1 in strcpy ()
#7 0x401357ff in TclInvokeStringCommand () from /usr/lib/libtcl8.4.so.0
#8 0x40136646 in TclEvalObjvInternal () from /usr/lib/libtcl8.4.so.0
#9 0x4013707c in Tcl_EvalEx () from /usr/lib/libtcl8.4.so.0
#10 0x40137495 in Tcl_Eval () from /usr/lib/libtcl8.4.so.0
#11 0x401387d3 in Tcl_VarEvalVA () from /usr/lib/libtcl8.4.so.0
#12 0x40138807 in Tcl_VarEval () from /usr/lib/libtcl8.4.so.0
#13 0x080536d7 in strcpy ()
#14 0x08053bc2 in strcpy ()
#15 0x0804c207 in strcpy ()
#16 0x40082452 in Tk_MainEx () from /usr/lib/libtk8.4.so.0
#17 0x0805749c in strcpy ()
#18 0x403f6a51 in __libc_start_main () from /lib/libc.so.6

If tcltkgrass is crashing inside "pure" Tcl/Tk code, you won't be able
to find out much unless you have a version of libtcl which has debug
info.

I think (could be wrong) Debian binaries are stripped by default.

> Turns out the raster I was testing NVIZ out with is a CELL map.
> Tried to run nviz on an integer based raster on a RedHat install &
> it gave the same SegFault.
> So NVIZ doesn't like integer based maps.

Hamish I have been using nviz with CELL maps with no problems (e.g.
figs 5.1b,6.8a,7.2a in the book) and I tried it with the recent nviz
version from CVS right now and it runs OK too.

Attached pls find a map that makes NVIZ fail with the SegFault:
(import with r.in.ascii)
...
building color table
child killed: segmentation violation
...

(sometimes)
Fails on Debian/Testing(Sarge) with GRASS 5.0.2, 5.0.3rc3.
Works on Redhat 7.3 with CVS HEAD and 5.0.2.
Fails on Redhat 9 with 5.0.2, 5.0.3rc3. (RH9 uses tcltk 8.3)

5.0.2 on Redhat 9 also fails for other CELL maps I try.
.. so it doesn't seem to be the distro, tcl version, or the map itself.
As it works with 5.0.2 on RH7.3 I'd figure it isn't a fixed bug either.
RH9 gcc 3.2.2, RH7.3 gcc 2.96, Debian gcc 3.3.1 ..

> It also fails with pages of errors if there is no data in the
> current region (d.zoom to an all null part of the map).

yes, it crashes if one tries to display a map that is outside the given
region - it obviously tries to compute some of the initial variables
that are undefined - I got this error (it looks like something for the
cutting plane panel):

Error in startup script: integer value too large to represent
     while executing
"expr int([lindex $range 1])"
     (procedure "mkcutplanePanel" line 55)
     invoked from within
"mk$name\Panel $path"
     (procedure "Nv_force_panel" line 10)
     invoked from within

Bob has a fix, recompiling for a test now..

Hamish

(attachments)

kills_nviz.Gascii.gz (3.87 KB)

But back to Debian with TclTk 8.4 and the bug as reported in
http://article.gmane.org/gmane.comp.gis.grass.devel/1883

"it does freeze up (0% cpu) at the first instance of "exec"
in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels section)
    set panels [exec cat $index]

(I tried placing some exec's further up in the script & they did the
same lock-up)"

I'll (re)confirm Tcl/Tk 8.3 doesn't show this behavior tomorrow.

5.0.3rc3 & Debian/Testing(Sarge) with tcl8.3-dev and tk8.3-dev:
works fine.

I'll try tcl8.4 with "-lpthread" next.
(added to src.contrib/GMSL/NVIZ2.2/src/Gmakefile XTRA_LDFLAGS ?)

> > If tcltkgrass is crashing inside "pure" Tcl/Tk code, you won't be
> > able to find out much unless you have a version of libtcl which
> > has debug info.
>
> I think (could be wrong) Debian binaries are stripped by default.

Most distributions strip the dynamic libraries; sometimes the -devel
package includes static libraries which aren't stripped, but mostly
you have to build your own libraries if you want debug info.

just for the record:

http://www.debian.org/doc/debian-policy/ch-files.html#s10.2

In Debian the package can be built from source a bit more cleanly with
"apt-get source <package> --build" instead of downloading the raw
tarball & putting the results in /usr/local/...

I don't know if that way uses the same rules as the standard binary
building (I'd think it would), if so you'd have to leave off the --build
above and remove the strip rule by hand (add nostrip to the rules?)
before running "dpkg --build <package>".

Neither of those two steps will install the resultant .deb

standard disclaimer: don't take my word for it.

Hamish

> But back to Debian with TclTk 8.4 and the bug as reported in
> http://article.gmane.org/gmane.comp.gis.grass.devel/1883
>
> "it does freeze up (0% cpu) at the first instance of "exec"
> in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels
> section) set panels [exec cat $index]
>
> (I tried placing some exec's further up in the script & they did the
> same lock-up)"

G: One thing which might make a difference: try adding "-lpthread" to
G: the link command. I've occasionally encountered all sorts of odd
G: behaviour with programs which are linked against libraries which are
G: linked against libpthread (typically libGL), but where the binary
G: itself isn't linked directly against libpthread.

I'll try tcl8.4 with "-lpthread" next.
(added to src.contrib/GMSL/NVIZ2.2/src/Gmakefile XTRA_LDFLAGS ?)

That works(!)

although I guess it really belongs somewhere else in the build system..?
or is it a bug in the debian build of the tcl8.4 libraries? (as 8.3 works ok)

thanks again,
Hamish

Hello Hamish,

At Mon, 15 Sep 2003 19:59:44 +1200 Hamish wrote:

> > But back to Debian with TclTk 8.4 and the bug as reported in
> > http://article.gmane.org/gmane.comp.gis.grass.devel/1883
> >
> > "it does freeze up (0% cpu) at the first instance of "exec"
> > in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels
> > section) set panels [exec cat $index]
> >
> > (I tried placing some exec's further up in the script & they did
> > the same lock-up)"

G: One thing which might make a difference: try adding "-lpthread" to
G: the link command. I've occasionally encountered all sorts of odd
G: behaviour with programs which are linked against libraries which
G: are linked against libpthread (typically libGL), but where the
G: binary itself isn't linked directly against libpthread.

> I'll try tcl8.4 with "-lpthread" next.
> (added to src.contrib/GMSL/NVIZ2.2/src/Gmakefile XTRA_LDFLAGS ?)

That works(!)

I can confirm this. Adding -lpthread makes NVIZ compile fine and run
fine as well. :)
using debian testing/unstable with
- gcc (GCC) 3.3.1 20030626 (Debian prerelease)
- tcl/tk8.4

cheers
  Stephan

--
Stephan Holl

GnuPG Key-ID: 11946A09

10:51:03 up 3:30, 1 user, load average: 0.13, 0.54, 0.70

Hamish wrote:

> > But back to Debian with TclTk 8.4 and the bug as reported in
> > http://article.gmane.org/gmane.comp.gis.grass.devel/1883
> >
> > "it does freeze up (0% cpu) at the first instance of "exec"
> > in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels
> > section) set panels [exec cat $index]
> >
> > (I tried placing some exec's further up in the script & they did the
> > same lock-up)"

G: One thing which might make a difference: try adding "-lpthread" to
G: the link command. I've occasionally encountered all sorts of odd
G: behaviour with programs which are linked against libraries which are
G: linked against libpthread (typically libGL), but where the binary
G: itself isn't linked directly against libpthread.

> I'll try tcl8.4 with "-lpthread" next.
> (added to src.contrib/GMSL/NVIZ2.2/src/Gmakefile XTRA_LDFLAGS ?)

That works(!)

Damn.

I have absolutely no idea how we could make configure detect this
situation.

Detecting whether libpthread exists is easy enough, but that isn't the
problem. Detecting whether libpthread is needed in order for linking
to succeed is also easy, but that isn't the problem either.

The problem is detecting whether linking NVIZ against libpthread will
make the situation better or worse, and that seems to be somewhere
between difficult and impossible.

The only easy solution is to provide a switch, e.g. --nviz-pthread,
which explicitly adds -lpthread to the NVIZ linking switches. At least,
providing the switch is easy enough; dealing with user queries of the
form "what does this switch do and should I use it?" is likely to be
less straightforward.

although I guess it really belongs somewhere else in the build system..?
or is it a bug in the debian build of the tcl8.4 libraries? (as 8.3 works ok)

Possibly.

The problem is that linking against libpthread changes the behaviour
of a number of libc functions, including fork, wait, waitpid, open,
close, system, and anything to do with signal handling. These
functions are declared as weak symbols in libc, and are overridden by
libpthread.

I don't know this for sure, but I believe that what happens when a
program doesn't link against libpthread directly but links against a
library which links against libpthread is that the code in the
executable ends up using the libc versions but the code in the library
uses the libpthread versions.

What I do know for sure is that, in this situation:

a) non-trivial programs start behaving "strange", and

b) gdb (at least on RH6.2) exhibits specific strange behavior, i.e.
everything stops working as soon as the program spawns any child
processes (e.g. calls system()).

AFAICT, any real fix would have to come from the developers of libc,
libpthread and/or ld-linux.so. [Or Linus could admit that the kernel
really does need to provide more than just clone(), which would
eliminate the need for most of the user-space "duct tape" which is at
the root of this problem.]

--
Glynn Clements <glynn.clements@virgin.net>
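
[Illustration: the weak-symbol mechanism mentioned above can be seen from
C with a GCC/ELF extension. This sketch does not reproduce the hang; it
only shows how a weak reference lets a binary ask at run time whether a
pthread implementation ended up in the link. pthreads_linked() is a
made-up name, and the result depends on the libc in use: on a libc that
keeps the thread functions in a separate libpthread it is expected to
differ with and without -lpthread, while on a libc that ships them
internally it should always report 1.]

```c
#include <pthread.h>
#include <stdio.h>

/* Weak reference (GCC/ELF extension): if nothing in the link defines
 * pthread_create, its address is NULL instead of a link error. */
extern int pthread_create(pthread_t *, const pthread_attr_t *,
                          void *(*)(void *), void *) __attribute__((weak));

/* Hypothetical helper: returns 1 if a pthread implementation is
 * visible to this binary, 0 otherwise. */
int pthreads_linked(void)
{
    return pthread_create != NULL;
}

/* Print the result in a form suitable for a build-time sanity check. */
void report_pthreads(FILE *fp)
{
    fprintf(fp, "pthread_create is %s\n",
            pthreads_linked() ? "resolved" : "not resolved");
}
```

[Calling report_pthreads(stderr) early in main() would be one way to
compare a binary built with XTRA_LDFLAGS containing -lpthread against one
built without it.]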

> > > But back to Debian with TclTk 8.4 and the bug as reported in
> > > http://article.gmane.org/gmane.comp.gis.grass.devel/1883
> > >
> > > "it does freeze up (0% cpu) at the first instance of "exec"
> > > in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels
> > > section) set panels [exec cat $index]
> > >
> > > (I tried placing some exec's further up in the script & they did
> > > the same lock-up)"
>
> G: One thing which might make a difference: try adding "-lpthread"
> to the link command. I've occasionally encountered all sorts of
> odd behaviour with programs which are linked against libraries
> which are linked against libpthread (typically libGL), but where
> the binary itself isn't linked directly against libpthread.

...

> > (added to src.contrib/GMSL/NVIZ2.2/src/Gmakefile XTRA_LDFLAGS ?)
>
>
> That works(!)

Damn.

I have absolutely no idea how we could make configure detect this
situation.

...

The only easy solution is to provide a switch, e.g. --nviz-pthread,
which explicitly adds -lpthread to the NVIZ linking switches. At least,
providing the switch is easy enough; dealing with user queries of the
form "what does this switch do and should I use it?" is likely to be
less straightforward.

That's still just treating the symptom.. (although that's possibly all we
can do from our end)

AFAIK, this is currently only a problem with Debian/Testing. Is anybody
running Gentoo or another bleeding edge distro who can try this? It
would be bad for the next version of SuSE/RedHat to come out with a new
glibc, tcl, etc & start showing this behaviour too.

maybe this can help shed some light:
http://mail.python.org/pipermail/python-dev/2002-December/031107.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=171353
  "threading is an 8.4 feature. It's not even an option in 8.3."

or perhaps this:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=201062

If it is limited to Debian/Testing for whatever reason, we can always
just make the adjustment in the Debian package & hopefully if it is a
glibc-land bug it will be sorted out before the next Debian release (a
new glibc & tcltk 8.4 should arrive into Testing in a couple of weeks,
e.g.). In the mean time I think most people running Debian Testing or
Unstable should be adventurous enough to add the change by hand if
compiling from source, thus a --nviz-pthread switch isn't needed (yet).

I tried compiling with gcc 2.95, but that didn't make any difference. I
can't test with an older glibc without ripping everything up so I won't.
Looks like threading in Tcl8.4 though. I wonder if the tcl8.4 others
have reported working was built with the without-threading option?

With respect to the NVIZ child process SegFaulting with integer maps,
so far the only thing I see in common is that it works if glibc is older
than 2.3.1, but that's not to say that's what it is. Can others on newer
systems check?

thanks for the insight,
Hamish

Bug #2:

> Turns out the raster I was testing NVIZ out with is a CELL map.
> Tried to run nviz on an integer based raster on a RedHat install &
> it gave the same SegFault.
> So NVIZ doesn't like integer based maps.
> ...
> building color table
> child killed: segmentation violation
> ...

Hamish I have been using nviz with CELL maps with no problems (e.g.
figs 5.1b,6.8a,7.2a in the book) and I tried it with the recent nviz
version from CVS right now and it runs OK too.

On Debian/testing(sarge), 5.0.3rc3:

- If I compile with gcc-2.95 (and the -lpthread fix) it works.

- If I compile with gcc-2.95 (without -lpthread) it gets past the point
of the SegFault, but hangs after "Adding panels". [same as gcc-3.3
without the -lpthread fix]

- If I compile with gcc-3.2.2 on Redhat 9 it SegFaults after "building
color table".

- If I compile with gcc-3.3 (with or without the -lpthread fix) it
SegFaults after "building color table".

I've got gcc-3.0 and gcc-3.2.3 installed too, so I'll see if I can
narrow down that end some more.

might newer gcc's not like ""?
   map = G_find_file2 ("cell", filename, "");
[src/libes/ogsf/Gs3.c Gs_build_256lookup()]

Hamish
(lots of recompiling today!)

Bug #2:

> > Turns out the raster I was testing NVIZ out with is a CELL map.
> > Tried to run nviz on an integer based raster on a RedHat install &
> > it gave the same SegFault.
> > So NVIZ doesn't like integer based maps.
> > ...
> > building color table
> > child killed: segmentation violation
> > ...
>
> Hamish I have been using nviz with CELL maps with no problems (e.g.
> figs 5.1b,6.8a,7.2a in the book) and I tried it with the recent nviz
> version from CVS right now and it runs OK too.

On Debian/testing(sarge), 5.0.3rc3:

- If I compile with gcc-2.95 (and the -lpthread fix) it works.

- If I compile with gcc-2.95 (without -lpthread) it gets past the point
of the SegFault, but hangs after "Adding panels". [same as gcc-3.3
without the -lpthread fix]

- If I compile with gcc-3.2.2 on Redhat 9 it SegFaults after "building
color table".

- If I compile with gcc-3.3.1 (with or without the -lpthread fix) it
SegFaults after "building color table".

I've got gcc-3.0 and gcc-3.2.3 installed too, so I'll see if I can
narrow down that end some more.

Breaks with gcc 3.0.4 ...
makes it past src/libes/ogsf/Gs3.c's Gs_build_256lookup() too.

gcc-3.3.1 with -DDEGUG_MSG and some printf()'s thrown in and it works.
Hmmmm.. & on the re-compiling goes..

Hamish

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hamish wrote:

But back to Debian with TclTk 8.4 and the bug as reported in
http://article.gmane.org/gmane.comp.gis.grass.devel/1883

"it does freeze up (0% cpu) at the first instance of "exec"
in etc/nviz2.2/scripts/nviz2.2_script, line 78: (Adding panels section)
    set panels [exec cat $index]

(I tried placing some exec's further up in the script & they did
the same lock-up)"

G: One thing which might make a difference: try adding "-lpthread"
to the link command. I've occasionally encountered all sorts of
odd behaviour with programs which are linked against libraries
which are linked against libpthread (typically libGL), but where
the binary itself isn't linked directly against libpthread.

...

(added to src.contrib/GMSL/NVIZ2.2/src/Gmakefile XTRA_LDFLAGS ?)

That works(!)

Damn.

I have absolutely no idea how we could make configure detect this
situation.

...

The only easy solution is to provide a switch, e.g. --nviz-pthread,
which explicitly adds -lpthread to the NVIZ linking switches. At least,
providing the switch is easy enough; dealing with user queries of the
form "what does this switch do and should I use it?" is likely to be
less straightforward.

That's still just treating the symptom.. (although that's possibly all we
can do from our end)

AFAIK, this is currently only a problem with Debian/Testing. Is anybody
running Gentoo or another bleeding edge distro who can try this? It
would be bad for the next version of SuSE/RedHat to come out with a new
glibc, tcl, etc & start showing this behaviour too.

Running on Mandrake cooker:

$ rpm -q glibc tcl
glibc-2.3.2-14mdk
tcl-8.4.2-1mdk

No problems with NVIZ, but I am running the NVIDIA drivers (which
replace libGL.so*). But it seems tcl/tcltk in Mandrake is built without
thread support.

Regards,
Buchan

- --
|--------------Another happy Mandrake Club member--------------|
Buchan Milne Mechanical Engineer, Network Manager
Cellphone * Work +27 82 472 2231 * +27 21 8828820x202
Stellenbosch Automotive Engineering http://www.cae.co.za
GPG Key http://ranger.dnsalias.com/bgmilne.asc
1024D/60D204A7 2919 E232 5610 A038 87B1 72D6 AC92 BA50 60D2 04A7
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE/ZvNErJK6UGDSBKcRAqB0AKCpNahDWvdmMJ9ZjQB4iSC5V2IcKQCgpTDE
5bKPvgS0/PRbfgVXL4gDt/Q=
=w6Q1
-----END PGP SIGNATURE-----

Bug #2:

> Turns out the raster I was testing NVIZ out with is a CELL map.
> Tried to run nviz on an integer based raster on a RedHat install &
> it gave the same SegFault.
> So NVIZ doesn't like integer based maps.
> ...
> building color table
> child killed: segmentation violation
> ...

I've currently got my integer-map bug tracked down to this line on
src/libes/ogsf/GS2.c: (line 1445)
  filename = G_fully_qualified_name(filename, mapset);

Next is to follow G_fully_qualified_name() in
src/libes/gis/nme_in_mps.c. Hopefully this leads somewhere.

May have found something, time to consult the C experts.

It apparently breaks somewhere in this statement in
src/libes/gis/nme_in_mps.c:

    if(strchr(name, '@'))
        sprintf (fullname, "%s", name);
    else
        sprintf (fullname, "%s@%s", name, mapset);

--
Looking up strchr:

char *strchr(const char *s, int c);

Description
This function returns a pointer to the first occurrence of c in s. Note
that if c is the null character ('\0'), this will return a pointer to the
end of the string.

Return Value
A pointer to the character, or NULL if it wasn't found.

So "if()" is testing either a pointer or NULL.

Is that kosher?

Hamish
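
[Illustration: testing the return value of strchr() directly in if() is
standard C; a pointer used as a condition is true exactly when it is not
NULL. Since the earlier SIGSEGV backtrace ended in sprintf() -> vfprintf(),
the arguments to sprintf() look like a more plausible suspect than the
test itself, e.g. a NULL or dangling pointer passed for "%s", or a
destination buffer that is too small (both are undefined behaviour).
That is only a guess. A defensive sketch follows; it is NOT the GRASS
source, and qualified_name() with its explicit buffer-size argument is
invented for the example:]

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the logic in question (not the GRASS code):
 * writes "name@mapset" into out, or just name if it already contains '@'.
 * Returns 0 on success, -1 on a NULL argument or if out is too small. */
int qualified_name(char *out, size_t outsize,
                   const char *name, const char *mapset)
{
    int n;

    if (out == NULL || outsize == 0 || name == NULL)
        return -1;

    /* strchr() returns NULL when '@' is absent, so this branch is
     * taken only for names that are already qualified. */
    if (strchr(name, '@') != NULL) {
        n = snprintf(out, outsize, "%s", name);
    } else {
        if (mapset == NULL)     /* "%s" with a NULL pointer is undefined */
            return -1;
        n = snprintf(out, outsize, "%s@%s", name, mapset);
    }

    /* snprintf() returns the length it wanted to write; a value of
     * outsize or more means the output was truncated. */
    return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}
```

[With checks like these, a bad mapset pointer would show up as an error
return rather than a crash inside vfprintf().]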