grass-dev@grass.itc.it wrote:
code I item #507, was opened at 2007-10-09 17:11
Status: Open
Priority: 3
Submitted By: Maciej Sieczka (msieczka)
Assigned to: Hamish Bowman (hamish)
Summary: r.in.xyz: a segfault with particular dataset
Issue type: module bug
Issue status: None
GRASS version: CVS HEAD
GRASS component: raster
Operating system: Linux
Operating system version: Ubuntu Dapper 64bit
GRASS CVS checkout date, if applies (YYMMDD): 071009Initial Comment:
r.in.xyz has been segfaulting with a particular dataset. It's quite
big (23 MB bzipped) and I don't know how to provide it but let me
know if ther's a need+way to.Details:
$r.in.xyz -g input=tin_nieciag.xyz output=tin_nie ciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
n=5725102.149034 s=5705608.286234 e=587119.577053 w=568715.830892 b=69.431216 t= 224.696106
$ g.region n=5725102.149034 s=5705608.286234 e=587 119.577053 w=568715.830892 res=1 -a
If res == 1, then:
cols = 587119.577053 - 568715.830892 = 18403.74616099999 = 18404
rows = 5725102.149034 - 5705608.286234 = 19493.86280000024 = 19494
At 8 bytes per cell, the total size of the map is:
18404 * 19494 * 8 = 2870140608 bytes
This is larger than 2GiB, causing the arithmetic (which uses "int") to
overflow.
$ gdb r.in.xyz
$ run input=tin_nieciag.xyz output=tin_nie ciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100Starting program: /usr/local/grass-6.3.cvs/bin/r.in.xyz input=tin_nieciag.xyz output=tin_nieciag.xyz method=min type=DCELL fs="|" x=1 y=2 z=3 percent=100
Scanning data ...
67%
Program received signal SIGSEGV, Segmentation fault.
0x00002aaaaabf3c09 in G_is_d_null_value (dcellVal=0x2aaa2cd4e960) at null_val.c:502
502 if(((unsigned char *) dcellVal)[i] !=
(gdb) bt
#2 0x000000000040439c in update_min (array=0x2aaaab3b4010, cols=18405, row=14767, col=4767, map_type=2, value=155.11966910999999) at support.c:74
int update_min(void *array, int cols, int row, int col, RASTER_MAP_TYPE map_type, double value)
{
void *ptr;
DCELL old_val;
ptr = array;
ptr = G_incr_void_ptr(ptr, ((row*cols)+col)*G_raster_size(map_type));
if( G_is_null_value(ptr, map_type) )
There are two problems here:
1. row, col, and cols are all "int"s (typically 32-bit signed
integers), so (row*cols)+col is performed at that width, which will
overflow at 2GiB.
2. G_incr_void_ptr() takes the increment as an "int", which is limited
to 2GiB.
For this to work, we need to use a 64-bit type on 64-bit systems. We
also need to use an unsigned type, as a 32-bit system can allocate
memory up to 4GiB, not 2GiB (unlike off_t, size_t is unsigned).
Which types are 64-bit on 64-bit Linux systems? ptrdiff_t has to be;
size_t probably will be.
[I was going to point out that malloc() is limited to a size_t, so the
array couldn't be larger than that, but the array is allocated with
calloc(), which could theoretically produce a >4GiB array with only a
32-bit size_t.]
I suggest:
1. Changing G_incr_void_ptr() to use size_t.
2. Changing G_raster_size() to return size_t (that's what sizeof
returns, which is the reason for size_t's existence).
3. Casting "cols" to a size_t, so that the index calculation is
performed at size_t width.
If size_t isn't enough to index the array (i.e. you have a 32-bit
size_t on a platform which can calloc() more than 4GiB), you lose
regardless.
--
Glynn Clements <glynn@gclements.plus.com>