[GRASSLIST:7976] error reading large rasters in 5.4/6.0

I'm having problems working with some large raster files in GRASS, even
though I have enabled large file support.

I built GRASS from source

my 5.4 config

CFLAGS="-g -O2 -D_FILE_OFFSET_BITS=64
-D_LARGE_FILE_SOURCE" ../grass-5.4.0/configure
--with-tcltk-includes=/usr/include/tcl8.3 --without-postgres
--with-mysql --with-mysql-includes=/usr/include/mysql --with-cxx

my 6.0 config

CXX='g++-3.3' CXXFLAGS='-g -O2 -D_LARGE_FILE_SOURCE
-D_FILE_OFFSET_BITS=64' CFLAGS='-g -O2 -D_LARGE_FILE_SOURCE
-D_FILE_OFFSET_BITS=64' ./configure
--with-tcltk-includes=/usr/include/tcl/ --without-postgres --with-mysql
--with-mysql-includes=/usr/include/mysql/ --with-cxx

I get the following problems:

1) "r.info test" displays the wrong number of total cells. e.g.,

| Type of Map: cell
| Data Type: DCELL
| Rows: 50590
| Columns: 71840
| Total Cells: -660581696
| Projection: Lambert Conformal Conic (zone 0)
| N: 967200 S: 461300 Res: 10
| E: 2638600 W: 1920200 Res: 10

2) "d.rast test" complains.
WARNING: error reading compressed map [test] in mapset [local], row 0

sometimes the image displays fine even after the warning, on another
data set I get a green bar at the top and if I change the region,
sometimes I don't get the message at all. For example, if I set my
region to be

g.region rast=test

I get the error using d.rast, but if I expand the northing by 10 feet
(one cell), I get no error.

Even though the image tends to display fine and look correct, the
problem causes some programs to quit, e.g. r.mapcalc.

r.mapcalc test2=test
ERROR: error reading compressed map [Neuse10] in mapset [local], row 0

The problem seems to be on rasters whose fcell file is bigger than 4GB.

Comments/ideas/fixes appreciated.

-Andy

Andrew Danner wrote:

I'm having problems working with some large raster files in GRASS, even
though I have enabled large file support.

I built GRASS from source

my 5.4 config

CFLAGS="-g -O2 -D_FILE_OFFSET_BITS=64
-D_LARGE_FILE_SOURCE" ../grass-5.4.0/configure
--with-tcltk-includes=/usr/include/tcl8.3 --without-postgres
--with-mysql --with-mysql-includes=/usr/include/mysql --with-cxx

my 6.0 config

CXX='g++-3.3' CXXFLAGS='-g -O2 -D_LARGE_FILE_SOURCE
-D_FILE_OFFSET_BITS=64' CFLAGS='-g -O2 -D_LARGE_FILE_SOURCE
-D_FILE_OFFSET_BITS=64' ./configure
--with-tcltk-includes=/usr/include/tcl/ --without-postgres --with-mysql
--with-mysql-includes=/usr/include/mysql/ --with-cxx

I get the following problems:

1) "r.info test" displays the wrong number of total cells. e.g.,

Right. Even if you enable large file support, many values are still
limited to the range of a signed 32-bit integer (i.e. 2^31-1). This
will start to manifest itself on maps with more than 2^31-1 cells;
although most operations should still work, you will get oddities such
as r.info displaying a negative cell count.

Unfortunately, this isn't likely to change any time soon; there are
too many individual cases which would need to be changed.

r.mapcalc test2=test
ERROR: error reading compressed map [Neuse10] in mapset [local], row 0

The problem seems to be on rasters whose fcell file is bigger than 4GB.

No idea on this. Unfortunately, I don't have 4GB of free disk space
right now to test it.

--
Glynn Clements <glynn@gclements.plus.com>

On Fri, 2005-08-19 at 19:01 +0100, Glynn Clements wrote:

Andrew Danner wrote:

> I'm having problems working with some large raster files in GRASS, even
> though I have enabled large file support.
>
>
> I get the following problems:
>
> 1) "r.info test" displays the wrong number of total cells. e.g.,

Right. Even if you enable large file support, many values are still
limited to the range of a signed 32-bit integer (i.e. 2^31-1). This
will start to manifest itself on maps with more than 2^31-1 cells;
although most operations should still work, you will get oddities such
as r.info displaying a negative cell count.

Unfortunately, this isn't likely to change any time soon; there are
too many individual cases which would need to be changed.

If there is an accepted way to fix this, I'll offer to patch up some of
the affected programs. r.info is an easy fix--one line in a printf. The
only issue is what is the most portable way to print a 64-bit int?

current:
sprintf (line, " Total Cells: %ld", (long)cellhd.rows * cellhd.cols);

my standard Linux way
sprintf (line, " Total Cells: %lld", ((off_t)cellhd.rows) *
                                                   cellhd.cols);

A more portable way?
sprintf (line, " Total Cells: %" PRId64, (off_t)cellhd.rows) *
                                                   cellhd.cols);

> r.mapcalc test2=test
> ERROR: error reading compressed map [Neuse10] in mapset [local], row 0
>
> The problem seems to be on rasters whose fcell file is bigger than 4GB.

No idea on this. Unfortunately, I don't have 4GB of free disk space
right now to test it.

I'll look into this further..

-Andy

After looking into the problem of reading very large rasters, I believe
I have isolated the bug. The problem only happens when reading row 0 and
triggers the following block in flate.c:186

else if (b[0] != G_ZLIB_COMPRESSED_YES)
    {
        /* We're not at the start of a row */
        G_free (b);
        return -1;
    }

I think the problem is a bug in writing rasters, not reading them. When
creating a new raster, it writes a vector of row offsets to the fcell
file in the function G__write_row_ptrs. It attempts to guess the size of
the file offsets.

int nbytes = sizeof(off_t);
...
if (nbytes > 4 && fcb->row_ptr[nrows] <= 0xffffffff)
    nbytes = 4;
...
len = (nrows + 1) * nbytes + 1;
b = buf = G_malloc(len);

The problem is that when a new raster is created fcb->row_ptr[nrows]=0,
so nbytes is always 4. Later when writing the rows and actually
computing the offsets, if fcb->row_ptr[nrows] > 0xffffffff, it bumps
nbytes upto 8, but row 0 was written to the file assuming an offset of 4
and row 0 becomes corrupted if the raster is larger than 4GB.

Commenting out the "if" block seems to fix the problem.

Should I submit a bug report?

-Andy

On Thu, 2005-08-18 at 18:01 -0400, Andrew Danner wrote:

I'm having problems working with some large raster files in GRASS, even
though I have enabled large file support.

I built GRASS from source

I get the following problems:

1) "r.info test" displays the wrong number of total cells. e.g.,

> | Type of Map: cell
> | Data Type: DCELL
> | Rows: 50590
> | Columns: 71840
> | Total Cells: -660581696
> | Projection: Lambert Conformal Conic (zone 0)
> | N: 967200 S: 461300 Res: 10
> | E: 2638600 W: 1920200 Res: 10

2) "d.rast test" complains.
WARNING: error reading compressed map [test] in mapset [local], row 0

Even though the image tends to display fine and look correct, the
problem causes some programs to quit, e.g. r.mapcalc.

r.mapcalc test2=test
ERROR: error reading compressed map [Neuse10] in mapset [local], row 0

The problem seems to be on rasters whose fcell file is bigger than 4GB.

Andrew Danner wrote:

> > I'm having problems working with some large raster files in GRASS, even
> > though I have enabled large file support.
> >
> >
> > I get the following problems:
> >
> > 1) "r.info test" displays the wrong number of total cells. e.g.,
>
> Right. Even if you enable large file support, many values are still
> limited to the range of a signed 32-bit integer (i.e. 2^31-1). This
> will start to manifest itself on maps with more than 2^31-1 cells;
> although most operations should still work, you will get oddities such
> as r.info displaying a negative cell count.
>
> Unfortunately, this isn't likely to change any time soon; there are
> too many individual cases which would need to be changed.

If there is an accepted way to fix this, I'll offer to patch up some of
the affected programs. r.info is an easy fix--one line in a printf.

Not quite. You also have to perform the calculation in 64 bits. Right
now, the only code anywhere which uses 64-bit types on a 32-bit system
(where "long" is 32 bits) is the low-level raster I/O code in lib/gis
(opencell.c, get_row.c, put_row.c and format.c, and G.h), which uses
off_t, which may be 64 bits depending upon compiler switches.

Code which counts cells will use 32-bit arithmetic, which will
overflow on large maps.

The only issue is what is the most portable way to print a 64-bit
int?

C99 has <inttypes.h> which defines macros for printf/scanf format
specifiers for various integer types, e.g. PRId64 for int64_t. This
will expand to either "ld" or "lld" depending up on whether int64_t is
an alias for "long" or "long long".

If the compiler doesn't support C99, you're out of luck. It may not
have a 64-bit integer type, or be able to print such.

current:
sprintf (line, " Total Cells: %ld", (long)cellhd.rows * cellhd.cols);

my standard Linux way
sprintf (line, " Total Cells: %lld", ((off_t)cellhd.rows) *
                                                   cellhd.cols);

A more portable way?
sprintf (line, " Total Cells: %" PRId64, (off_t)cellhd.rows) *
                                                   cellhd.cols);

1. You aren't guaranteed that the platform provides the C99 PRI*
macros.

2. You aren't guaranteed that off_t is 64 bits; large file support is
optional, and needs to remain so until we fix all of the file I/O
code, not just the low-level raster I/O code (or find a way to
implement it such that only code which has been updated to handle
large files will use them). Currently, I see 103 files which use
lseek, fseek or ftell.

3. Ideally, it should work regardless of whether large file support is
enabled. Due to compression, you could have a map with more than 2^31
cells even if you can't have a file larger than 2GiB (i.e. off_t is 32
bits).

I think that we should sort out these issues generally before anyone
starts trying to fix specific pieces of code.

--
Glynn Clements <glynn@gclements.plus.com>

Andrew Danner wrote:

After looking into the problem of reading very large rasters, I believe
I have isolated the bug. The problem only happens when reading row 0 and
triggers the following block in flate.c:186

else if (b[0] != G_ZLIB_COMPRESSED_YES)
    {
        /* We're not at the start of a row */
        G_free (b);
        return -1;
    }

I think the problem is a bug in writing rasters, not reading them. When
creating a new raster, it writes a vector of row offsets to the fcell
file in the function G__write_row_ptrs. It attempts to guess the size of
the file offsets.

int nbytes = sizeof(off_t);
...
if (nbytes > 4 && fcb->row_ptr[nrows] <= 0xffffffff)
    nbytes = 4;
...
len = (nrows + 1) * nbytes + 1;
b = buf = G_malloc(len);

The problem is that when a new raster is created fcb->row_ptr[nrows]=0,
so nbytes is always 4. Later when writing the rows and actually
computing the offsets, if fcb->row_ptr[nrows] > 0xffffffff, it bumps
nbytes upto 8, but row 0 was written to the file assuming an offset of 4
and row 0 becomes corrupted if the raster is larger than 4GB.

Right. I'd overlooked the fact that G_write_row_ptrs() gets called
twice.

Commenting out the "if" block seems to fix the problem.

The reader will handle 8-byte offsets even if off_t is only 4 bytes so
long as the offsets themselves don't exceed the 32-bit range.

There isn't any practical way to determine in advance whether the
offsets will fit into 4 bytes, so removing the "if" block is the right
fix.

Should I submit a bug report?

No need; I've comitted the fix to CVS.

One other bug which I've just noticed: the reader checks whether an
offset exceeds the range of an off_t, but only in the sense that it
checks whether it occupies too many bytes.

This check is wrong for files between 2GiB and 4GiB, where the offset
fits into 4 bytes, but exceeds the 31-bit (signed) range of an off_t.
There needs to be an "if (offset < 0)" check in there somewhere.

--
Glynn Clements <glynn@gclements.plus.com>

On Tue, Aug 23, 2005 at 10:25:05PM +0100, Glynn Clements wrote:

Andrew Danner wrote:

...

> I think the problem is a bug in writing rasters, not reading them. When
> creating a new raster, it writes a vector of row offsets to the fcell
> file in the function G__write_row_ptrs. It attempts to guess the size of
> the file offsets.
>
> int nbytes = sizeof(off_t);
> ...
> if (nbytes > 4 && fcb->row_ptr[nrows] <= 0xffffffff)
> nbytes = 4;
> ...
> len = (nrows + 1) * nbytes + 1;
> b = buf = G_malloc(len);
>

...

> Should I submit a bug report?

No need; I've comitted the fix to CVS.

...

I have fixed that in the 6.0-release branch as well
for a potential 6.0.2 (even if LFS is not well developed in 6.0.x).

Markus

Markus Neteler wrote:

> > I think the problem is a bug in writing rasters, not reading them. When
> > creating a new raster, it writes a vector of row offsets to the fcell
> > file in the function G__write_row_ptrs. It attempts to guess the size of
> > the file offsets.
> >
> > int nbytes = sizeof(off_t);
> > ...
> > if (nbytes > 4 && fcb->row_ptr[nrows] <= 0xffffffff)
> > nbytes = 4;
> > ...
> > len = (nrows + 1) * nbytes + 1;
> > b = buf = G_malloc(len);
> >
...
> > Should I submit a bug report?
>
> No need; I've comitted the fix to CVS.
>
...

I have fixed that in the 6.0-release branch as well
for a potential 6.0.2 (even if LFS is not well developed in 6.0.x).

Apart from LFS support, it's still useful for systems where
sizeof(long) == 8, where there shouldn't be any problems with files
larger than 2GiB or maps with more than 2^31 cells.

The problems all stem from trying to deal with more bytes or more
cells than will fit into a "long".

--
Glynn Clements <glynn@gclements.plus.com>