not that it matters here, but do you use 32 bit or 64 bit
CPU+OS?
(on 8 Sptember 2009)
...
>> I have the following raster map:
>>
>> Type of Map: raster
Number of Categories: 255 Data
Type: CELL Rows:
58000 Columns:
67000 Total
Cells: 3886000000
>>
>> r.stats (-c and -p) tells me:
...
>> I guess there is an integer overflow somewhere(?)
Yeah, I think you are right.
There are many int++'s in raster/r.stats/stats.c
Perhaps those counters++ should be changed to "long long"?
what is the result of r.univar ?
I see r.univar has been modified to use int for number of cells
and number of non-null cells. (earlier versions used unsigned
int, why the change?) But 'unsigned int' for the counter just
postpones the problem a couple of years, maybe long long is
better there?
Like r.info, I think r.univar and r.stats are pretty fundamental
raster modules to have working correctly for this.
#32bit linux system/gcc
sizeof(long long) is 8
sizeof(long) is 4
sizeof(int) is 4
#64bit linux system/gcc
sizeof(long long) is 8
sizeof(long) is 8
sizeof(int) is 4
I notice r.info uses unsigned long long + printf %llu
Shall we standardize on that? If it can be done without
dealing with --enable-lfs all the better, I'm pretty sure
that LFS has more to do with which 64bit friendly glibc
functions to use so not relevant. But I'm not 100% sure
about this stuff so I ask.
I'm guessing that off_t should only be used for addresses,
or if all else fails can it be repurposed?
I notice r.info uses unsigned long long + printf %llu
Shall we standardize on that? If it can be done without
dealing with --enable-lfs all the better, I'm pretty sure
that LFS has more to do with which 64bit friendly glibc
functions to use so not relevant. But I'm not 100% sure
about this stuff so I ask.
> >> I guess there is an integer overflow somewhere(?)
Yeah, I think you are right.
There are many int++'s in raster/r.stats/stats.c
Perhaps those counters++ should be changed to "long long"?
"long long" is C99, but provided by some C89 compilers as an
extension. Any use must be conditionalised upon:
#ifdef HAVE_LONG_LONG_INT
Any "long long" values need to be printed with "%lld" ("%llu" for
unsigned), which also needs to be conditionalised.
I notice r.info uses unsigned long long + printf %llu
Shall we standardize on that?
The main downside is that you can end up needing a lot of conditional
code.
If it can be done without
dealing with --enable-lfs all the better, I'm pretty sure
that LFS has more to do with which 64bit friendly glibc
functions to use so not relevant. But I'm not 100% sure
about this stuff so I ask.
LFS has two parts: making off_t a 64-bit type, and redirecting I/O
functions to the 64-bit versions.
In order to make off_t a 64-bit type, there has to *be* a 64-bit type.
But that type will exist regardless of whether LFS is enabled.
I'm guessing that off_t should only be used for addresses,
or if all else fails can it be repurposed?
off_t should only be used for file offsets. There's no advantage of
using it as a general-purpose integer type, as you need to convert it
to a standard type for printing, so you still need to check for the
existence of "long long" to determine which type to convert it to.
> I notice r.info uses unsigned long long + printf %llu
> Shall we standardize on that?
Glynn wrote:
The main downside is that you can end up needing a lot of
conditional code.
I'm not seeing any alternative though.
For any module which does math with it (eg casting prior to
variable multiplication/division), perhaps to save some noise in
the code the LONGTYPE could be set up as a macro at the top
of the file and then the casts should look like
(LONGTYPE)chellhd.rows * cellhb*cols
setting up strfmt=LONGFMT for %llu or %lu is a bit uglier, but
a number of other modules pull such tricks to aviod CELL/FCELL/
DCELL switch statements.
Hamish wrote:
> > I notice r.info uses unsigned long long + printf %llu
> > Shall we standardize on that?
Glynn wrote:
> The main downside is that you can end up needing a lot of
> conditional code.
I'm not seeing any alternative though.
For any module which does math with it (eg casting prior to
variable multiplication/division), perhaps to save some noise in
the code the LONGTYPE could be set up as a macro at the top
of the file and then the casts should look like
(LONGTYPE)chellhd.rows * cellhb*cols
I suggest adding the following (or something similar) to config.h:
#ifdef HAVE_LONG_LONG_INT #define longest long long #define PRILONGEST_PREFIX "ll" #else #define longest long #define PRILONGEST_PREFIX "l" #endif
I'm unsure whether it's better to use a macro or a typedef for the
type. Using a macro allows the use of "unsigned longest" rather than
requiring a separate typedef (e.g. "u_longest").
I have no particular preference as to the name of the macro/type, or
whether to use upper or lower case for a macro (a typedef should use
lower case).
The PRI* syntax is intended to mirror the C99 macros from
<inttypes.h>. The definition doesn't include the leading "%", so that
you can include flags, a field width and/or a precision.