[GRASS-dev] Allocating memory for a huge array

Hi,
In a GRASS module I am developing, I would like to allocate a huge array. I thought of using something like:

unsigned int * H_Array;
(…)
H_Array=(unsigned int *)G_calloc(2^32, sizeof(unsigned int));

because this way, with a 32-bit unsigned int, I can manage 4 billion cells instead of the 2 billion allowed by a (traditional signed) 32-bit int.
At compile-time, I receive a warning:
warning: passing arg 1 of `G_calloc' as signed due to prototype

Thus, I realize that I am not succeeding in what I want!
Is there a different G_calloc function that provides a prototype suitable for my needs?
Thank you in advance,
Damiano

Hello Damiano

On Thu, 22 Mar 2007, Damiano Triglione wrote:

Hi,
In a GRASS module I am developing, I would like to allocate a huge array. I thought of using something like:

unsigned int * H_Array;
(...)
H_Array=(unsigned int *)G_calloc(2^32, sizeof(unsigned int));

because this way, with a 32-bit unsigned int, I can manage 4 billion cells instead of the 2 billion allowed by a (traditional signed) 32-bit int.

I think you're confusing the size of the array's data type with the range of the array index. The index is not limited by the element type: array sizes and indices are expressed as size_t, and an array can have as many elements as the system's size_t and address space allow.

On my Windows system anyway, an unsigned int uses exactly the same amount
of memory as an int (4 bytes) so I can't see what you're gaining there. If
you want to store twice as much data in the same amount of memory then you
need to use a smaller data type, such as a short or unsigned short (2 bytes).

At compile-time, I receive a warning:
warning: passing arg 1 of `G_calloc' as signed due to prototype

Thus, I realize that I am not succeeding in what I want!
Is there a different G_calloc function that provides a prototype suitable for my needs?

I'm not entirely clear on what you want to do - but hope the notes above help you formulate your problem more clearly. Perhaps!

Paul

On Thu, 22 Mar 2007, Paul Kelly wrote:

unsigned int * H_Array;
(...)
H_Array=(unsigned int *)G_calloc(2^32, sizeof(unsigned int));

because this way, with a 32-bit unsigned int, I can manage 4 billion cells instead of the 2 billion allowed by a (traditional signed) 32-bit int.

I think you're confusing the size of the array's data type with the range of the array index. The index is not limited by the element type: array sizes and indices are expressed as size_t, and an array can have as many elements as the system's size_t and address space allow.

Sorry, I just thought of a way I could explain this more clearly. The number
of values you can have in the array is limited by the width of a pointer, i.e. by sizeof(unsigned int *). On common platforms this is exactly the same value as sizeof(int *), sizeof(short *), even sizeof(char *), and the same as sizeof(size_t). In other words, it has nothing to do with the data type that is actually stored in the array. I used to be terribly confused about this and indeed introduced some bugs into GRASS because of it* but am mostly over my confusion now ;-)

Paul

* On 32-bit operating systems a size_t is 32 bits, but on 64-bit operating systems it is 64 bits - make sense? So maybe an AMD64 machine with a 64-bit operating system is the solution to your problem?
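
A minimal sketch of the point above, in standard C only (no GRASS headers). The helper name max_elements is hypothetical, introduced just for illustration. It shows that the theoretical upper bound on the number of elements depends on size_t and on the element size in bytes, not on whether the element type is signed or unsigned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the largest element count that a single
 * allocation request could even describe, for elements of
 * elem_size bytes. The practical limit is lower (address space,
 * available memory). */
static size_t max_elements(size_t elem_size)
{
    assert(elem_size > 0);      /* a zero-sized element makes no sense here */
    return SIZE_MAX / elem_size;
}
```

Calling max_elements(sizeof(int)) and max_elements(sizeof(unsigned int)) gives the same number, because C requires a signed type and its unsigned counterpart to occupy the same amount of storage.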

Damiano Triglione wrote:

In a GRASS module I am developing, I would like to allocate a huge array. I thought of using something like:

unsigned int * H_Array;
(...)
H_Array=(unsigned int *)G_calloc(2^32, sizeof(unsigned int));

because this way, with a 32-bit unsigned int, I can manage 4 billion cells instead of the 2 billion allowed by a (traditional signed) 32-bit int.
At compile-time, I receive a warning:
warning: passing arg 1 of `G_calloc' as signed due to prototype

Thus, I realize that I am not succeeding in what I want!
Is there a different G_calloc function that provides a prototype suitable for my needs?

G_calloc() takes size_t arguments, and isn't the problem here.

There are two distinct problems with the line:

  H_Array=(unsigned int *)G_calloc(2^32, sizeof(unsigned int));

First, and most importantly, C's "^" operator performs bitwise
exclusive-or (XOR), not exponentiation. Maybe you meant:

  H_Array=(unsigned int *)G_calloc(1<<32, sizeof(unsigned int));
?

But even that is wrong; if you have a 32-bit "int" type (and almost
every OS in common use does), 1<<32 does not yield 2^32. Shifting by
the full width of the type is undefined behaviour according to the
rules of C, and in practice the result is typically 0 or 1. This holds
even if size_t is a 64-bit type (which is quite likely on a 64-bit
system). The reason is that:

1. C types propagate from the bottom up; an expression composed
solely of "int" values is evaluated as an "int", even if the
result is assigned to a variable or parameter of a wider type (e.g.
"long").

2. Undecorated integer literals that fit in an "int" have type "int";
if you want a long value, you need e.g. 1L.

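The following should work if your platform's "long" and "size_t" are
both 64 bits (as on 64-bit Linux; on 64-bit Windows "long" is still
32 bits, so you would need (size_t)1<<32 there):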

  H_Array=(unsigned int *)G_calloc(1L<<32, sizeof(unsigned int));
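
The two pitfalls can be checked with a small sketch in standard C. The function names xor_not_power and pow2_size are hypothetical, chosen only for this illustration. Casting the left operand to size_t is one way to make the shift happen in a type as wide as the argument that G_calloc() expects, independently of how wide "long" is:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* "^" is bitwise XOR: binary 000010 ^ 100000 = 100010, which is 34. */
static int xor_not_power(void)
{
    return 2 ^ 32;
}

/* Hypothetical helper: 2 to the power "bits", computed as a size_t.
 * The cast makes the shift operate on a size_t instead of an int.
 * Shifting by the full width of the type (or more) is undefined
 * behaviour, so "bits" must stay below the width of size_t. */
static size_t pow2_size(unsigned bits)
{
    assert(bits < sizeof(size_t) * CHAR_BIT);
    return (size_t)1 << bits;
}
```

With this, pow2_size(32) is only valid where size_t is wider than 32 bits; on a 32-bit platform the assertion fires instead of silently producing a wrong count.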

Finally:

1. As Paul has pointed out, the number of entries and their types are
unrelated.

2. If you are on a 32-bit platform, there is no way that you can have
an array containing 2^32 "int"s, or even 2^32 bytes; the process's
address space is limited to 2^32 bytes in total, and some of that is
already taken. Even if you have more than 4GiB of RAM and/or swap, you
can't have more than 4GiB for an individual process.
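
As a rough illustration of the limit described above, here is a sketch that refuses a request whose total byte count cannot be represented in a size_t. It uses the standard calloc() rather than G_calloc() so that it compiles without the GRASS headers; alloc_size_overflows and alloc_doubles are hypothetical names, not GRASS functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if count * elem_size does not fit
 * in a size_t, i.e. the request cannot even be expressed. */
static int alloc_size_overflows(size_t count, size_t elem_size)
{
    return elem_size != 0 && count > SIZE_MAX / elem_size;
}

/* Hypothetical wrapper: a zero-initialised array of doubles, or NULL
 * if the size overflows or the system cannot provide the memory. */
static double *alloc_doubles(size_t count)
{
    assert(count > 0);  /* calloc(0, n) has an implementation-defined result */
    if (alloc_size_overflows(count, sizeof(double)))
        return NULL;
    return calloc(count, sizeof(double));
}
```

On a 32-bit platform, a request for 2^32 doubles cannot even be passed as a size_t count, and a request for 2^29 doubles (4GiB in total) is rejected by the overflow check before any memory is touched.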

--
Glynn Clements <glynn@gclements.plus.com>

Dear Paul and Glynn,
yes: in order to be quick and short, I explained myself badly.
Your advice is all interesting and correct, but I need to be more precise about my case:

I do not want a Huge Array, but a standard array (S_Array) whose values are indices to a Huge Array (H_Array)!
Besides, 2^32 is not what I typed in my source file; it was just shorthand to indicate the value 4 billion.

Sorry for my faults!

So, my intention is, actually:

unsigned int * S_Array;
double * H_Array;
(...)
S_Array=(unsigned int *)G_calloc(100, sizeof(unsigned int));
H_Array=(double *)G_calloc(1L<<32, sizeof(double));

Does it make sense, now?
Damiano

----- Original Message ----- From: "Glynn Clements" <glynn@gclements.plus.com>
To: "Damiano Triglione" <damiano.triglione@polimi.it>
Cc: <grass-dev@grass.itc.it>
Sent: Friday, March 23, 2007 6:23 AM
Subject: Re: [GRASS-dev] Allocating memory for a huge array


On Fri, 23 Mar 2007, Damiano Triglione wrote:

Dear Paul and Glynn,
yes: in order to be quick and short, I explained myself badly.
Your advice is all interesting and correct, but I need to be more precise about my case:

I do not want a Huge Array, but a standard array (S_Array) whose values are indices to a Huge Array (H_Array)!
Besides, 2^32 is not what I typed in my source file; it was just shorthand to indicate the value 4 billion.

Sorry for my faults!

So, my intention is, actually:

unsigned int * S_Array;
double * H_Array;
(...)
S_Array=(unsigned int *)G_calloc(100, sizeof(unsigned int));

Why not:
size_t *S_Array;
...
S_Array=(size_t *)G_calloc(100, sizeof(size_t));

?

As I see it, you are relying on the assumption that sizeof(unsigned int) is the same as sizeof(size_t), which won't be true on all platforms (specifically: on 64-bit platforms size_t will be 64 bits wide, whereas int/unsigned int will be only 32 bits in most cases).
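
A small sketch of this suggestion, in standard C: the indices are stored as size_t, so any valid position in the big array can be represented whatever the platform's pointer width. The function sum_selected and its parameter names are hypothetical, used only to show the indexing pattern:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums the entries of "big" selected by the
 * indices stored in "idx". Both the indices and the lengths are
 * size_t, matching the type used for array sizes. */
static double sum_selected(const double *big, size_t big_len,
                           const size_t *idx, size_t idx_len)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < idx_len; i++) {
        assert(idx[i] < big_len);   /* each index must be in range */
        total += big[idx[i]];
    }
    return total;
}
```

Here "idx" plays the role of S_Array and "big" the role of H_Array. With unsigned int indices the same loop would work only while the big array has no more than UINT_MAX + 1 elements.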

H_Array=(double *)G_calloc(1L<<32, sizeof(double));

Does it make sense, now?
Damiano
