[GRASS-dev] G_put_raster_row and OpenMP

Hello,

I am trying to include OpenMP coding into raster processing so that each row
can be processed by a different thread.

The problem is that while rows 1 and 2 are still being processed, row 3 may
already have output ready to be written to the output file. We would
therefore need to specify the row number to write the output row to.

Right now,

int G_put_raster_row (int fd, const void * buf, RASTER_MAP_TYPE data_type)

does not include row number specification. So I tried to find other functions
in put_row.c:

int G_put_map_row_random (int fd, const CELL * buf, int row, int col, int n)

This function gives control over the (row, col) location of the data. However,
it writes only CELL data, which severely limits what can be done with it in
remote sensing processing.

Ideally, what we are looking for is a G_put_raster_random_row() function.
Specifying only the row location would actually be fine, since distributing
the processing by row is faster than doing it pixel-wise.

I don't know what the plans or interest are for raster code that goes
this way, but I believe this is something simple that would help OpenMP coding.

If nobody else wants to do it, I could try doing it.

If I missed a major concept, please do not hesitate to correct me :)

Yann
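
For illustration, the ordering problem described above can be worked around with a sequential-only writer by letting OpenMP compute rows concurrently while serialising the writes in row order. This is only a minimal sketch under stated assumptions: seq_put_row() and compute_row() are hypothetical stand-ins defined here (seq_put_row() mimics a writer with no row argument, like G_put_raster_row()), and no GRASS library calls are used.

```c
#include <assert.h>

#define NROWS 8
#define NCOLS 4

/* Hypothetical stand-in for a sequential-only writer: it takes no row
 * argument, so every call implicitly writes "the next row". */
typedef struct {
    double data[NROWS][NCOLS];
    int next_row;
} SeqOutput;

static void seq_put_row(SeqOutput *out, const double *buf)
{
    assert(out->next_row < NROWS);
    for (int c = 0; c < NCOLS; c++)
        out->data[out->next_row][c] = buf[c];
    out->next_row++;
}

/* Placeholder per-row computation (assumption: rows are independent). */
static void compute_row(int row, double *buf)
{
    for (int c = 0; c < NCOLS; c++)
        buf[c] = row * 100.0 + c;
}

/* Compute rows in parallel, but write them strictly in row order. */
static void process_all_rows(SeqOutput *out)
{
    out->next_row = 0;
#pragma omp parallel for ordered schedule(static, 1)
    for (int row = 0; row < NROWS; row++) {
        double buf[NCOLS];      /* private to each iteration */
        compute_row(row, buf);  /* may run concurrently */
#pragma omp ordered
        seq_put_row(out, buf);  /* one row at a time, in loop order */
    }
}
```

The cost of this approach is that a thread which finishes early waits at the ordered region until all earlier rows have been written, which is the stall a row-addressed write function would avoid. Without OpenMP enabled, the pragmas are ignored and the loop simply runs sequentially.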

Yann wrote:

> I am trying to include OpenMP coding into raster processing so that each row
> can be processed by a different thread.
>
> The problem is that while rows 1 and 2 are still being processed, row 3 may
> already have output ready to be written to the output file. We would
> therefore need to specify the row number to write the output row to.
>
> Right now,
>
> int G_put_raster_row (int fd, const void * buf, RASTER_MAP_TYPE data_type)
>
> does not include row number specification. So I tried to find other functions
> in put_row.c:
>
> int G_put_map_row_random (int fd, const CELL * buf, int row, int col, int n)
>
> This function gives control over the (row, col) location of the data. However,
> it writes only CELL data, which severely limits what can be done with it in
> remote sensing processing.
>
> Ideally, what we are looking for is a G_put_raster_random_row() function.
> Specifying only the row location would actually be fine, since distributing
> the processing by row is faster than doing it pixel-wise.
>
> I don't know what the plans or interest are for raster code that goes
> this way, but I believe this is something simple that would help OpenMP coding.
>
> If nobody else wants to do it, I could try doing it.
>
> If I missed a major concept, please do not hesitate to correct me :)

I don't know if it's "major", but one thing to bear in mind is that
the map must have been opened with G_open_cell_new_random() in order
to be able to use that function.

This has several consequences: the map is stored uncompressed, it
cannot contain nulls, and it cannot be floating-point (essentially,
when FP and null values were added in 5.x, the random-access case
wasn't upgraded).

Neither of these factors would be particularly hard to change.

The raster file contains a table of row pointers, so the rows don't
actually have to be stored in sequence in the file, even if they're
compressed. They could be written out in an arbitrary order (although
not concurrently).
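
As a rough illustration of the row pointer idea, the sketch below appends variable-length rows (standing in for compressed rows) to a temporary file in whatever order they arrive and records each row's offset in a table. It is a toy model, not the GRASS on-disk format: RowFile, rowfile_open(), rowfile_put() and rowfile_get() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NROWS 4

/* Toy model: row payloads are appended wherever the file currently
 * ends, and a table records where each row starts and how long it is. */
typedef struct {
    FILE *fp;
    long offset[NROWS];    /* -1 means "row not written yet" */
    size_t length[NROWS];
} RowFile;

static int rowfile_open(RowFile *rf)
{
    rf->fp = tmpfile();
    if (!rf->fp)
        return -1;
    for (int r = 0; r < NROWS; r++) {
        rf->offset[r] = -1;
        rf->length[r] = 0;
    }
    return 0;
}

/* Append one row at the end of the file and point the table at it.
 * Not thread-safe: the seek, write and table update must not be
 * interleaved with another writer. */
static int rowfile_put(RowFile *rf, int row, const void *data, size_t len)
{
    assert(row >= 0 && row < NROWS);
    if (fseek(rf->fp, 0L, SEEK_END) != 0)
        return -1;
    long pos = ftell(rf->fp);
    if (pos < 0)
        return -1;
    if (fwrite(data, 1, len, rf->fp) != len)
        return -1;
    rf->offset[row] = pos;
    rf->length[row] = len;
    return 0;
}

/* Read a row back through the table; returns its length or -1. */
static int rowfile_get(RowFile *rf, int row, void *data, size_t cap)
{
    assert(row >= 0 && row < NROWS);
    if (rf->offset[row] < 0 || rf->length[row] > cap)
        return -1;
    if (fseek(rf->fp, rf->offset[row], SEEK_SET) != 0)
        return -1;
    if (fread(data, 1, rf->length[row], rf->fp) != rf->length[row])
        return -1;
    return (int)rf->length[row];
}
```

Because each write's position depends on where the previous write ended, the rows can arrive in any order but the writes themselves have to be serialised, which matches the "arbitrary order, although not concurrently" point.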

The null bitmap isn't compressed, so it would be possible to write out
rows in an arbitrary order, or even concurrently. The existing code
can't handle this because it caches a block of consecutive rows and
writes them all out at once. Changing this would introduce
inefficiency, as the null file is opened and closed for each write (to
avoid doubling the number of open file descriptors).
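
To make the null bitmap point concrete: with an uncompressed bitmap of one bit per cell, every row has the same size, so a row's position depends only on its index and rows can be stored independently of each other. The sketch below works on an in-memory bitmap. The helper names are hypothetical, and the MSB-first bit order within a byte is an assumption made for the example rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed for one row of a 1-bit-per-cell bitmap. */
static size_t null_row_bytes(int cols)
{
    assert(cols >= 0);
    return ((size_t)cols + 7) / 8;
}

/* Byte offset of a row: fixed-size rows make this a pure function
 * of the row index, independent of what has been written so far. */
static size_t null_row_offset(int row, int cols)
{
    assert(row >= 0);
    return (size_t)row * null_row_bytes(cols);
}

/* Pack 0/1 flags into bits (assumed MSB-first within each byte). */
static void pack_null_flags(const unsigned char *flags, int cols,
                            unsigned char *bits)
{
    memset(bits, 0, null_row_bytes(cols));
    for (int c = 0; c < cols; c++)
        if (flags[c])
            bits[c / 8] |= (unsigned char)(0x80u >> (c % 8));
}

/* Store one packed row directly at its fixed position in the bitmap. */
static void put_null_row(unsigned char *bitmap, int row, int cols,
                         const unsigned char *flags)
{
    pack_null_flags(flags, cols, bitmap + null_row_offset(row, cols));
}
```

Since no two rows share a byte, writes to different rows never touch the same memory or file region, which is what makes out-of-order (and in principle concurrent) writes possible here.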

--
Glynn Clements <glynn@gclements.plus.com>

On Friday 30 November 2007 12:13:09 Glynn Clements wrote:

> I don't know if it's "major", but one thing to bear in mind is that
> the map must have been opened with G_open_cell_new_random() in order
> to be able to use that function.

OK, this is noted, it means I have to create a function to open a floating
point cell: G_open_fp_cell_new_random().

In opencell.c, G_open_cell_new_random() returns a call to
G__open_raster_new(name, type of Open). Where is this function located
please, and does it need modification in this case?

> This has several consequences: the map is stored uncompressed, it
> cannot contain nulls, and it cannot be floating-point (essentially,
> when FP and null values were added in 5.x, the random-access case
> wasn't upgraded).
>
> Neither of these factors would be particularly hard to change.

Could you direct me a bit on changing those?

> The raster file contains a table of row pointers, so the rows don't
> actually have to be stored in sequence in the file, even if they're
> compressed. They could be written out in an arbitrary order (although
> not concurrently).
>
> The null bitmap isn't compressed, so it would be possible to write out
> rows in an arbitrary order, or even concurrently. The existing code
> can't handle this because it caches a block of consecutive rows and
> writes them all out at once. Changing this would introduce
> inefficiency, as the null file is opened and closed for each write (to
> avoid doubling the number of open file descriptors).

Thank you,
Yann

--
Yann Chemin
International Rice Research Institute
Los Banos, Laguna
The Philippines

Yann wrote:

> I don't know if it's "major", but one thing to bear in mind is that
> the map must have been opened with G_open_cell_new_random() in order
> to be able to use that function.

OK, this is noted, it means I have to create a function to open a floating
point cell: G_open_fp_cell_new_random().

In opencell.c, G_open_cell_new_random() returns a call to
G__open_raster_new(name, type of Open). Where is this function located
please,

It's also in opencell.c, line 605 in the CVS HEAD version.

and does it need modification in this case?

Maybe. You may also need to change the code in put_row.c; there's no
guarantee that it allows for the combination of random access and
floating-point.

> This has several consequences: the map is stored uncompressed, it
> cannot contain nulls, and it cannot be floating-point (essentially,
> when FP and null values were added in 5.x, the random-access case
> wasn't upgraded).
>
> Neither of these factors would be particularly hard to change.

Could you direct me a bit on changing those?

Mostly you just need to check that put_row.c allows for all of the
cases.

Regarding the buffering of null values, change NULL_ROWS_INMEM (in
lib/gis/G.h) to 1.

One thing to bear in mind with compressed output is that overwriting a
row won't remove the data for the old row from the file.
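
The effect of rewriting a compressed row can be sketched with a toy size-only model. It assumes, for illustration, that a rewritten row is appended at the end of the file and the row table is simply repointed at the new copy; AppendStore and its functions are hypothetical names, not GRASS code.

```c
#include <assert.h>
#include <stddef.h>

#define NROWS 4

/* Toy append-only row store: tracks sizes only, not the bytes. */
typedef struct {
    size_t file_size;        /* total bytes ever appended */
    size_t row_len[NROWS];   /* length of the current copy of each row */
} AppendStore;

static void store_init(AppendStore *s)
{
    s->file_size = 0;
    for (int r = 0; r < NROWS; r++)
        s->row_len[r] = 0;
}

/* Write (or rewrite) a row: new data goes at the end of the file and
 * the table entry is updated; any older copy stays where it was. */
static void store_put(AppendStore *s, int row, size_t compressed_len)
{
    assert(row >= 0 && row < NROWS);
    s->file_size += compressed_len;
    s->row_len[row] = compressed_len;
}

/* Bytes still reachable through the row table. */
static size_t store_live_bytes(const AppendStore *s)
{
    size_t total = 0;
    for (int r = 0; r < NROWS; r++)
        total += s->row_len[r];
    return total;
}

/* Bytes left behind by overwritten rows. */
static size_t store_dead_bytes(const AppendStore *s)
{
    return s->file_size - store_live_bytes(s);
}
```

For example, writing row 0 with 10 bytes, row 1 with 20 bytes and then row 0 again with 7 bytes leaves a 37-byte file of which only 27 bytes are still referenced.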

--
Glynn Clements <glynn@gclements.plus.com>

[Please keep discussions on the mailing list.]

Yann wrote:

> > > This has several consequences: the map is stored uncompressed, it
> > > cannot contain nulls, and it cannot be floating-point (essentially,
> > > when FP and null values were added in 5.x, the random-access case
> > > wasn't upgraded).
> > >
> > > Neither of these factors would be particularly hard to change.
> >
> > Could you direct me a bit on changing those?
>
> Mostly you just need to check that put_row.c allows for all of the
> cases.

int G_put_fp_map_row_random(int fd, const void *buf, int row, int col,
                            int n, RASTER_MAP_TYPE data_type)

I suggest a generic G_put_raster_row_random() which works with any
data type.
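
One way to picture a type-generic row writer is to derive the cell size from the data type and do all buffer arithmetic in bytes. The sketch below is a self-contained illustration against a flat in-memory raster, not a proposal for the actual library code: MapType, cell_size(), skip_cells() and put_row_random() are hypothetical names defined here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the three raster cell types. */
typedef enum { SK_CELL, SK_FCELL, SK_DCELL } MapType;

/* Size in bytes of one cell of the given type (0 if unknown). */
static size_t cell_size(MapType t)
{
    switch (t) {
    case SK_CELL:  return sizeof(int);
    case SK_FCELL: return sizeof(float);
    case SK_DCELL: return sizeof(double);
    }
    return 0;
}

/* Advance an untyped buffer by a number of cells. Arithmetic on
 * void * is not standard C, so go through unsigned char * and
 * scale by the cell size. */
static const void *skip_cells(const void *buf, int ncells, MapType t)
{
    assert(ncells >= 0);
    return (const unsigned char *)buf + (size_t)ncells * cell_size(t);
}

/* Copy n cells of any type into row `row`, starting at column `col`,
 * of a flat in-memory raster with ncols columns.
 * Returns 1 on success, -1 on invalid arguments. */
static int put_row_random(void *raster, int ncols, int row, int col,
                          const void *buf, int n, MapType t)
{
    size_t sz = cell_size(t);
    if (sz == 0 || row < 0 || col < 0 || n < 0 || col + n > ncols)
        return -1;
    unsigned char *dst =
        (unsigned char *)raster + ((size_t)row * ncols + col) * sz;
    memcpy(dst, buf, (size_t)n * sz);
    return 1;
}
```

Keeping the type as a parameter means a single function covers integer, float and double rows, and any column offset applied to the input buffer has to be scaled by the cell size rather than added to the pointer directly.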

Also, as a matter of convention, functions with _raster_ in their
names correspond to the new raster API (which supports FP and null),
while those which use _map_ correspond to the old API (integer only,
zero is null).

{
    struct fileinfo *fcb = &G__.fileinfo[fd];
    if (!check_open("G_put_fp_map_row_random", fd, 1))
        return -1;
    buf += adjust(fd, &col, &n);
    switch (put_fp_data(fd, buf, row, col, n, data_type))
    {
    case -1: return -1;
    case 0: return 1;
    }
    G_row_update_range (buf, n, &fcb->range);
    return 1;
}

static int put_fp_data(int fd, const void *rast, int row, int col, int n,
RASTER_MAP_TYPE data_type), at line 456 of put_row.c, seems to handle
uncompressed FCELL and DCELL.

Note that put_fp_data() only writes the data, not the null bitmap.

--
Glynn Clements <glynn@gclements.plus.com>