[GRASS-dev] question about the concept 'window'

Hi

In function static int compute_window_row(int fd, int row, int *cellRow) in ~/grass/7.0/grass_trunk/lib/raster/get_row.c, there is a check to see if row is in window, and if not, 0 is returned.

I came across this function from get_map_row_nomask(...) in the same file (which came all the way from Rast_get_row(fds[i], rowbuf, row, maptype);), and the result of compute_window_row(...) is used to determine whether to "read cell file row if not in memory".

I'm a little confused about the concept 'window' here, and how does that relate to "in memory"?

My understanding is that "window" is a "visible area", and a row is only read from the disk file if it's within this "window" area. Then what would happen if Rast_get_row is used to get the exact same row in exactly the same map more than once? Does the second call NOT read the row from the file because the first call has put this row in memory? In this case, is the requested row copied from an inner buffer that was filled with the data during the first call?

Thanks,
Peng

Peng Du wrote:

In function static int compute_window_row(int fd, int row, int *cellRow)
in ~/grass/7.0/grass_trunk/lib/raster/get_row.c, there is a check to see
if row is in window, and if not, 0 is returned.

I came across this function from get_map_row_nomask(...) in the same
file (which came all the way from Rast_get_row(fds[i], rowbuf, row,
maptype);), and the result of compute_window_row(...) is used to
determine whether to "read cell file row if not in memory".

I'm a little confused about the concept 'window' here, and how does that
relate to "in memory"?

The "window", also referred to as the "region", defines a grid in
cartographic space. It consists of a rectangle defined by the north,
south, east and west bounds, and the number of rows and columns.

All input raster maps are resampled, cropped and/or padded with nulls
according to the window. This allows a module to read multiple raster
maps which may have differing bounds and resolutions without having to
handle the details explicitly.
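
To make the window-to-map relationship concrete, here is a minimal sketch in plain C. It is not the library's code: the struct region type and the function names are hypothetical, and sampling at the centre of each window row is an assumption made for illustration. A return value of 0 corresponds to the "row is not in the map" case, where the caller would fill the row with nulls.

```c
#include <assert.h>

/* Hypothetical, simplified region: bounds plus grid size. */
struct region {
    double north, south, east, west;
    int rows, cols;
};

/* North-south size of one row. */
static double ns_res(const struct region *r)
{
    assert(r->rows > 0);
    return (r->north - r->south) / r->rows;
}

/*
 * Map a row of the window to a row of the underlying map, sampling at
 * the centre of the window row (an assumption for this sketch).
 * Returns 1 and stores the map row if the window row falls inside the
 * map, 0 otherwise.
 */
static int window_row_to_map_row(const struct region *win,
                                 const struct region *map,
                                 int win_row, int *map_row)
{
    double y = win->north - (win_row + 0.5) * ns_res(win);
    double f = (map->north - y) / ns_res(map);

    if (f < 0.0 || f >= map->rows)
        return 0;               /* outside the map: caller pads with nulls */

    *map_row = (int)f;          /* f is non-negative, so truncation is floor */
    return 1;
}
```

With a window of 10 rows at resolution 10 and a map of 3 rows at resolution 20, window rows 2 and 3 both land on map row 0, while rows above or below the map's extent return 0.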

My understanding is that "window" is a "visible area", and a row is only
read from the disk file if it's within this "window" area. Then what
would happen if Rast_get_row is used to get the exact same row in
exactly the same map more than once? Does the second call NOT read the
row from the file because the first call has put this row in memory? In
this case, is the requested row copied from an inner buffer that was
filled with the data during the first call?

Correct.

The file lib/raster/R.h defines the core data structures used by the
raster library. R__.fileinfo points to an array of fileinfo
structures, indexed by the map's file descriptor (this is the actual
Unix file descriptor of the cell/fcell file).

Within the fileinfo structure, the "data" field points to the data for
the current row, after decompression but before resampling or
conversion. The "cur_row" field contains the number of the row to
which the data corresponds.

get_map_row_nomask() (which all of the higher-level "get row"
functions use) checks whether the row being requested is equal to the
row which is stored; if it is, it doesn't bother to read the row from
disc again.

If an input map has a larger (coarser) north-south resolution than the
current region, adjacent rows within the current region will often map
to the same row in the underlying map. By caching the most recent row
in the fileinfo structure, redundant disc access is avoided.
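
The single-row cache described here can be sketched as follows. This is an illustration only, not the library's implementation: struct row_cache, fake_read_row() and the disk_reads counter are hypothetical names, and the "disc read" is simulated by filling the buffer with synthetic values.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-map state kept by the library. */
struct row_cache {
    int cur_row;     /* map row currently held in data, -1 if none */
    int ncols;
    int *data;       /* raw data for cur_row */
    int disk_reads;  /* illustration only: counts simulated reads */
};

/* Simulated disc read: fills the buffer with values derived from the row. */
static void fake_read_row(int *buf, int ncols, int row)
{
    for (int c = 0; c < ncols; c++)
        buf[c] = row * 1000 + c;
}

/* Returns 1 on success, 0 if the buffer could not be allocated. */
static int cache_init(struct row_cache *c, int ncols)
{
    c->cur_row = -1;
    c->ncols = ncols;
    c->disk_reads = 0;
    c->data = malloc((size_t)ncols * sizeof *c->data);
    return c->data != NULL;
}

/* Read the row from "disc" only if it is not the one already stored. */
static void cached_get_row(struct row_cache *c, int row, int *out)
{
    assert(c->data != NULL);
    if (row != c->cur_row) {
        fake_read_row(c->data, c->ncols, row);
        c->cur_row = row;
        c->disk_reads++;
    }
    memcpy(out, c->data, (size_t)c->ncols * sizeof *out);
}

static void cache_free(struct row_cache *c)
{
    free(c->data);
    c->data = NULL;
}
```

Requesting map rows 0, 0, 1, 1, 2 in that order (as would happen when the window is twice as fine as the map) triggers three simulated reads rather than five.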

Note that the "processed" data cannot be cached, as it may depend upon
a MASK map, which can have a finer resolution than the map itself.
Also, it's permitted to change the window while a map is being read,
and it's also permitted to change the requested format (CELL, FCELL or
DCELL).

If the processed data was cached, it would be necessary to store
additional information about the cached data in order to determine
whether it can be re-used. Storing the "raw" data from the file
doesn't have these problems, as the only processing (the
decompression) doesn't depend upon any parameters or settings.

--
Glynn Clements <glynn@gclements.plus.com>

Thanks Glynn. Now I have a much better picture of this.

Another question: is there any function similar to get_map_row(…) that gets a column of data from a raster map rather than a row?

Thanks,
Peng

On 7/18/2010 3:34 PM, Glynn Clements wrote:

Peng Du wrote:

Another question: is there any function similar to get_map_row(...) that
gets a column of data from a raster map rather than a row?

No.

Raster maps are stored row-by-row. If you want to access a map in a
significantly different order, you should first copy the map to a
format suitable for such access (either in memory or a temporary
file).
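
As a rough sketch of the in-memory option: read the map once, row by row, into a flat array, then pull columns out of that array. The row_reader callback type, load_map(), get_column() and demo_reader() are hypothetical names for this example; in a real module the callback would wrap whatever row-reading call is in use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the call that reads one row of the map. */
typedef void (*row_reader)(int row, int *buf, int ncols);

/*
 * Read the whole map row by row into one flat, row-major array.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static int *load_map(row_reader read_row, int nrows, int ncols)
{
    assert(nrows > 0 && ncols > 0);

    int *cells = malloc((size_t)nrows * ncols * sizeof *cells);
    if (cells == NULL)
        return NULL;

    for (int row = 0; row < nrows; row++)
        read_row(row, cells + (size_t)row * ncols, ncols);

    return cells;
}

/* Copy one column of the in-memory map into out (nrows values). */
static void get_column(const int *cells, int nrows, int ncols,
                       int col, int *out)
{
    assert(col >= 0 && col < ncols);
    for (int row = 0; row < nrows; row++)
        out[row] = cells[(size_t)row * ncols + col];
}

/* Demo reader producing synthetic values: row * 10 + column. */
static void demo_reader(int row, int *buf, int ncols)
{
    for (int c = 0; c < ncols; c++)
        buf[c] = row * 10 + c;
}
```

This only works when the map fits in memory; for larger maps the same idea applies with a temporary file in place of the array.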

--
Glynn Clements <glynn@gclements.plus.com>