Markus Neteler wrote:

I have tried to average elevations for selected zones with r.average,
but wasn't quite successful.

base map: raster map of farms with category number per farm

cover map: FP DEM

Example with Spearfish 5.7:

r.info -t elevation.10m

datatype=DCELL

r.average -c base=fields cover=elevation.10m out=fields_elev_mean

WARNING: r.stats: cats for elevation.10m are either missing or have no

explicit labels. Using nsteps=255

This message is because r.stats is being passed the -C flag. I don't

know whether or not that's the right thing to do.

r.stats: 100%

percent complete: 100%

r.average is calling r.stats to calculate the result, but probably

the module needs a significant change to work properly with

FP data?

Any other idea?

The nature of r.stats is such that it doesn't make sense to use it on

FP maps; the behaviour would typically degenerate to "r.stats -1",

i.e. each FP value which actually occurred in the map would be output

with an associated count of 1 (or an associated area equal to the area

of a cell).

Programs such as r.statistics or r.average may as well use something

like:

r.stats -1 | sort -n -k 1 | uniq -c | awk '{print $2, $3, $1}'

for FP maps, and scale the cell counts by the area of a cell.

Note that this will be significantly slower than using r.stats.

However, it will be more accurate, as no quantisation is occurring.
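To see what that pipeline does, here is a toy run on a hand-written
sample of "base cover" pairs, standing in for what r.stats -1 would
print (one line per cell; the values are made up):

```shell
# Four cells: category 1 occurs with elevation 100.5 twice and 101.0
# once; category 2 occurs with elevation 99.0 once.  The pipeline
# collates duplicate (base, cover) pairs and appends their counts.
printf '1 100.5\n1 100.5\n1 101.0\n2 99.0\n' \
    | sort -n -k 1 | uniq -c | awk '{print $2, $3, $1}'
# 1 100.5 2
# 1 101.0 1
# 2 99.0 1
```

Each output line is "baseval coverval count", i.e. the same shape as
r.stats' own per-category output, but without any quantisation.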

With the r.stats approach, increasing the quantisation granularity

would improve accuracy; OTOH it will also increase r.stats' memory

consumption. In the degenerate case, r.stats will end up with one

"bin" per cell, i.e. storing both maps in memory in a very inefficient

format (a 28-byte Node structure per cell in addition to the actual

cell values).

The real solution would be for r.statistics and r.average to do the

collation themselves, with an accumulator structure for each category

in the base map. For each (baseval,coverval) pair, you would update

the accumulator corresponding to the base value.

E.g. for method=average (pseudo-code):

int num_cats = base_max - base_min + 1;

struct accum {
	DCELL sum;
	int count;
} *accumulator = G_malloc(num_cats * sizeof(struct accum));

int i;

for (i = 0; i < num_cats; i++)
{
	accumulator[i].sum = 0;
	accumulator[i].count = 0;
}

for (row = 0; row < cellhd.rows; row++)
{
	G_get_c_raster_row(base_fd, base_buf, row);
	G_get_d_raster_row(cover_fd, cover_buf, row);

	for (col = 0; col < cellhd.cols; col++)
	{
		CELL baseval = base_buf[col];
		DCELL coverval = cover_buf[col];
		int idx = baseval - base_min;

		accumulator[idx].sum += coverval;
		accumulator[idx].count++;
	}
}

for (i = 0; i < num_cats; i++)
	if (accumulator[i].count > 0)
		average[i] = accumulator[i].sum / accumulator[i].count;

This requires memory proportional to the range of base categories. I

suppose that there would be a problem if the categories were sparse,

but that would also be a problem when generating the reclass table.

The sum, standard deviation, variance, etc. can be computed similarly.

Computing the median is somewhat harder; you may need to use multiple

passes unless you're willing to store an entire map in memory. The

mode probably isn't meaningful for FP data.

--

Glynn Clements <glynn.clements@virgin.net>