[GRASS5] r.average, r.statistics

Dear GRASS developers,

I have two maps: 'landuse' (CELL) and 'slope' (FCELL), and I need to average
'slope' for every category in 'landuse'. Seems like r.average would do it, so
I do:

GRASS:~ > r.average base=landuse cover=slope output=landuse.slope

But:

1) r.average help says:

"The output map is actually a reclass of the base map (see r.reclass), and
will have exactly the same category values as the base map. The averaged
values computed by r.average are stored in the output map's category labels
file."

But this is not what I see from r.info, r.report, r.cats! For example:

GRASS:~ > r.report landuse.slope
...
|----------------------------------------------------------------------------|
| Category Information |
| #|description |
|----------------------------------------------------------------------------|
| 0.2-0.266667|from to |
| 0.466667-0.533333|from to |
| 0.533333-0.6|from to |
| 0.6-0.666667|from to |
| 0.666667-0.733333|from to |
...

This looks totally different from what help promices:
   - averaged values were put in category values, not labels
   - category labels are meaningless

From this I see no way to determine the relationship I need, i.e. something
like:
  landuse category 1 --> average slope
  ...
  landuse category N --> average slope

Note that the average slope values need to be actual, not classified, like
r.report gives...

How do I go about it?

Also:

2) r.statistics is designed to do the same as r.average, r.mode, r.median,
etc. Why not deprecate r.average & others in favor of r.statistics? This
would reduce the number of modules, and I think that GRASS really needs this.

The long list of raster modules is confusing, especially given that too many
of them either duplicate each other, or don't work the way they are supposed
to (according to help). Well, you can see I've had a tough day with GRASS
today :slight_smile:

Is r.statistics supported though? I tried using it for the problem above, it
complained that r.stats doesn't accept <z> flag. Changed line 29 of
src/raster/r.statistics/cmd/o_average.c
from
  sprintf (command, "%s -az input='%s,%s' fs=space > %s", ...
to:
  sprintf (command, "%s -an input='%s,%s' fs=space > %s", ...

Now I can run it:
GRASS:~ > r.statistics base=landuse cover=slope method=average
output=landuse.slope1

but the output map 'landuse.slope1' is totally wrong. r.report says that all
it's cells are null, and r.info seems to get stuck in a loop and needs Ctrl-C.

Thanks,

Aleksey

Aleksey Naumov wrote:

I have two maps: 'landuse' (CELL) and 'slope' (FCELL), and I need to average
'slope' for every category in 'landuse'. Seems like r.average would do it, so
I do:

GRASS:~ > r.average base=landuse cover=slope output=landuse.slope

But:

1) r.average help says:

"The output map is actually a reclass of the base map (see r.reclass), and
will have exactly the same category values as the base map. The averaged
values computed by r.average are stored in the output map's category labels
file."

But this is not what I see from r.info, r.report, r.cats! For example:

It appears that r.average was changed to use r.recode instead of
r.reclass when floating-point support was added. AFAIK, reclass maps
can only have integer values.

r.statistics *should* do it, but you note below that it doesn't.

You can extract the landuse/average-slope table from the computed
landuse.slope map with:

r.stats -1gn landuse | awk '{print $1,$2}' | r.what input=landuse,landuse.slope | awk -F'|' '{print $4,$5}' | uniq | sort | uniq

This is significantly less efficient (and more cumbersome) than the
original r.average.

You could modify r.average so that it doesn't delete the temporary
file ("tempfile2" variable) which contains the r.recode rules.

Also:

2) r.statistics is designed to do the same as r.average, r.mode, r.median,
etc. Why not deprecate r.average & others in favor of r.statistics? This
would reduce the number of modules, and I think that GRASS really needs this.

The long list of raster modules is confusing, especially given that too many
of them either duplicate each other, or don't work the way they are supposed
to (according to help). Well, you can see I've had a tough day with GRASS
today :slight_smile:

Is r.statistics supported though? I tried using it for the problem above, it
complained that r.stats doesn't accept <z> flag. Changed line 29 of
src/raster/r.statistics/cmd/o_average.c
from
  sprintf (command, "%s -az input='%s,%s' fs=space > %s", ...
to:
  sprintf (command, "%s -an input='%s,%s' fs=space > %s", ...

Now I can run it:
GRASS:~ > r.statistics base=landuse cover=slope method=average output=landuse.slope1

but the output map 'landuse.slope1' is totally wrong. r.report says that all
it's cells are null, and r.info seems to get stuck in a loop and needs Ctrl-C.

If I try this using the landuse and slope maps from the "spearfish"
dataset, it works. r.info doesn't produce useful information, but
r.cats does.

I'll commit the "-z/-n" fix to r.statistics.

--
Glynn Clements <glynn.clements@virgin.net>

Glynn,

You can extract the landuse/average-slope table from the computed
landuse.slope map with:

r.stats -1gn landuse | awk '{print $1,$2}' | r.what
input=landuse,landuse.slope | awk -F'|' '{print $4,$5}' | uniq | sort |
uniq

This is significantly less efficient (and more cumbersome) than the
original r.average.

Thanks for your help! This is the scariest-looking piece of GRASS/UNIX magic
I ever used, but it does the job (aside from the fact that it sorts landuse
values as strings, not numerically). I'll have to remember this for when I
want to impress somebody :slight_smile: Seriously, though, this is something to fall
back on, but I'd rather have r.statistics work properly (see below).

> Is r.statistics supported though? I tried using it for the problem above,
> it complained that r.stats doesn't accept <z> flag. Changed line 29 of
> src/raster/r.statistics/cmd/o_average.c
> from
> sprintf (command, "%s -az input='%s,%s' fs=space > %s", ...
> to:
> sprintf (command, "%s -an input='%s,%s' fs=space > %s", ...
>
> Now I can run it:
> GRASS:~ > r.statistics base=landuse cover=slope method=average
> output=landuse.slope1
>
> but the output map 'landuse.slope1' is totally wrong. r.report says that
> all it's cells are null, and r.info seems to get stuck in a loop and
> needs Ctrl-C.

If I try this using the landuse and slope maps from the "spearfish"
dataset, it works. r.info doesn't produce useful information, but
r.cats does.

Yes, I just found out that r.statistics works fine when cover is CELL, but it
generates a null map when cover is a FCELL map. Try this in spearfish:

r.slope.aspect elevation=elevation.dem slope=slope.f format=percent
r.statistics base=landuse cover=slope.f method=average output=landuse.slope

Then landuse.slope is all null as shown by r.report.

Obviously, it's a bug in r.statistics. I'll try to look at it, but would
appreciate any hints...

I'll commit the "-z/-n" fix to r.statistics.

Thanks.

Aleksey

Aleksey Naumov wrote:

> > Is r.statistics supported though? I tried using it for the problem above,
> > it complained that r.stats doesn't accept <z> flag. Changed line 29 of
> > src/raster/r.statistics/cmd/o_average.c
> > from
> > sprintf (command, "%s -az input='%s,%s' fs=space > %s", ...
> > to:
> > sprintf (command, "%s -an input='%s,%s' fs=space > %s", ...
> >
> > Now I can run it:
> > GRASS:~ > r.statistics base=landuse cover=slope method=average
> > output=landuse.slope1
> >
> > but the output map 'landuse.slope1' is totally wrong. r.report says that
> > all it's cells are null, and r.info seems to get stuck in a loop and
> > needs Ctrl-C.
>
> If I try this using the landuse and slope maps from the "spearfish"
> dataset, it works. r.info doesn't produce useful information, but
> r.cats does.

Yes, I just found out that r.statistics works fine when cover is CELL, but it
generates a null map when cover is a FCELL map. Try this in spearfish:

r.slope.aspect elevation=elevation.dem slope=slope.f format=percent
r.statistics base=landuse cover=slope.f method=average output=landuse.slope

Then landuse.slope is all null as shown by r.report.

Obviously, it's a bug in r.statistics. I'll try to look at it, but would
appreciate any hints...

It's basically an artifact of the way that r.statistics works.
r.statistics invokes r.stats with the base and cover maps, then
processes the output using the fact that r.stats' output is sorted in
order of the categories from the first map (the base map).

r.stats essentially works on categories. With an FP map, it partitions
the map's range into bands, and treats each band as a category. The
output format shows the band's range instead of an individual value,
and this confuses r.statistics.

To get FP maps to work, start by changing the r.stats command from
something like:

    sprintf (command, "r.stats -an input='%s,%s' fs=space > %s",
  basemap, covermap, tempfile1);

to something like:

    sprintf (command, "r.stats -1n input='%s,%s' fs=space | sort -n | awk '{print $1,$2,1}' > %s",
  basemap, covermap, tempfile1);

You will also need to change the processing loop to use float/double
instead of integer for the second column ("covercat" variable).

--
Glynn Clements <glynn.clements@virgin.net>

Thanks again, Glynn.

I see what you're suggesting (also had some time to look at the code to see
where the problem is). However, I hesitate to do a quick hack, just to get
'r.statistics ... method=average' working for FCELL maps.
For now I'll simply convert my slope map to CELL. When I get a little time,
I'll look at r.statistics in more detail and try to modify it to work with
FCELL/DCELL.

Aleksey

It's basically an artifact of the way that r.statistics works.
r.statistics invokes r.stats with the base and cover maps, then
processes the output using the fact that r.stats' output is sorted in
order of the categories from the first map (the base map).

r.stats essentially works on categories. With an FP map, it partitions
the map's range into bands, and treats each band as a category. The
output format shows the band's range instead of an individual value,
and this confuses r.statistics.

To get FP maps to work, start by changing the r.stats command from
something like:

    sprintf (command, "r.stats -an input='%s,%s' fs=space > %s",
  basemap, covermap, tempfile1);

to something like:

    sprintf (command, "r.stats -1n input='%s,%s' fs=space | sort -n | awk
'{print $1,$2,1}' > %s", basemap, covermap, tempfile1);

You will also need to change the processing loop to use float/double
instead of integer for the second column ("covercat" variable).