[GRASS-dev] G_exp_colors()?

Hi,

I am working with MODIS Aqua imagery in HDF4 format. (level 3)
  http://oceancolor.gsfc.nasa.gov/PRODUCTS/

The Chlorophyll-a concentration data comes in log form, 0-65535.
Plotting the data with the bcyr color rules makes it look great.

I added a mini-tutorial here:
  http://grass.gdf-hannover.de/wiki/MODIS

The metadata (available through gdalinfo) gives the formula and test
points to convert to mg/m^3:
  r.mapcalc "${map}.chlor_a = 10^(($Slope * $map) + $Intercept)"

Extended r.univar results + percentile=33,67 of the chlor-a map:
n=16522204
null_cells=20802596
min=0.01
max=64.5654
range=64.5554
mean=0.263613
mean_of_abs=0.263613
stddev=1.1348
variance=1.28777
coeff_var=430.478
sum=4355475.7290696017
first_quartile=0.0717828
median=0.129749
third_quartile=0.211782
percentile_33=0.0893581
percentile_67=0.176154

i.e. a histogram of that starts high and drops exponentially to 0. By 2.0
there are few data cells left, but real data continues up to the max in
areas with plankton blooms.

So how to make nice color rules for that? 'r.colors -g' does the opposite
of what I want, 'r.colors -e' shows some more detail but isn't really
right. A custom one using 'r.univar percentile=' cues shows some more
but isn't smooth and is a pain.

Any ideas? Could G_log_colors() be turned on its head to make a
G_exp_colors() function and associated 'r.colors -x' flag? Would it work?

I can convert the colors given here:
  http://oceancolor.gsfc.nasa.gov/PRODUCTS/colorbars.html
from 0-255 to min->max using the 10^(mx+b) formula, and then pass those
as rules to r.colors:
  http://trac.osgeo.org/grass/browser/grass-addons/raster/r.colors.tools/palettes
So this isn't an urgent need, but it could be a useful feature.

thanks,
Hamish

Hamish wrote:

> I am working with MODIS Aqua imagery in HDF4 format. (level 3)
>   http://oceancolor.gsfc.nasa.gov/PRODUCTS/
...
> So how to make nice color rules for that? 'r.colors -g' does the opposite
> of what I want,

Odd. Theoretically, it should do exactly what you want.

> 'r.colors -e' shows some more detail but isn't really
> right. A custom one using 'r.univar percentile=' cues shows some more
> but isn't smooth and is a pain.
>
> Any ideas? Could G_log_colors() be turned on its head to make a
> G_exp_colors() function and associated 'r.colors -x' flag? Would it work?

It could be done easily enough, although I'm not sure how useful it
would be.

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:
> I am working with MODIS Aqua imagery in HDF4 format. (level 3)

...

> I added a mini-tutorial here:
> http://grass.gdf-hannover.de/wiki/MODIS

...

> The Chlorophyll-a concentration data comes in log form, 0-65535.

...

> The metadata (available through gdalinfo) gives the formula and test
> points to convert to mg/m^3:
> r.mapcalc "${map}.chlor_a = 10^(($Slope * $map) + $Intercept)"

> ie a histogram of that starts high and exponentially drops to 0. By
> 2.0 there are few data cells left, but real data continues up to the
> max in areas having plankton blooms.

...

> So how to make nice color rules for that? 'r.colors -g' does the
> opposite of what I want,

Glynn:
> Odd. Theoretically, it should do exactly what you want.

You are right, -g is the right thing to use, it's just that the data is
/so/ logarithmic that it only brings out a little more detail and it
appears like nothing happened. (it became clear after displaying a
legend, changing the color rules, then redrawing)

Hamish


Hamish wrote:

> > I am working with MODIS Aqua imagery in HDF4 format. (level 3)
...
> > I added a mini-tutorial here:
> > http://grass.gdf-hannover.de/wiki/MODIS
...
> > The Chlorophyll-a concentration data comes in log form, 0-65535.
...
> > The metadata (available through gdalinfo) gives the formula and test
> > points to convert to mg/m^3:
> > r.mapcalc "${map}.chlor_a = 10^(($Slope * $map) + $Intercept)"
>
> > ie a histogram of that starts high and exponentially drops to 0. By
> > 2.0 there are few data cells left, but real data continues up to the
> > max in areas having plankton blooms.
...
> > So how to make nice color rules for that? 'r.colors -g' does the
> > opposite of what I want,

Glynn:
> Odd. Theoretically, it should do exactly what you want.

Hamish:
> You are right, -g is the right thing to use, it's just that the data is
> /so/ logarithmic that it only brings out a little more detail and it
> appears like nothing happened. (it became clear after displaying a
> legend, changing the color rules, then redrawing)

Strange. I would have expected to see exactly the same results from:

  r.colors color=bcyr map=$map
and:
  r.colors -g color=bcyr map=${map}.chlor_a

AFAICT, the -g flag should effectively "undo" the exponential
conversion. All of the constant factors (e.g. $Slope and $Intercept)
cancel out from the equations.

Maybe the colour table needs more samples? r.colors is hard-coded to
use 100 samples:

  G_log_colors(&colors_tmp, &colors, 100);

Do you get better results if you increase that number?

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:
> > > I am working with MODIS Aqua imagery in HDF4 format. (level 3)
> ...
> > > I added a mini-tutorial here:
> > > http://grass.gdf-hannover.de/wiki/MODIS
> ...
> > > The Chlorophyll-a concentration data comes in log form, 0-65535.
> ...
> > > The metadata (available through gdalinfo) gives the formula and
> > > test points to convert to mg/m^3:
> > > r.mapcalc "${map}.chlor_a = 10^(($Slope * $map) + $Intercept)"
> > >
> > > ie a histogram of that starts high and exponentially drops to 0.
> > > By 2.0 there are few data cells left, but real data continues
> > > up to the max in areas having plankton blooms.
> ...
> > > So how to make nice color rules for that? 'r.colors -g' does the
> > > opposite of what I want,
>
> Glynn:
> > Odd. Theoretically, it should do exactly what you want.
Hamish:
> You are right, -g is the right thing to use, it's just that the data
> is /so/ logarithmic that it only brings out a little more detail and
> it appears like nothing happened. (it became clear after displaying
> a legend, changing the color rules, then redrawing)

Glynn:
> Strange. I would have expected to see exactly the same results from:
>
>   r.colors color=bcyr map=$map
> and:
>   r.colors -g color=bcyr map=${map}.chlor_a
>
> AFAICT, the -g flag should effectively "undo" the exponential
> conversion. All of the constant factors (e.g. $Slope and $Intercept)
> cancel out from the equations.
>
> Maybe the colour table needs more samples? r.colors is hard-coded to
> use 100 samples:
>
>   G_log_colors(&colors_tmp, &colors, 100);
>
> Do you get better results if you increase that number?

No, I get the same.

I put some screenshots and r.univar info about the rasters here:
  http://bambi.otago.ac.nz/hamish/grass/bugs/r.colors/

You can see on that page that the number of samples is not the problem
(reducing it to 5 gives nearly the same result). The problem is that 90%
of the data falls below 0.4, but the colr/ rules are only written out at
integer values, so those cells are always scaled linearly between the 0
and 1 colr/ rules.

Hamish


Hamish wrote:

> the problem is the data
> is 90% done by 0.4, but the colr/ rules are only written out on integer
> values

Ah. Does the attached patch fix the problem?

--
Glynn Clements <glynn@gclements.plus.com>

(attachments)

color_xform.diff (400 Bytes)

Hamish wrote:
> the problem is the data is 90% done by 0.4, but the colr/ rules
> are only written out on integer values

Glynn:
> Ah. Does the attached patch fix the problem?

yes, it does. thanks.

There are still slight differences (white spots),

G63> r.info -r A20023352002365.L3m_MO_CHLO_4.chlor_A
min=0.010000
max=64.565418

G63> r.colors A20023352002365.L3m_MO_CHLO_4.chlor_A col=bcyr -g
G63> cat colr/A20023352002365.L3m_MO_CHLO_4.chlor_A
% 0.010000000000000003677613769071 64.565417527653337970150460023433
nv:255
*:255
0.010000000000000003677613769071:0:0:255 0.0109169168:0:7:255
0.0109169168:0:7:255 0.0119179072:0:15:255
0.0119179072:0:15:255 0.0130106801:0:22:255
0.0130106801:0:22:255 0.0142036512:0:30:255
0.0142036512:0:30:255 0.0155060078:0:38:255
0.0155060078:0:38:255 0.0169277797:0:45:255
0.0169277797:0:45:255 0.0184799162:0:53:255
....
49.6249366001:255:23:0 54.1751303026:255:16:0
54.1751303026:255:16:0 59.1425389006:255:8:0
59.1425389006:255:8:0 64.565417527653337970150460023433:255:0:0

the lowest values are less than the first color rule, so they display as
white. So do the max cell values, which are slightly bigger than the last
color rule's value despite its excessive number of decimal digits.

replace r.colors/rules.c "%.25f" with %.15g or %.16g ?
http://article.gmane.org/gmane.comp.gis.grass.devel/25291

Hamish

      ____________________________________________________________________________________
You rock. That's why Blockbuster's offering you one month of Blockbuster Total Access, No Cost.
http://tc.deals.yahoo.com/tc/blockbuster/text5.com

Hamish wrote:

> > the problem is the data is 90% done by 0.4, but the colr/ rules
> > are only written out on integer values

Glynn:
> Ah. Does the attached patch fix the problem?

> yes, it does. thanks.
>
> There are still slight differences (white spots),
>
> G63> r.info -r A20023352002365.L3m_MO_CHLO_4.chlor_A
> min=0.010000
> max=64.565418
...
> the lowest values are less than the first color rule so they display as
> white. As do the max cell values which are slightly bigger than the color
> rule number with excessive number of decimal digits.

Bear in mind that "r.info -r" is approximating the min/max values to 6
decimal places. The actual min/max will be much closer to the values
in the first line of the colr file. IEEE "double" is ~15 significant
decimal digits.

> replace r.colors/rules.c "%.25f" with %.15g or %.16g ?
> http://article.gmane.org/gmane.comp.gis.grass.devel/25291

That won't help at all. Excess digits in the decimal representation
won't do any harm; excess precision just gets discarded when the file
is read in.

Given that this seems to be specific to the use of G_log_colors(), I
suspect that the problem is due to slight errors in the calculation:

  lx = lmin + (lmax - lmin) * i / samples;
  x = exp(lx);

Theoretically:

i == 0:

  lx = lmin + (lmax - lmin) * 0 / samples
     = lmin + (lmax - lmin) * 0
     = lmin
  x = exp(lx)
    = exp(lmin)
    = min

i == samples:

  lx = lmin + (lmax - lmin) * samples / samples;
     = lmin + (lmax - lmin) * 1;
     = lmin + lmax - lmin;
     = lmax
  x = exp(lx)
    = exp(lmax)
    = max

However, real arithmetic and floating-point arithmetic aren't quite
the same thing, and even the slightest error will cause the tests in
the lookup code to fail.

I can modify G_log_colors() to use special cases for i==0 and
i==samples (see attached patch).

Another option would be to add an extra rule at each end, so that any
values which are slightly out of range (by how much?) just use the end
colours.

But I'm wondering if it would be better to handle this in the lookup
code, i.e. whether there should be an option for out-of-range colours
to use the colour for the corresponding extreme (min or max).

--
Glynn Clements <glynn@gclements.plus.com>

(attachments)

color_xform.2.diff (602 Bytes)

Hamish:

> There are still slight differences (white spots),
>
> G63> r.info -r A20023352002365.L3m_MO_CHLO_4.chlor_A
> min=0.010000
> max=64.565418
>
> G63> r.colors A20023352002365.L3m_MO_CHLO_4.chlor_A col=bcyr -g
> G63> cat colr/A20023352002365.L3m_MO_CHLO_4.chlor_A
> % 0.010000000000000003677613769071 64.565417527653337970150460023433
> nv:255
> *:255
> 0.010000000000000003677613769071:0:0:255 0.0109169168:0:7:255
> 0.0109169168:0:7:255 0.0119179072:0:15:255
> ....
> 54.1751303026:255:16:0 59.1425389006:255:8:0
> 59.1425389006:255:8:0 64.565417527653337970150460023433:255:0:0
>
>
> the lowest values are less than the first color rule so they display
> as white. As do the max cell values which are slightly bigger than
> the color rule number with excessive number of decimal digits.

Glynn:
> Bear in mind that "r.info -r" is approximating the min/max values to 6
> decimal places. The actual min/max will be much closer to the values
> in the first line of the colr file. IEEE "double" is ~15 significant
> decimal digits.

Ok, right. FWIW in this case min is set from (double)10^-2, so should be
as near to that as doubles can get.

> Given that this seems to be specific to the use of G_log_colors(),

is it?

for the same data 'r.colors -e color=bcyr' only gets half way through the
data range:
% 0 39
nv:255
*:255
0:127:255:128 1:255:1:0
1:255:1:0 2:255:1:0
2:255:1:0 5:255:1:0
5:255:1:0 6:255:1:0
6:255:1:0 7:255:1:0
7:255:1:0 10:255:1:0
10:255:1:0 11:255:1:0
11:255:1:0 16:255:1:0
16:255:1:0 22:255:1:0
22:255:1:0 39:255:1:0

> I suspect that the problem is due to slight errors in the calculation:
....
> However, real arithmetic and floating-point arithmetic aren't quite
> the same thing, and even the slightest error will cause the tests in
> the lookup code to fail.
>
> I can modify G_log_colors() to use special cases for i==0 and
> i==samples (see attached patch).

Yes, that worked, this is the new colr/ file:
% 0.010000000000000000208166817117 64.565417527653323759295744821429
nv:255
*:255
0.010000000000000000208166817117:0:0:255 0.0109169168:0:7:255
0.0109169168:0:7:255 0.0119179072:0:15:255
0.0119179072:0:15:255 0.0130106801:0:22:255
0.0130106801:0:22:255 0.0142036512:0:30:255
....
45.4569157251:255:31:0 49.6249366001:255:23:0
49.6249366001:255:23:0 54.1751303026:255:16:0
54.1751303026:255:16:0 59.1425389006:255:8:0
59.1425389006:255:8:0 64.565417527653323759295744821429:255:0:0

or diff'd with the previous (r30909, G_add_d_..()):
- % 0.010000000000000003677613769071 64.565417527653337970150460023433
+ % 0.010000000000000000208166817117 64.565417527653323759295744821429
...
- 0.010000000000000003677613769071:0:0:255 0.0109169168:0:7:255
+ 0.010000000000000000208166817117:0:0:255 0.0109169168:0:7:255
  0.0109169168:0:7:255 0.0119179072:0:15:255
...
  54.1751303026:255:16:0 59.1425389006:255:8:0
- 59.1425389006:255:8:0 64.565417527653337970150460023433:255:0:0
+ 59.1425389006:255:8:0 64.565417527653323759295744821429:255:0:0

but now -g does bad things to the original CELL 0-65534 map:
(65535 is no-data, already reset to be NULL)

% 0 -1
nv:255
*:255
nan:0:7:255 0:0:0:255
nan:0:15:255 nan:0:7:255
nan:0:22:255 nan:0:15:255
....
nan:255:23:0 nan:255:31:0
nan:255:16:0 nan:255:23:0
nan:255:8:0 nan:255:16:0
65534:255:0:0 nan:255:8:0

should it be like
   if (i == 0)
- x = min;
+ x = exp(lmin);

etc. ?

> Another option would be to add an extra rule at each end, so that any
> values which are slightly out of range (by how much?) just use the end
> colours.
>
> But I'm wondering if it would be better to handle this in the lookup
> code, i.e. whether there should be an option for out-of-range colours
> to use the colour for the corresponding extreme (min or max).

you will see in the above colr/ file examples there is:
nv:255
*:255

nv: is the color to use for null values, *: is the color to use for out
of range. (a single number there expands to n:n:n greyscale)

if we go slightly beyond min/max we will damage the *: functionality.

Hamish


Glynn Clements wrote:

> Bear in mind that "r.info -r" is approximating the min/max values to 6
> decimal places. The actual min/max will be much closer to the values
> in the first line of the colr file. IEEE "double" is ~15 significant
> decimal digits.

Could r.info -r print a different number of decimal digits, appropriate
for FCELL and DCELL data types, respectively? E.g. 6 (or 7?) for FCELL
and 15 (or 16?) for DCELL.

Maciek

Hamish wrote:

> > Given that this seems to be specific to the use of G_log_colors(),
>
> is it?
>
> for the same data 'r.colors -e color=bcyr' only gets half way through the
> data range:

That's a completely different issue. Apart from anything else, -e uses
integer categories.

> I suspect that the problem is due to slight errors in the calculation:
....
> However, real arithmetic and floating-point arithmetic aren't quite
> the same thing, and even the slightest error will cause the tests in
> the lookup code to fail.
>
> I can modify G_log_colors() to use special cases for i==0 and
> i==samples (see attached patch).

> Yes, that worked, this is the new colr/ file:
...
> or diff'd with the previous (r30909, G_add_d_..()):
> - % 0.010000000000000003677613769071 64.565417527653337970150460023433
> + % 0.010000000000000000208166817117 64.565417527653323759295744821429

Right; that makes sense. The difference corresponds to the bottom few
bits of a double.

> but now -g does bad things to the original CELL 0-65534 map:

You don't say ;-)

log(0) is negative infinity; applying a logarithmic colour table to a
map which extends to (or below) zero isn't going to work.

> should it be like
>    if (i == 0)
> - x = min;
> + x = exp(lmin);
>
> etc. ?

No; if any change is required, it's more like:

  if (min <= 0)
    G_fatal_error(...)

> Another option would be to add an extra rule at each end, so that any
> values which are slightly out of range (by how much?) just use the end
> colours.
>
> But I'm wondering if it would be better to handle this in the lookup
> code, i.e. whether there should be an option for out-of-range colours
> to use the colour for the corresponding extreme (min or max).

> you will see in the above colr/ file examples there is:
> nv:255
> *:255
>
> nv: is the color to use for null values, *: is the color to use for out
> of range. (a single number there expands to n:n:n greyscale)
>
> if we go slightly beyond min/max we will damage the *: functionality.

That's why I suggested it as an option.

I suspect that the default colour was intended primarily for use with
discrete categories; I'm not sure whether it makes much sense for
floating-point data.

--
Glynn Clements <glynn@gclements.plus.com>

Maciej Sieczka wrote:

> Bear in mind that "r.info -r" is approximating the min/max values to 6
> decimal places. The actual min/max will be much closer to the values
> in the first line of the colr file. IEEE "double" is ~15 significant
> decimal digits.

> Could r.info -r print a different number of decimal digits, appropriate
> for FCELL and DCELL data types, respectively? E.g. 6 (or 7?) for FCELL
> and 15 (or 16?) for DCELL.

It could, although the f_range file (which is what "r.info -r" uses)
always contains double values regardless of the underlying map type.

--
Glynn Clements <glynn@gclements.plus.com>