[GRASS-dev] [GRASS GIS] #359: r.quantile computes incorrect values with default number of bins

#359: r.quantile computes incorrect values with default number of bins
------------------------+---------------------------------------------------
Reporter: dylan | Owner: grass-dev@lists.osgeo.org
     Type: defect | Status: new
Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Keywords: r.quantile | Platform: Linux
      Cpu: x86-32 |
------------------------+---------------------------------------------------
I noticed that when computing quantiles from a highly skewed distribution,
incorrect values are returned with the default number of bins. Reducing
the number of bins seems to correct the problem

Example, based on attached raster file in GRASS ASCII format:

{{{
r.quantile in=beam_150 quantiles=5

0:20.000000:7440.040527
1:40.000000:7512.872559
2:60.000000:7611.160645
3:80.000000:7611.161133

r.quantile in=beam_150 quantiles=5 bins=10')

0:20.000000:7440.040527
1:40.000000:7512.872559
2:60.000000:7570.587402
3:80.000000:7611.161133
}}}

Quantiles computed in R:
{{{
library(spgrass6)

x <- readRAST6('beam_150')

quantile(x@data$beam_150, prob=c(0,0.2,0.4,0.6,0.8,1), na.rm=TRUE)

       0% 20% 40% 60% 80% 100%
6429.886 7440.040 7512.871 7570.580 7611.161 7638.599
}}}

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/359&gt;
GRASS GIS <http://grass.osgeo.org>

#359: r.quantile computes incorrect values with default number of bins
----------------------+-----------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Resolution: | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
----------------------+-----------------------------------------------------
Comment (by dylan):

Oops. Looks like the attachment is too big. It can be accessed here:
http://169.237.35.250/~dylan/temp/beam_150.gz

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/359#comment:1&gt;
GRASS GIS <http://grass.osgeo.org>

#359: r.quantile computes incorrect values with default number of bins
----------------------+-----------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Resolution: | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
----------------------+-----------------------------------------------------
Comment (by glynn):

Replying to [ticket:359 dylan]:
> I noticed that when computing quantiles from a highly skewed
distribution, incorrect values are returned with the default number of
bins. Reducing the number of bins seems to correct the problem
>
> Example, based on attached raster file in GRASS ASCII format:

[snip]

I cannot reproduce this. I always get the second set of values regardless
of the number of bins used:

{{{
$ for n in 1 2 3 4 5 6 7 8 9 10 100 1000 10000 100000 1000000 ; do
r.quantile beam_150 quantiles=5 bins=$n > quantile.$n.txt ; done
$ md5sum quantile.*.txt
2486fb3ae6eca4988acf21de10636a45 quantile.1.txt
2486fb3ae6eca4988acf21de10636a45 quantile.10.txt
2486fb3ae6eca4988acf21de10636a45 quantile.100.txt
2486fb3ae6eca4988acf21de10636a45 quantile.1000.txt
2486fb3ae6eca4988acf21de10636a45 quantile.10000.txt
2486fb3ae6eca4988acf21de10636a45 quantile.100000.txt
2486fb3ae6eca4988acf21de10636a45 quantile.1000000.txt
2486fb3ae6eca4988acf21de10636a45 quantile.2.txt
2486fb3ae6eca4988acf21de10636a45 quantile.3.txt
2486fb3ae6eca4988acf21de10636a45 quantile.4.txt
2486fb3ae6eca4988acf21de10636a45 quantile.5.txt
2486fb3ae6eca4988acf21de10636a45 quantile.6.txt
2486fb3ae6eca4988acf21de10636a45 quantile.7.txt
2486fb3ae6eca4988acf21de10636a45 quantile.8.txt
2486fb3ae6eca4988acf21de10636a45 quantile.9.txt
$ cat quantile.1.txt
0:20.000000:7440.040527
1:40.000000:7512.872559
2:60.000000:7570.587402
3:80.000000:7611.161133
}}}

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/359#comment:2&gt;
GRASS GIS <http://grass.osgeo.org>

#359: r.quantile computes incorrect values with default number of bins
----------------------+-----------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Resolution: | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
----------------------+-----------------------------------------------------
Comment (by neteler):

Dylan, can we close this?

Markus

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/359#comment:3&gt;
GRASS GIS <http://grass.osgeo.org>

On Sunday 25 January 2009, GRASS GIS wrote:

#359: r.quantile computes incorrect values with default number of bins
----------------------+----------------------------------------------------
- Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Resolution: | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
----------------------+----------------------------------------------------
- Comment (by neteler):

Dylan, can we close this?

Markus

Hi Markus. I am getting the latest revision of 64 to check this out today.

Dylan

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

#359: r.quantile computes incorrect values with default number of bins
-------------------------+--------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: closed
  Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Resolution: worksforme | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
-------------------------+--------------------------------------------------
Changes (by neteler):

  * status: new => closed
  * resolution: => worksforme

Comment:

No answer, seems to be unreproducible. Feel free to reopen.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/359#comment:4&gt;
GRASS GIS <http://grass.osgeo.org>

#359: r.quantile computes incorrect values with default number of bins
-------------------------+--------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: closed
  Priority: major | Milestone: 6.4.0
Component: default | Version: svn-develbranch6
Resolution: worksforme | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
-------------------------+--------------------------------------------------
Comment (by hamish):

I'd be very cautious about ignoring this one and assuming that it
magically healed itself. Silent data corruption is rather nasty when you
base other work on it.

The URL test dataset is no longer active, so I can't help test.

I wonder if "g.region rast=inputmap" might help though.

Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/359#comment:5&gt;
GRASS GIS <http://grass.osgeo.org>

#359: r.quantile computes incorrect values with default number of bins
-------------------------+--------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: closed
  Priority: major | Milestone: 6.5.0
Component: default | Version: svn-develbranch6
Resolution: worksforme | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
-------------------------+--------------------------------------------------
Changes (by dylan):

  * milestone: 6.4.0 => 6.5.0

Comment:

Some updates: Looks like things are working now... with a similar dataset
I get the following:

in GRASS:
{{{
r.quantile in=rtk_hybrid_beam_0150 quantiles=5
Computing histogram
  100%
Computing bins
Binning data
  100%
Sorting bins
  100%
Computing quantiles
0:20.000000:7096.507812
1:40.000000:7291.034180
2:60.000000:7412.145020
3:80.000000:7514.283691
}}}

in R:
{{{
library(spgrass6)
x <- readRAST6('rtk_hybrid_beam_0150')

quantile(x@data, prob=c(0.2,0.4,0.6,0.8), na.rm=TRUE)
      20% 40% 60% 80%
7096.507 7291.034 7412.144 7514.284
}}}

Seems close enough for me. The test file is attached.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/359#comment:6&gt;
GRASS GIS <http://grass.osgeo.org>

#359: r.quantile computes incorrect values with default number of bins
-------------------------+--------------------------------------------------
  Reporter: dylan | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: closed
  Priority: major | Milestone: 6.5.0
Component: default | Version: svn-develbranch6
Resolution: worksforme | Keywords: r.quantile
  Platform: Linux | Cpu: x86-32
-------------------------+--------------------------------------------------
Comment (by dylan):

Example referenced above can be found here:
http://169.237.35.250/~dylan/temp/rtk_hybrid_beam_0150.asc.gz

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/359#comment:7&gt;
GRASS GIS <http://grass.osgeo.org>