[GRASS-dev] i.colors.enhance: G_calloc() error in r.quantile with large maps

Hi,

I am trying to improve the color balance of a large Sentinel-2 scene.
The job fails like this:

i.colors.enhance ....
creating color enhanced composite B8A B11 B04....
Processing...
ERROR: G_calloc: unable to allocate 18446744071588213116 * 8 bytes of memory at raster/r.quantile/main.c:355
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 105, in get_percentile_mp
    result = get_percentile(map, percentiles)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 89, in get_percentile
    percentiles=values, quiet=True)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/etc/python/grass/script/core.py", line 461, in read_command
    return handle_errors(returncode, stdout, args, kwargs)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/etc/python/grass/script/core.py", line 329, in handle_errors
    returncode=returncode)
CalledModuleError: Module run None ['r.quantile', '--q', 'input=B11_255', 'percentiles=2,98'] ended with error
Process ended with non-zero return code 1. See errors in the (error) output.

If I'm not wrong, this means 147.57 exabytes are needed :-)

The Sentinel-2 scene is 169410 * 88264 cells (cols * rows) large, which by
itself does not explain this problem.

I recall that i.colors.enhance uses the Python "multiprocessing" backend.
Assuming 4 parallel threads (not sure how many the script will take?
not documented... an nproc parameter would be better) I estimate this
RAM usage:

169410*88264 *4 *8

[1] 478489735680 bytes

169410*88264 *4 *8 /1024/1024/1024

[1] 445.6283 GB

which exceeds the existing 32GB RAM + 8 GB swap.
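
As a rough cross-check, the estimate can be written as a small calculation. This is only a sketch: `estimate_bytes` and its parameters are names made up for illustration, and it assumes that every worker keeps one value per cell in memory at the same time, which may well overstate what r.quantile really needs. The 8 bytes per value are taken from the "* 8 bytes" in the G_calloc message; the worker count is a guess.

```python
GIB = 1024 ** 3  # bytes per GiB (binary gigabyte)


def estimate_bytes(cols, rows, bytes_per_value, workers):
    """Worst-case RAM if every worker keeps one value per cell in memory."""
    return cols * rows * bytes_per_value * workers


if __name__ == "__main__":
    cols, rows = 169410, 88264
    # The number of parallel workers is an assumption; try both guesses.
    for workers in (3, 4):
        total = estimate_bytes(cols, rows, 8, workers)
        print("%d workers: %d bytes = %.1f GiB"
              % (workers, total, total / float(GIB)))
```

The result scales linearly with both the assumed bytes per value and the number of workers, so either assumption changes the figure a lot.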

Questions:

a) is my assumption right? Still, the number 18446744071588213116 is weird.
The script runs fine for tens of other Sentinel-2 scenes.

b) how to control the number of parallel threads in i.colors.enhance
except for the very limiting -s flag?

thanks,
Markus

On Thu, Apr 27, 2017 at 7:31 AM, Markus Neteler <neteler@osgeo.org> wrote:

> I recall that i.colors.enhance uses the Python "multiprocessing" backend.
> Assuming 4 parallel threads (not sure how many the script will take?
> not documented... an nproc parameter would be better) I estimate this
> RAM usage:

Looking briefly at the source code, it's using 3 (red, green, blue).

> b) how to control the number of parallel threads in i.colors.enhance
> except for the very limiting -s flag?

Someone would need to rewrite it; using Pool.map_async it should be
relatively easy.
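
A minimal sketch of what such a rewrite might look like, not the actual i.colors.enhance code. It swaps in `multiprocessing.pool.ThreadPool` for a process `Pool`: threads are enough here because each worker only waits for an external r.quantile process, and `ThreadPool` offers the same `map_async` call. The function names and the `nprocs` parameter are hypothetical (i.colors.enhance has no such option in this thread); the output format parsed below is the `index:percentile:value` form that r.quantile prints.

```python
from multiprocessing.pool import ThreadPool


def parse_quantile_output(text):
    """Parse r.quantile output lines of the form 'index:percentile:value'."""
    values = []
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) == 3:
            values.append(float(parts[2]))
    return values


def grass_percentiles(map_name, percentiles):
    """Run r.quantile for one map; only works inside a GRASS session."""
    import grass.script as gs  # imported lazily so the sketch loads without GRASS

    out = gs.read_command(
        "r.quantile",
        input=map_name,
        percentiles=",".join(str(p) for p in percentiles),
        quiet=True,
    )
    return parse_quantile_output(out)


def percentiles_for_maps(maps, percentiles, worker=grass_percentiles, nprocs=1):
    """Call worker(map, percentiles) for each map, at most nprocs at a time."""
    pool = ThreadPool(processes=nprocs)
    try:
        pending = pool.map_async(lambda name: worker(name, percentiles), maps)
        results = pending.get()  # blocks until all maps are done
    finally:
        pool.close()
        pool.join()
    return dict(zip(maps, results))
```

Inside a GRASS session, `percentiles_for_maps(["B04_255", "B11_255", "B8A_255"], (2, 98), nprocs=2)` would then run at most two r.quantile processes at once, which caps peak memory at the cost of a longer run.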

Anna

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-dev

On Thu, Apr 27, 2017 at 1:31 PM, Markus Neteler <neteler@osgeo.org> wrote:

> i.colors.enhance ....
> creating color enhanced composite B8A B11 B04....
> Processing...
> ERROR: G_calloc: unable to allocate 18446744071588213116 * 8 bytes of
> memory at raster/r.quantile/main.c:355

The number 18446744071588213116 is unrealistically large: it should not be larger than rows * cols.

[…]

> The Sentinel-2 scene is 169410 * 88264 rows/cols large which does not
> explain this problem.

I would rather expect a call to G_calloc requesting a maximum of 14952804240 * 8 bytes, most likely less.

The reason for the wrong number is integer overflow because num_values is of type int. Please try trunk r70982.
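
The overflow described above can be reproduced with plain integer arithmetic. This sketch assumes a 32-bit `int` that wraps around in two's complement and a 64-bit `size_t`, as is typical on x86_64 Linux. The count 2173628796 is an assumption: it is the smallest non-negative value that wraps to the number in the error message, and the real count is not stated in the thread. The helper names are made up for illustration.

```python
INT_BITS = 32      # assumed width of a C int
SIZE_T_BITS = 64   # assumed width of size_t


def wrap_to_int(n, bits=INT_BITS):
    """Value a signed two's-complement integer of `bits` bits ends up holding."""
    n &= (1 << bits) - 1
    return n - (1 << bits) if n >= 1 << (bits - 1) else n


def as_size_t(n, bits=SIZE_T_BITS):
    """Reinterpret a (possibly negative) integer as an unsigned size_t."""
    return n & ((1 << bits) - 1)


if __name__ == "__main__":
    true_count = 2173628796  # assumed count, just above INT_MAX (2147483647)
    wrapped = wrap_to_int(true_count)
    print(wrapped)             # what the int variable would hold
    print(as_size_t(wrapped))  # what the allocator would be asked for
```

Run as a script, this prints -2121338500 and then 18446744071588213116, the element count from the G_calloc error.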

Markus M

On Fri, Apr 28, 2017 at 5:05 PM, Markus Metz
<markus.metz.giswork@gmail.com> wrote:

> The reason for the wrong number is integer overflow because num_values is of
> type int. Please try trunk r70982.

It fixes the problem!

GRASS 7.3.svn (grass):/scratch/s2scratch > g.region -p
projection: 99 (WGS 84 / Pseudo-Mercator)
zone: 0
datum: wgs84
ellipsoid: wgs84
north: 11169060
south: 10286410
west: 17960630
east: 19654740
nsres: 10
ewres: 10
rows: 88265
cols: 169411
cells: 14953061915

GRASS 7.3.svn (grass):/scratch/s2scratch > r.quantile input=B11_255 percentiles=2,98
Computing histogram
100%
Computing bins
Binning data
100%
Sorting bins
100%
Computing quantiles
0:2.000000:0.000000
1:98.000000:83.000000

I have backported it locally and it works there too (I'd suggest backporting it).

Thanks for the fix,
markusN

On Sat, Apr 29, 2017 at 11:16 PM, Markus Neteler <neteler@osgeo.org> wrote:

> rows: 88265
> cols: 169411
> cells: 14953061915
> ...
> I have locally backported it, works too (I'd suggest to backport).

Mhh, there is still another issue:

GRASS 7.2.1svn (grass):~ > i.colors.enhance red=B04_255 green=B11_255 blue=B8A_255
Processing...
ERROR: G_calloc: unable to allocate 2181527539 * 8 bytes of memory at raster/r.quantile/main.c:356
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 105, in get_percentile_mp
    result = get_percentile(map, percentiles)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 89, in get_percentile
    percentiles=values, quiet=True)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/etc/python/grass/script/core.py", line 461, in read_command
    return handle_errors(returncode, stdout, args, kwargs)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/etc/python/grass/script/core.py", line 329, in handle_errors
    returncode=returncode)
CalledModuleError: Module run None ['r.quantile', '--q', 'input=B04_255', 'percentiles=2,98'] ended with error
Process ended with non-zero return code 1. See errors in the (error) output.
Traceback (most recent call last):
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 223, in <module>
    main()
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 163, in main
    (v0, v1) = input_pipe.recv()
Process Process-3:
...

Using a new extra swap file, I now have 32GB RAM and 18GB swap available.

Next test: tracking memory consumption (using a loop):

start:
2017-04-30 11:32:10,63907664
2017-04-30 11:32:11,63906728
2017-04-30 11:32:12,63895512
...
GRASS 7.2.1svn (grass):~ > i.colors.enhance red=B04_255 green=B11_255 blue=B8A_255
Computing histogram
...
2017-04-30 11:34:44,59963812
...
2017-04-30 11:36:16,56004076
...
Binning data
...
2017-04-30 11:39:54,332164
...
2017-04-30 11:40:39,264508
...
2017-04-30 11:41:57,250920
...
2017-04-30 11:42:09,480848
...
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mundialis/software/grass72_svn/dist.x86_64-pc-linux-gnu/scripts/i.colors.enhance", line 105, in get_percentile_mp
[.... as above...]

In the other terminal, where I had the memory check running, I got:

[mundialis@fedora-calc ~]$ for i in `seq 1 10000` ; do memory_RAM_usage_timestamp.sh ; sleep 1 ; done > i_colors_enhance_RAM.csv
-bash: fork: Cannot allocate memory
Connection to xxxxx.mundialis.de closed.
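
The memory_RAM_usage_timestamp.sh script itself is not shown in this thread. As a hedged illustration only, a logger producing similar "timestamp,value" rows could look like the sketch below. It is a hypothetical Python stand-in, and the choice of logging free RAM plus free swap in KiB from /proc/meminfo is an assumption, since the thread does not say which quantity the original script reports.

```python
import time


def parse_meminfo(text):
    """Return {field: KiB} from text in /proc/meminfo format."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def free_kib(info):
    """Free RAM plus free swap in KiB (an assumed choice of metric)."""
    return info.get("MemFree", 0) + info.get("SwapFree", 0)


def csv_row(info, now=None):
    """Format one 'timestamp,KiB' row; `now` defaults to the current time."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "%s,%d" % (stamp, free_kib(info))


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:  # Linux only
            print(csv_row(parse_meminfo(f.read())))
    except IOError:
        print("no /proc/meminfo on this system")
```

Because it reads a file instead of forking helper commands, a logger of this kind keeps working a little longer under memory pressure, although a shell loop that starts it once per second would still hit the same fork failure.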

Oops. It seems that the memory footprint becomes big at some point in
r.quantile. Could there be a leak somewhere? Or is it due to the 14.95
gigapixels?

I'll add another swap file and try again.

markusN

On Sun, Apr 30, 2017 at 11:47 AM, Markus Neteler <neteler@osgeo.org> wrote:

> Oops. It seems that the memory footprint becomes big at some point in
> r.quantile. Could there be a leak somewhere? Or is it due to the 14.95
> gigapixels?
>
> I'll add another swap file and try again.

This helped!

Here is the memory situation during the "Sorting bins" step, which appears
to be the most RAM-intensive:

top - 12:22:16 up 55 days, 18 min, 5 users, load average: 3.55, 3.97, 4.09
Tasks: 678 total, 2 running, 662 sleeping, 0 stopped, 14 zombie
%Cpu(s): 14.1 us, 1.8 sy, 0.0 ni, 54.9 id, 28.2 wa, 0.5 hi, 0.5 si, 0.0 st
KiB Mem : 32772892 total, 279812 free, 32281888 used, 211192 buff/cache
KiB Swap: 39845880 total, 16045316 free, 23800564 used. 64196 avail Mem

  PID USER      PR  NI    VIRT    RES  SHR S  %CPU %MEM     TIME+ COMMAND
27472 mundial+  20   0 16.382g 0.013t  956 R  99.7 41.8  16:57.87 r.quantile
27474 mundial+  20   0 17.026g 8.091g  156 R  99.7 25.9  15:43.12 r.quantile
27473 mundial+  20   0 16.322g 9.260g  152 D   5.0 29.6  15:40.62 r.quantile  <<--- !

The i.colors.enhance job was successfully completed thanks to the
increased swap file (so the fixed r.quantile does the job).

markusN

On Sun, Apr 30, 2017 at 12:51 PM, Markus Neteler <neteler@osgeo.org> wrote:

>   PID USER      PR  NI    VIRT    RES  SHR S  %CPU %MEM     TIME+ COMMAND
> 27472 mundial+  20   0 16.382g 0.013t  956 R  99.7 41.8  16:57.87 r.quantile
> 27474 mundial+  20   0 17.026g 8.091g  156 R  99.7 25.9  15:43.12 r.quantile
> 27473 mundial+  20   0 16.322g 9.260g  152 D   5.0 29.6  15:40.62 r.quantile  <<--- !

16+GB of RAM still seems too much to me, unless the histograms of the cell values are highly skewed. What is the output of r.stats -c for B04_255, B11_255, B8A_255?

Markus M

On Sun, Apr 30, 2017 at 10:39 PM, Markus Metz
<markus.metz.giswork@gmail.com> wrote:
> 16+GB of RAM still seems too much to me, unless the histograms of the cell
> values are highly skewed. What is the output of r.stats -c for B04_255,
> B11_255, B8A_255?

Since this comes from our automated processing chain I could only
reimport the resulting S2 bands which should not change much.
I think they are (strongly) skewed: will send the data off-list to you.

markusN

On Mon, May 1, 2017 at 11:17 AM, Markus Neteler <neteler@osgeo.org> wrote:

> Since this comes from our automated processing chain I could only
> reimport the resulting S2 bands which should not change much.

If NULL values are correctly set, I would assume about 4 GB of RAM needed by r.quantile.
If there are no NULL cells, I would assume about 14 GB of RAM which is close to what you observed.

Can you check your processing chain to make sure the nodata value is correctly set?

Also note that there are about 7 billion NULL cells in B04, but only 1.7 billion NULL cells in B8A and B11.

> I think they are (strongly) skewed: will send the data off-list to you.

The cell count for value 1 is quite high in all three bands. Is this expected? However, the cell count for value 1 does not explain the high RAM consumption of 16+ GB.

Markus M

On Mon, May 1, 2017 at 3:27 PM, Markus Metz
<markus.metz.giswork@gmail.com> wrote:

> If NULL values are correctly set, I would assume about 4 GB of RAM needed by
> r.quantile.
> If there are no NULL cells, I would assume about 14 GB of RAM which is close
> to what you observed.
>
> Can you check your processing chain to make sure the nodata value is
> correctly set?
>
> Also note that there are about 7 billion NULL cells in B04, but only 1.7
> billion NULL cells in B8A and B11.

Yeah, weird. Perhaps the atmospheric correction did not behave for all channels.

I am currently re-running all the steps from scratch and will try to pause
the chain when it reaches i.colors.enhance.
It takes several hours to get there :-P

> I think they are (strongly) skewed: will send the data off-list to you.
>
> The cell count for value 1 is quite high in all three bands. Is this
> expected? However, the cell count for value 1 does not explain the high RAM
> consumption of 16+ GB.

The data are processed for the Web, i.e. GeoServer. We have no clear
no-data border but cookie-stamp the true data using a mask later.
The old-style Sentinel-2 data are composed of many tiles, which doesn't
make things easier. Only since the end of last year has ESA produced
single-tile S2 data.

We rescale the bands to 0..255 in order to reduce the massive amount
of data for the Web (we don't do that of course when producing GIS
layers).
This leads to the skewed histogram...
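
To illustrate how rescaling to 0..255 can concentrate values, here is a small sketch. The thread does not say how the bands are actually rescaled; a linear stretch with clamping is assumed, and the function name, the input range, and the sample values are all made up.

```python
from collections import Counter


def rescale_to_byte(value, lo, hi):
    """Linearly map value from [lo, hi] to an integer in 0..255, clamping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = int(round(255.0 * (value - lo) / (hi - lo)))
    return max(0, min(255, scaled))


if __name__ == "__main__":
    # Made-up sample: mostly dark pixels plus a few very bright ones.
    reflectance = [120, 150, 180, 200, 240, 300, 9000, 10000]
    rescaled = [rescale_to_byte(v, 0, 10000) for v in reflectance]
    # Count how many input values land on each output value.
    print(Counter(rescaled).most_common())
```

With a wide input range, many distinct dark values collapse onto a handful of low output values, which is one way a histogram like the one discussed here can become heavily skewed.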

However, this scene is unusually difficult; many others just went
through without trouble. And this one did, too, once I had added more
swap space.

markusN