[Geoserver-users] WCS eats memory heap but continues and creates corrupted output

I tried again with the brand new GeoServer 2.8.1. GetCoverage for the whole mosaic (4 images, 12000 x 12000 pixels each) was now successful, and the 1.6 GB output GeoTIFF is good. Memory consumption during the process with a 32-bit JRE and default Java settings was no more than 360 MB. It took 1 minute and 40 seconds to get the image with curl. So far I was happy.
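
For reference, the request is nothing special, just a plain WCS 2.0 GetCoverage fetched with curl. The host, workspace and coverage names below are placeholders rather than my real layer:

    curl -o mosaic.tif "http://localhost:8080/geoserver/ows?service=WCS&version=2.0.1&request=GetCoverage&coverageId=myws__mymosaic&format=image/tiff"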

But there is something odd in how the service behaves. It took a whole 15 minutes to serve the coverage the very first time I made the GetCoverage request, but after that it was quite fast and I thought that everything was OK. Then I tested what happens if I send two 1.6 GB GetCoverage requests from two separate curl processes. After 70 minutes I concluded that even though the process was still alive, it was far too slow to be useful, and I killed the curls. Unfortunately, even after killing the curl jobs, the java.exe that runs GeoServer continued to use 20% of the CPU and hammer my hard disk. That finally stopped after perhaps 10 minutes.

It seems to be very unsafe to publish even a pretty small image mosaic like my four-image test mosaic as WCS without setting Resource Consumption Limits. Perhaps GeoServer should set some rather small default values for the limits instead of setting no limits like it does now. And if it is really so easy to jam WCS by sending many concurrent requests, it might be good to mention the control-flow module http://docs.geoserver.org/stable/en/user/extensions/controlflow/index.html in http://docs.geoserver.org/stable/en/user/webadmin/services/WCS.html.
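
For example, with the control-flow module installed, a couple of lines in controlflow.properties in the data directory would already prevent this kind of pile-up (the numbers here are just illustrative, not recommendations):

    # total number of concurrent OWS requests allowed in the whole server
    ows.global=50
    # no more than 2 concurrent WCS GetCoverage requests
    ows.wcs.getcoverage=2
    # no more than 6 concurrent requests from a single user
    user=6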

-Jukka Rahkonen-

Andrea Aime wrote:


On Wed, Nov 11, 2015 at 12:53 PM, Rahkonen Jukka (MML) <jukka.rahkonen@…6847…> wrote:

Hi Andrea,

It is really a snapshot of 2.8, build date 2015-Oct-08.

Yep, too old; the commit where I hopefully removed the code that was making WCS load the data in memory landed only 12 days ago:

https://github.com/geoserver/geoserver/commit/657d79ec5a02d10ae9c5c7678c0f65881ca947fd

Can you try again?

If you read the whole source data first into memory, can’t you wait and see if it really fits into memory before starting to write out the tiff file?

Am I right that the amount of free Java memory and the amount of data that is read from the source data set the limits for the WCS? With the default policy "Don't use overviews", or if there are no overviews, the maximum subset for one coverage is always the same independently of the output resolution, because it is the amount of input data that sets the limit, not the size of the generated output. With 1 m resolution you can get more data out of WCS in megabytes than with 20 m resolution, but the limit in square kilometres is constant?
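
Just to check my understanding with round numbers from my own data: one 12000 x 12000 source image at roughly 3 bytes per pixel (estimated from the 1.6 GB total output for four images) is about 12000 * 12000 * 3 ≈ 430 MB of input, so a GetCoverage covering that footprint has to read about 430 MB from the source whether I ask the output at 1 m or resampled to 20 m, as long as overviews are not used.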

Honestly, I’d modify that default to allow overview usage by default; loading the data at native resolution and subsampling it in memory is too taxing imho.

Cheers

Andrea



On Fri, Nov 20, 2015 at 5:12 PM, Rahkonen Jukka (MML) <
jukka.rahkonen@anonymised.com> wrote:

I tried again with the brand new GeoServer 2.8.1. GetCoverage for the
whole mosaic (4 images, 12000 x 12000 pixels each) was now successful, and
the 1.6 GB output GeoTIFF is good. Memory consumption during the process
with a 32-bit JRE and default Java settings was no more than 360 MB. It
took 1 minute and 40 seconds to get the image with curl. So far I was happy.

But there is something odd in how the service behaves. It took a whole 15
minutes to serve the coverage the very first time I made the GetCoverage
request, but after that it was quite fast and I thought that everything was
OK. Then I tested what happens if I send two 1.6 GB GetCoverage requests
from two separate curl processes. After 70 minutes I concluded that even
though the process was still alive, it was far too slow to be useful, and I
killed the curls. Unfortunately, even after killing the curl jobs, the
java.exe that runs GeoServer continued to use 20% of the CPU and hammer my
hard disk. That finally stopped after perhaps 10 minutes.

Yep, this is happening because we cannot stream the TIFF out directly; we
have to write it to disk first and return it later. And the HTTP protocol is
set up in such a way that one can only notice that the client is no longer
there by writing bytes out to it, so killing the client request
unfortunately does exactly nothing: GeoServer keeps on processing until the
TIFF is fully written, and only then discovers the client is not there
anymore, when it tries to write the first byte to the response socket.

If you have a 32-bit JRE it might be that you don't have enough memory
allocated to the Java runtime, which will make the two requests compete for
the entries in the tile cache, one stealing cached tiles from the other and
hampering progress... the only thing I can recommend, if this is the case,
is to give GeoServer more heap.
It would be interesting to simulate this load and see if we can do anything
to improve things on lower-heap systems... but, gut feeling, it may require
a few days of work.
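
Just as a sketch (the exact place to set this depends on how GeoServer is
started: the bin scripts, the Windows service wrapper, or your servlet
container's own configuration), raising the heap is usually a matter of
passing a larger -Xmx option to the JVM:

    # example only: allow the JVM up to 2 GB of heap
    JAVA_OPTS="$JAVA_OPTS -Xmx2048m"

Keep in mind that a 32-bit JVM on Windows typically cannot address much more
than roughly 1.5 GB of heap, so going beyond that needs a 64-bit JRE.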

It seems to be very unsafe to publish even a pretty small image mosaic
like my four-image test mosaic as WCS without setting Resource Consumption
Limits. Perhaps GeoServer should set some rather small default values for
the limits instead of setting no limits like it does now. And if it is
really so easy to jam WCS by sending many concurrent requests, it might be
good to mention the control-flow module
http://docs.geoserver.org/stable/en/user/extensions/controlflow/index.html
in http://docs.geoserver.org/stable/en/user/webadmin/services/WCS.html.

Yes, agreed, both good ideas. You might want to open a ticket for the first
one at least.

Also, on the master series we have a couple of extra improvements that you
might want to check out using a nightly build:
http://ares.boundlessgeo.com/geoserver/master/

For large extractions WCS is not a suitable protocol anyway, as it does not
have an asynchronous mode; you might want to have a look at the WPS download
module instead. It's a community module that we created with the sole
purpose of replacing WFS/WCS for large extractions, where async call support
is a must.
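
Just to give an idea of what the asynchronous interaction looks like (the
process name and output identifier below are from memory, so double-check
them against the module documentation), a WPS 1.0.0 Execute posted with
status reporting enabled returns immediately with a statusLocation that the
client can poll until the result is ready:

    <?xml version="1.0" encoding="UTF-8"?>
    <wps:Execute version="1.0.0" service="WPS"
        xmlns:wps="http://www.opengis.net/wps/1.0.0"
        xmlns:ows="http://www.opengis.net/ows/1.1">
      <ows:Identifier>gs:Download</ows:Identifier>
      <!-- the layer name, filter and output format go here as DataInputs -->
      <wps:ResponseForm>
        <!-- storeExecuteResponse + status make the call asynchronous -->
        <wps:ResponseDocument storeExecuteResponse="true" status="true">
          <wps:Output asReference="true">
            <ows:Identifier>result</ows:Identifier>
          </wps:Output>
        </wps:ResponseDocument>
      </wps:ResponseForm>
    </wps:Execute>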

Cheers
Andrea

--

GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

