[Geoserver-users] Out of memory error downloading large wfs response

Hi All,

We are getting the exception below when downloading a large WFS response in CSV or GML format using gzip compression. We are using GeoServer 2.1.1 in production, but this also happens in 2.3.3.

java.lang.OutOfMemoryError: Java heap space
	java.util.Arrays.copyOf(Arrays.java:2271)
	java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
	java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
	java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
	java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
	java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
	java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:146)
	org.geoserver.filters.GZIPResponseStream.write(GZIPResponseStream.java:86)
	org.geoserver.filters.GZIPResponseStream.write(GZIPResponseStream.java:79)
	org.geoserver.filters.AlternativesResponseStream.write(AlternativesResponseStream.java:53)
	org.geoserver.wfs.response.WfsExceptionHandler.handle1_0(WfsExceptionHandler.java:132)
	org.geoserver.wfs.response.WfsExceptionHandler.handleDefault(WfsExceptionHandler.java:81)
	org.geoserver.wfs.response.WfsExceptionHandler.handleServiceException(WfsExceptionHandler.java:59)
	org.geoserver.ows.Dispatcher.handleServiceException(Dispatcher.java:1638)
	org.geoserver.ows.Dispatcher.exception(Dispatcher.java:1583)
	org.geoserver.ows.Dispatcher.handleRequestInternal(Dispatcher.java:282)
	org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:153)
	org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:48)

From the stack trace above, GeoServer is falling over writing all the gzipped output to a ByteArrayOutputStream, which in this case filled up all available memory.

I note that this was the subject of a previous issue, GEOS-2626, but it looks to me like the changes made there were reverted as part of the changes made in GEOS-4845 (r16568).

Was this an intentional change, or is there a way of reducing the memory footprint of these requests? We'd prefer a low memory footprint for downloads using gzip compression if possible.

Thanks,
Craig Jones
Integrated Marine Observing System

On Tue, Jul 9, 2013 at 6:27 AM, Craig Jones <Craig.Jones@anonymised.com> wrote:


From the stack trace above, GeoServer is falling over writing all the gzipped output to a ByteArrayOutputStream, which in this case filled up all available memory.

This is a serious issue, because it goes against the basic design principles of GeoServer, where each data access is performed in a streaming way: read a bit of input, encode a bit of output, repeat.
That said, can you describe your scenario in a bit more detail? There is one data source that unfortunately cannot do streaming (the database is not JDBC compliant and ignores the fetch size we set), which is MySQL. Is that your case?
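To illustrate what the fetch size does: here is a minimal JDBC sketch (mine, not GeoServer code; the connection details and table name are placeholders) of the setting that a compliant driver such as PostgreSQL's honours in order to stream rows in batches:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class FetchSizeSketch {
        public static void main(String[] args) throws SQLException {
            // Placeholder URL and credentials; any PostgreSQL database will do.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/demo", "user", "secret")) {
                conn.setAutoCommit(false);      // the PostgreSQL driver only uses a cursor when autocommit is off
                try (Statement st = conn.createStatement()) {
                    st.setFetchSize(1000);      // pull 1000 rows at a time instead of the whole result set
                    try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
                        while (rs.next()) {
                            // encode one row/feature at a time; memory use stays flat
                        }
                    }
                }
            }
        }
    }

A driver that ignores setFetchSize materialises the whole result set up front, which is the non-streaming behaviour described above.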

I note that this was the subject of a previous issue, GEOS-2626, but it looks to me like the changes made there were reverted as part of the changes made in GEOS-4845 (r16568).

I'm confused about the reference to GEOS-4845; didn't it just move the existing files from web/app to main, to make it easier to build a custom version of GeoServer without having to resort to Maven WAR overlays?
Mind you, the history in git is only two years deep, so using "git blame" often results in surprising or incomplete output.

Anyway, the gzip compression filter can be disabled by editing the web.xml file. Have you tried removing it? Does that fix your issue?
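For reference, the entries to remove (or comment out) look roughly like the following. The filter name, class and init parameter shown here are quoted from memory and may differ between GeoServer versions, so treat them as an assumption and check your own web.xml:

    <!-- GeoServer's gzip filter declaration and mapping in WEB-INF/web.xml.
         Names and parameters below are from memory; verify against your version. -->
    <filter>
      <filter-name>GZIP Compression Filter</filter-name>
      <filter-class>org.geoserver.filters.GZIPFilter</filter-class>
      <init-param>
        <param-name>compressed-types</param-name>
        <param-value>text/plain,text/html,text/xml,text/csv,application/xml,application/json</param-value>
      </init-param>
    </filter>

    <filter-mapping>
      <filter-name>GZIP Compression Filter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>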

Cheers
Andrea

--

Our support, Your Success! Visit http://opensdi.geo-solutions.it for more
information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

-------------------------------------------------------

Hi Andrea,

We're connecting to a PostgreSQL 9.1 / PostGIS 2.0 database.

I can download as much data as I want (several gigabytes) after removing the gzip filter.

I was looking at the revision history in Subversion, which I'm assuming is correct for those earlier changes, but yes, I did read that history incorrectly.

At the risk of getting it wrong again: the changes for GEOS-2626 were made in trunk in r11367 (see the attached r11367 diff):

    M /trunk/src/web/src/main/java/org/geoserver/filters/GZIPResponseStream.java
    M /trunk/src/web/src/main/java/org/geoserver/filters/GZIPResponseWrapper.java

Prior to these changes, the response was gzipped to a ByteArrayOutputStream and, when that was finished, the result was copied to the servlet output stream (i.e. the gzipped response was stored in memory). After the change, the response was gzipped directly to the servlet output stream.

As far as I can tell, the current code has reverted to gzipping to a ByteArrayOutputStream and then copying to the servlet output stream. It's visible in the stack trace above.

Looking through the old repository logs, it looks like the filters in /trunk/src/web/src/main/java/org/geoserver/filters were deleted in r11466. The current filter appears to be sourced from src/main/src/main/java/org/geoserver/filters/GZIPResponseStream.java. This version of the filter has been around since before the changes above were made, but the streaming changes were never applied to it.
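To make the difference concrete, here is a minimal sketch of the two write paths (my own illustration, not the actual GeoServer filter code; class and method names are invented): the buffering approach that produces the OutOfMemoryError above, and the streaming approach that the r11367 change moved to.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    // Sketch only; not GeoServer's GZIPResponseStream.
    class GzipWriteSketch {

        // Buffering: all compressed bytes pile up in a ByteArrayOutputStream and are
        // copied to the servlet stream only at the end, so memory grows with response size.
        static void buffered(OutputStream servletOut, Iterable<byte[]> chunks) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
                for (byte[] chunk : chunks) {
                    gzip.write(chunk);
                }
            }
            buffer.writeTo(servletOut);   // nothing reaches the client until everything has been compressed
        }

        // Streaming: the GZIPOutputStream wraps the servlet output stream directly,
        // so compressed bytes leave the server as they are produced and memory use stays constant.
        static void streaming(OutputStream servletOut, Iterable<byte[]> chunks) throws IOException {
            GZIPOutputStream gzip = new GZIPOutputStream(servletOut);
            for (byte[] chunk : chunks) {
                gzip.write(chunk);
            }
            gzip.finish();   // write the gzip trailer without closing the underlying servlet stream
        }
    }

The obvious trade-off is that the streaming variant cannot set a Content-Length header up front, which I assume is the reason the buffering version exists; chunked transfer encoding avoids needing one.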

Regards,
CraigJ

r11367.diff (6.56 KB)

-------------------------------------------------------

I think an alternative would be to keep the gzip filter off and just let your servlet container do the gzipping, or the HTTP server in front of it if you have one. See, for example, http://viralpatel.net/blogs/enable-gzip-compression-in-tomcat/

I feel like that's now the general practice on the web: the application shouldn't worry about it. And I think the container can be smarter about knowing when the client supports gzip, though perhaps it's harder to configure for all our output formats?
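For what it's worth, in Tomcat that is a couple of attributes on the HTTP connector in conf/server.xml. The attribute names below are as I remember them from the Tomcat 7 era (newer versions spell it compressibleMimeType), so double-check against the documentation for your version:

    <!-- conf/server.xml: let Tomcat gzip responses instead of the webapp.
         Only the compression* attributes matter here; the rest is a typical default connector. -->
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443"
               compression="on"
               compressionMinSize="2048"
               compressableMimeType="text/xml,application/xml,text/csv,application/json" />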

-------------------------------------------------------

Hi Chris,

Thanks, we'll have a look at the alternatives.

CraigJ
