[Geoserver-devel] GEOS-5276 - Allow fetch size on Oracle JNDI Data store - postgis implications

Hi List,

I came across this feature request while searching for information on an out of memory error we were having in GeoServer 2.4.2 connecting to PostGIS 2.1 using a JNDI data store. The problem does not appear to be specific to Oracle, and I believe it was the cause of the out of memory error we were seeing with PostGIS. I've applied a patch to GeoTools that resolves the issue for us, and I would like to work out how to get it applied to GeoServer (GeoTools, actually).

Background

We were using WFS to download features as GML/CSV, and when we requested a small number of large downloads, a) it would take a while for the downloads to start, and b) memory usage in GeoServer would climb until eventually we ran out of memory.

When I loaded the heap dump into the Eclipse Memory Analyzer, I found that the majority of memory was being taken up by a small number of JDBC result sets. When I drilled down to the statement being used to retrieve features, I found that its fetch size was set to 0.

For PostgreSQL JDBC queries, a fetch size of zero means that all records are fetched to the client before processing continues, rather than being fetched in manageable batches (see http://jdbc.postgresql.org/documentation/head/query.html).
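For anyone unfamiliar with this driver behaviour, here is a rough sketch of the difference in plain JDBC. The connection details and table name are placeholders; the requirements for batched fetching (autocommit off and a positive fetch size) come from the PostgreSQL documentation linked above.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class FetchSizeDemo {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details - adjust for your environment.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "password");

            // Cursor-based (batched) fetching only kicks in when autocommit
            // is off and the fetch size is greater than zero.
            conn.setAutoCommit(false);

            Statement st = conn.createStatement();
            st.setFetchSize(1000); // fetch 1000 rows at a time
            // With the default of setFetchSize(0) the driver instead pulls
            // the entire result set into memory before returning.

            ResultSet rs = st.executeQuery("SELECT * FROM big_table");
            while (rs.next()) {
                // process one row at a time; only ~1000 rows are buffered
            }
            rs.close();
            st.close();
            conn.commit();
            conn.close();
        }
    }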

This was a major problem for us as we’ve invested a lot of time in setting up our infrastructure around JNDI configuration of data stores.

The Patch

I spent some time working out why the fetch size parameter wasn't available for JNDI data stores when it is for plain JDBC data stores, and eventually found a way to patch GeoTools so that we could continue using JNDI data store configuration.

In the end I needed to make only the following change (the line marked with +) to modules/library/jdbc/src/main/java/org/geotools/jdbc/JDBCJNDIDataStoreFactory.java in GeoTools:

      /**
       * Override to omit all those parameters which define the creation of
       * the connection.
       */
      protected void setupParameters(Map parameters) {
          parameters.put(DBTYPE.key, new Param(DBTYPE.key, DBTYPE.type, DBTYPE.description,
                  DBTYPE.required, getDatabaseID()));
          parameters.put(JNDI_REFNAME.key, JNDI_REFNAME);
          parameters.put(SCHEMA.key, SCHEMA);
          parameters.put(NAMESPACE.key, NAMESPACE);
          parameters.put(EXPOSE_PK.key, EXPOSE_PK);
+         parameters.put(FETCHSIZE.key, FETCHSIZE);
          parameters.put(PK_METADATA_TABLE.key, PK_METADATA_TABLE);
          parameters.put(SQL_ON_BORROW.key, SQL_ON_BORROW);
          parameters.put(SQL_ON_RELEASE.key, SQL_ON_RELEASE);
      }

This method overrides setupParameters in JDBCDataStoreFactory.java so that connection parameters not required for a JNDI connection are omitted. Fetch size is not a connection parameter, so leaving it out looks like an oversight to me.

I don't believe the test case for JDBCJNDIDataStoreFactory.java needs to be updated, as it checks that all JNDI parameters are also JDBC parameters, not the specifics of which parameters are present.
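To illustrate the kind of check I mean, here is a rough sketch of the idea. This is not the actual GeoTools test; it uses the PostGIS factories as example concrete factories and assumes the JNDI-specific reference name is excluded from the comparison.

    import static org.junit.Assert.assertTrue;

    import java.util.HashSet;
    import java.util.Set;

    import org.geotools.data.DataAccessFactory.Param;
    import org.geotools.data.postgis.PostgisNGDataStoreFactory;
    import org.geotools.data.postgis.PostgisNGJNDIDataStoreFactory;
    import org.geotools.jdbc.JDBCJNDIDataStoreFactory;
    import org.junit.Test;

    public class JndiParamsSubsetSketchTest {

        @Test
        public void jndiParamsAreSubsetOfJdbcParams() {
            // Collect the parameter keys exposed by the plain JDBC factory.
            Set<String> jdbcKeys = new HashSet<String>();
            for (Param p : new PostgisNGDataStoreFactory().getParametersInfo()) {
                jdbcKeys.add(p.key);
            }

            // Every JNDI parameter, apart from the JNDI-specific reference
            // name, should also be a JDBC parameter.
            for (Param p : new PostgisNGJNDIDataStoreFactory().getParametersInfo()) {
                if (JDBCJNDIDataStoreFactory.JNDI_REFNAME.key.equals(p.key)) {
                    continue; // JNDI-specific, has no JDBC counterpart
                }
                assertTrue("JNDI parameter '" + p.key + "' is not a JDBC parameter",
                        jdbcKeys.contains(p.key));
            }
        }
    }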

No other code changes appear to be required for us. Once this line was added, fetch size appeared for the data store in the GeoServer GUI, was persisted correctly to the data store configuration file, and is used on statements. Our WFS downloads of large data files now start straight away, since PostgreSQL can begin delivering records more or less immediately, and they run using a fraction of the memory (just that required to hold each batch of records). We have been using the patch in production for a number of weeks now.
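For completeness, this is roughly how the parameter can be supplied when creating the store programmatically rather than through the GUI. It is a sketch only: the JNDI name, dbtype and schema values are placeholders for whatever your environment uses, and the parameter keys are taken from the factory constants rather than hardcoded strings.

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    import org.geotools.data.DataStore;
    import org.geotools.data.DataStoreFinder;
    import org.geotools.jdbc.JDBCDataStoreFactory;
    import org.geotools.jdbc.JDBCJNDIDataStoreFactory;

    public class JndiFetchSizeExample {
        public static void main(String[] args) throws Exception {
            Map<String, Serializable> params = new HashMap<String, Serializable>();
            params.put(JDBCDataStoreFactory.DBTYPE.key, "postgis");   // placeholder dbtype
            params.put(JDBCJNDIDataStoreFactory.JNDI_REFNAME.key,
                    "java:comp/env/jdbc/mydb");                       // placeholder JNDI name
            params.put(JDBCDataStoreFactory.SCHEMA.key, "public");    // placeholder schema
            // The parameter exposed by the patch: a sensible batch size
            // instead of the driver default of 0.
            params.put(JDBCDataStoreFactory.FETCHSIZE.key, 1000);

            DataStore store = DataStoreFinder.getDataStore(params);
            // ... read features from the store; statements created to fetch
            // them should now use a fetch size of 1000 rather than 0.
            store.dispose();
        }
    }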

How do I go about getting this fix (for us it's a fix, not a feature) applied to GeoTools/GeoServer?

Thanks,
Craig Jones
Integrated Marine Observing System


Always nice to have more Australians in the mix :-)

You can submit a pull request to geotools. As long as the change is small (i.e. to a single file) then we should be good to go. If it covers several files you will be asked to send in a code contribution agreement.

There are a few links on the GeoTools Get Involved page (http://geotools.org/getinvolved.html) covering pull requests and contributing.


Jody Garnett




On Tue, Mar 4, 2014 at 12:16 PM, Jody Garnett <jody.garnett@anonymised.com> wrote:

Always nice to have more Australians in the mix :-)

You can submit a pull request to geotools. As long as the change is small
(i.e. to a single file) then we should be good to go. If it covers several
files you will be asked to send in a code contribution agreement.

There are a few links on the GeoTools Get Involved page (http://geotools.org/getinvolved.html) covering pull requests and contributing.

I believe in the meantime they already did a pull request, which I've
reviewed and merged, and it's now part of the 11-RC2 release.

Cheers
Andrea

--
== Our support, Your Success! Visit http://opensdi.geo-solutions.it for
more information ==

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it
