[Geoserver-users] Allow using shapefile index if create spatial index checkbox not selected [SEC=UNCLASSIFIED]

Hi List,

I've been seeing heavy I/O and application lockups when serving large shapefiles with GeoServer 2.2.1/GeoTools 8.3 under very heavy load simulated with JMeter.

If the "Create spatial index if missing/outdated" option is left enabled, the whole system gets bogged down checking whether the indices are up to date. Disabling the option speeds things up massively, but it also prevents the index from being used at all for WFS GetFeature queries that filter by feature ID, so those queries take minutes instead of seconds to return.
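For context, an up-to-date check of this kind typically comes down to comparing file timestamps. The following is a minimal sketch, not the GeoTools implementation: the class `IndexFreshness`, the method `isIndexStale`, and the rule "the .qix is stale if it is missing or older than the .shp" are all assumptions made for illustration. It only shows that every call costs a few filesystem metadata lookups, which add up when repeated on each request under load.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper for illustration only; not part of GeoTools.
final class IndexFreshness {

    // Assumed rule: the index is stale if it is missing or older than the data file.
    static boolean isIndexStale(File shp, File qix) {
        if (!qix.exists()) {
            return true;
        }
        // Each call to lastModified() is a filesystem metadata lookup.
        return qix.lastModified() < shp.lastModified();
    }

    public static void main(String[] args) throws IOException {
        File shp = File.createTempFile("roads", ".shp");
        File qix = File.createTempFile("roads", ".qix");
        shp.deleteOnExit();
        qix.deleteOnExit();

        // Make the index look older than the data it describes.
        shp.setLastModified(2_000_000L);
        qix.setLastModified(1_000_000L);

        System.out.println("index stale: " + isIndexStale(shp, qix));
    }
}
```

Skipping this check in production is only safe if the shapefiles do not change underneath the server, or if the indexes are rebuilt out of band when they do.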

Tracing the checkbox through the system, I ended up in the GeoTools project in the ShapeFileDataStoreFactory.java file. I can see the checkbox value being used as a switch to enable/disable using the shapefile index:

if (createIndex) {
    store = new IndexedShapefileDataStore(url, namespace,
            useMemoryMappedBuffer, cacheMemoryMaps, true, IndexType.QIX,
            dbfCharset);
} else {
    store = new ShapefileDataStore(url, namespace,
            useMemoryMappedBuffer, cacheMemoryMaps, dbfCharset);
}

I'd really like to disable index updating in my production environment, but I also want to be able to use shapefile indexes. I've developed a rough-and-ready git patch against GeoTools 8.3 to allow this (attached). The patch creates a new variable that always enables the index for local files. I could rework this to use the Param class and have it set in the map that gets sent to createNewDataStore(). Is there any interest in this?
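One way to picture the change is as two independent flags instead of one. The sketch below is self-contained and deliberately avoids the GeoTools classes so that it runs on its own: the class `IndexPolicy`, the enum `StoreKind`, the parameter keys, and the default values are hypothetical stand-ins, not the names used in the attached patch. It reads both flags from a parameter map and decides which kind of store would be built.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the decision; not GeoTools code.
final class IndexPolicy {

    enum StoreKind { PLAIN, INDEXED_USE_EXISTING, INDEXED_CREATE_IF_NEEDED }

    // Assumed parameter keys, chosen for this example only.
    static final String ENABLE_INDEX = "enable spatial index";
    static final String CREATE_INDEX = "create spatial index";

    // Read a Boolean flag from the map, falling back to a default.
    static boolean flag(Map<String, Object> params, String key, boolean dflt) {
        Object value = params.get(key);
        return (value instanceof Boolean) ? (Boolean) value : dflt;
    }

    static StoreKind choose(Map<String, Object> params, boolean isLocalFile) {
        // Assumption: both flags default to true when absent.
        boolean enable = flag(params, ENABLE_INDEX, true);
        boolean create = flag(params, CREATE_INDEX, true);

        // The index is only considered for local files.
        if (!isLocalFile || !enable) {
            return StoreKind.PLAIN;
        }
        return create ? StoreKind.INDEXED_CREATE_IF_NEEDED
                      : StoreKind.INDEXED_USE_EXISTING;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        // Production setup: use the index, but never rebuild it.
        params.put(CREATE_INDEX, Boolean.FALSE);
        System.out.println(choose(params, true));
    }
}
```

With the single switch quoted above, the middle case (use an existing index without checking or rebuilding it) cannot be expressed, which is the gap the patch is meant to close.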

For GeoServer, the current checkbox label is a bit misleading; it took me a while to figure out why the indexes weren't being used. Perhaps it could be changed to something like "Enable spatial index and update if missing/outdated", or split into two options if the GeoTools code is changed to support it:

* Enable spatial index
* Create spatial index if missing/outdated

Would it be possible to get this fix or something similar into GeoServer and GeoTools?

Thanks,
Geoff

shapefile_index_switch.patch (1.75 KB)

Hey Geoff, thanks for taking the time to dig in to the code and figure out what’s going on. You’re our favorite type of user :wink:

Thinking about your two options, I can’t see any situation where a user would actually want shapefile indexes disabled. So I’d say your patch as it stands right now makes sense: there’s no need to add another datastore parameter that lets people turn the index off. We don’t offer that option for PostGIS or Oracle either.

To actually get it into the code base, I think there are two things that will help get a developer to review it and merge it.

First, create a pull request on GitHub (https://help.github.com/articles/creating-a-pull-request). Their system is great, and afaik it’s easier for developers to review a patch there than to pull one in from the list.

Second, create an issue in our Jira tracker at http://jira.codehaus.org/browse/GEOS and put a link to the GitHub pull request in a comment there. You could also attach the patch directly to the issue if you’d like.

Some others may have input during the review process, but from where I sit this is a nice little improvement on the codebase. Thanks for digging in and helping!

Chris

···

On Sun, Nov 25, 2012 at 9:09 PM, Geoff Williams <G.Williams2@anonymised.com> wrote:

Geoserver-users mailing list
Geoserver-users@anonymised.comsts.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

On Sun, Dec 9, 2012 at 2:07 AM, Chris Holmes <cholmes@anonymised.com> wrote:

Hey Geoff, thanks for taking the time to dig in to the code and figure out what’s going on. You’re our favorite type of user :wink:

Geoff already went to the geotools-user list, updated the patch following feedback from Michael and me, and created a Jira issue. I applied his patch yesterday :slight_smile:

Cheers
Andrea

==
Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Oh awesome. Truly our favorite type of user. I should have checked the GeoTools list. Thanks for applying it, Andrea.

···

On Sun, Dec 9, 2012 at 12:49 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:
