[Geoserver-users] Generate spatial index for shapefile

Hi,

My understanding of using a shapefile as a data source in GeoServer is that the first time a layer backed by a shapefile is accessed through WMS or WFS, GeoServer checks whether a quad-tree index (.qix) exists. If it doesn’t exist and the shapefile’s spatial extent is large enough, GeoServer will generate one. I have a shapefile larger than 1 GB. Once the .qix is generated, WMS and WFS responses become extremely fast. But the first WMS/WFS request, during which the index is generated, takes almost an hour to respond. This becomes a problem each time I replace the shapefile with a new one. So my questions are:

  1. Can I configure GeoServer to use an existing ESRI spatial index (.sbx) instead of .qix? Any downside of switching to .sbx?
  2. If I have to go with the quad-tree index, how can I generate one offline before pushing the shapefile to GeoServer?

Thanks,
Chenglin

-- 
Chenglin Gan
GIS Developer
National Geodetic Survey

On Thu, Jan 26, 2012 at 4:59 PM, Chenglin Gan <chenglin.gan@…170…> wrote:

> 1. Can I configure GeoServer to use existing ESRI spatial index (.sbx) instead of .qix? Any downside of switching to .sbx?

Nope, that file format is not documented as far as I know, so we can’t write a reader for it.
Anyway, even if documentation popped up, someone would still have to write the code to use it.

> 2. If I have to go with quad-tree index, how can I generate one offline before pushing the shapefile to GeoServer?

Hmm… we don’t have a stand-alone command line utility that can do the indexing.
There is an undocumented trick that can get it done, but it does not look that good.

On the command line, change directory into your geoserver/WEB-INF/lib directory
and then run the following (assuming java is in your path, and adjusting the
version number of gt-shapefile to match your version):

java -cp gt-shapefile-2.7.3.jar org.geotools.data.shapefile.indexed.ShapeFileIndexer /path/to/your/shapefile.shp

This will generate the .qix file offline.

If you don’t want to run from inside the geoserver WEB-INF/lib, I believe you can copy the
gt-shapefile jar along with the following ones into whatever folder you prefer:
gt-data-2.7.3.jar gt-main-2.7.3.jar gt-api-2.7.3.jar jts-1.11.jar gt-referencing-2.7.3.jar vecmath-1.3.2.jar gt-metadata-2.7.3.jar gt-opengis-2.7.3.jar jsr-275-1.0-beta-2.jar

and then run the same command from there.
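
For anyone who wants to script this when swapping shapefiles, here is a minimal Python sketch of the steps described in this message. It only wraps the java command given earlier in this message; the function names, the default version string and the "is the .qix older than the .shp" check are my own assumptions, not something GeoServer or GeoTools provides. It also assumes the gt-shapefile jar sits in a directory where it can find its dependencies, as in the WEB-INF/lib case.

```python
import subprocess
from pathlib import Path

# Class name taken from the command line in this thread.
INDEXER_CLASS = "org.geotools.data.shapefile.indexed.ShapeFileIndexer"


def build_indexer_command(lib_dir, shapefile, gt_version="2.7.3"):
    """Return the java command (as an argument list) that indexes one shapefile.

    gt_version is an assumption: it must match the gt-shapefile jar
    that actually ships with your GeoServer.
    """
    jar = Path(lib_dir) / f"gt-shapefile-{gt_version}.jar"
    return ["java", "-cp", str(jar), INDEXER_CLASS, str(shapefile)]


def qix_is_stale(shapefile):
    """True if the .qix next to the .shp is missing or older than the .shp.

    This timestamp comparison is a hypothetical helper for the script,
    not a description of how GeoServer decides to rebuild the index.
    """
    shp = Path(shapefile)
    qix = shp.with_suffix(".qix")
    return not qix.exists() or qix.stat().st_mtime < shp.stat().st_mtime


def index_if_needed(lib_dir, shapefile, gt_version="2.7.3"):
    """Run the indexer only when the .qix is missing or out of date."""
    if not qix_is_stale(shapefile):
        return False
    subprocess.run(build_indexer_command(lib_dir, shapefile, gt_version), check=True)
    return True
```

Something like index_if_needed("/path/to/geoserver/WEB-INF/lib", "/path/to/your/shapefile.shp") could then run as the last step of whatever job replaces the shapefile, so the .qix is already in place before the first WMS/WFS request arrives.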

Cheers
Andrea

Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf


On 26 January 2012 17:31, Andrea Aime <andrea.aime@anonymised.com> wrote:

> On Thu, Jan 26, 2012 at 4:59 PM, Chenglin Gan <chenglin.gan@anonymised.com> wrote:
>> 2. If I have to go with quad-tree index, how can I generate one offline
>> before pushing the shapefile to GeoServer?
>
> Hmm... we don't have a stand-alone command line utility that can do the
> indexing.
> There is an undocumented trick that can get it done, but it does not look
> that good.

An easier way is to open the shapefile in uDig; it will generate
the .qix file for you the first time.

Ian
--
Ian Turton

Hi,

Does the GeoTools .qix use the same format as GDAL's? If it does, there are one or two more alternatives. GDAL often (but not always) comes with the MapServer utility "shptree". It is also possible to use the GDAL shapefile driver's indexing option, described in the "Spatial and attribute indexing" section at http://gdal.org/ogr/drv_shapefile.html

It is not obvious that the index can be created with ogrinfo. This is taken from the gdal-dev list:

"The SQL command to create a spatial index can be given through ogrinfo.
ogrinfo -sql "CREATE SPATIAL INDEX ON file1 [DEPTH N]" file1.shp"
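
If the ogrinfo route is scripted, the command can be assembled along these lines. This is only a sketch in Python: the function name is hypothetical, and it assumes the layer name is the shapefile's base name (as in the file1 / file1.shp example) and contains no characters that would need quoting in the SQL statement.

```python
import subprocess
from pathlib import Path


def build_ogrinfo_index_command(shapefile, depth=None):
    """Return the ogrinfo command (as an argument list) that creates a .qix.

    Assumes the layer name equals the file's base name, e.g. file1 for
    file1.shp. depth is optional and maps to the [DEPTH N] clause.
    """
    shp = Path(shapefile)
    sql = f"CREATE SPATIAL INDEX ON {shp.stem}"
    if depth is not None:
        sql += f" DEPTH {int(depth)}"
    return ["ogrinfo", "-sql", sql, str(shp)]


def create_spatial_index(shapefile, depth=None):
    """Run ogrinfo; raises CalledProcessError if ogrinfo exits with an error."""
    subprocess.run(build_ogrinfo_index_command(shapefile, depth), check=True)
```

Passing the arguments as a list avoids the nested-quote problem of the one-line shell form, since the SQL statement goes to ogrinfo as a single argument.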

-Jukka Rahkonen-

_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

On Fri, Jan 27, 2012 at 10:30 AM, Rahkonen Jukka <Jukka.Rahkonen@anonymised.com> wrote:

> Does the GeoTools .qix use the same format as GDAL's? If it does, there are one or two more alternatives. GDAL often (but not always) comes with the MapServer utility "shptree". It is also possible to use the GDAL shapefile driver's indexing option, described in the "Spatial and attribute indexing" section at http://gdal.org/ogr/drv_shapefile.html

Yes, the format is the same, but the .qix files generated by GeoServer should be smaller and perform somewhat better
(we have a few heuristics to get a small but effective index, such as avoiding isolated leaves with single records and the like).
A recent version of uDig will generate the same .qix file as GeoServer, though.

> It is not obvious that the index can be created with ogrinfo. This is taken from the gdal-dev list:
>
> "The SQL command to create a spatial index can be given through ogrinfo.
> ogrinfo -sql "CREATE SPATIAL INDEX ON file1 [DEPTH N]" file1.shp"

Nice one :-)

Cheers
Andrea




> Yes, the format is the same, but the .qix files generated by GeoServer
> should be smaller and perform somewhat better
> (we have a few heuristics to get a small but effective index, avoid
> isolated leaves with single records and the like).
> A recent version of uDig will generate the same .qix file as GeoServer
> though

Hi Andrea,

Your remark aroused my curiosity. I have indeed identified a small addition in
the GeoTools port of shapelib's shptree.c that must explain that optimization,
and I have adapted it to shptree.c. I have not verified, however, whether the
GeoTools .qix and shapelib .qix are now identical, but I have seen that the new
code path is triggered in some conditions.

See http://trac.osgeo.org/gdal/ticket/4472 for the patch.

Best regards,

Even

On Fri, Jan 27, 2012 at 9:48 PM, Even Rouault <even.rouault@anonymised.com> wrote:

> See http://trac.osgeo.org/gdal/ticket/4472 for the patch.

Yep, that’s the one. I’ve noticed it makes a significant difference when indexing
very large files; without it, the standard quadtree would end up with lots of these
useless nodes.

Cheers
Andrea
