[Geoserver-devel] uDIG request to GeoServer breaks DB2DataStore

I'm not sure where the problem is arising, but a getBounds request is being issued with the geometry column name in the filter set to null which is not helpful. Also, the BBOX request is turned into an 'intersects' filter which is not as efficient.

This is the request that is going from uDIG to GeoServer:
http://127.0.0.1:8080/geoserver/wfs?SERVICE=WFS&VERSION=1.0.0&REQUEST=GetFeature&BBOX=-122.0,37.1,-121.8,37.4&TYPENAME=topp:SCHOOLS

The GeoServer trace shows:
[INFO] org.vfny.geoserver.servlets.AbstractService - handling request:
Request: null
output format:GML2
max features:2147483647
version:1.0.0
queries:
  Query
   feature type: topp:SCHOOLS
   filter: [ null intersects POLYGON ((-122 37.1, -122 37.4, -121.8 37.4, -121.8 37.1, -122 37.1)) ]
   [properties: ALL ]
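
For illustration, the polygon in the trace can be reproduced from the BBOX parameter of the request. This is a minimal Python sketch, not GeoServer's actual parsing code: the function name `bbox_to_polygon_wkt` is made up for this example, and it assumes the `minx,miny,maxx,maxy` ordering used in the request above.

```python
def bbox_to_polygon_wkt(bbox: str) -> str:
    """Turn a 'minx,miny,maxx,maxy' string into a closed rectangle in WKT."""
    minx, miny, maxx, maxy = (float(v) for v in bbox.split(","))
    # Walk the rectangle corners and close the ring on the starting point.
    ring = [(minx, miny), (minx, maxy), (maxx, maxy), (maxx, miny), (minx, miny)]
    coords = ", ".join(f"{x:g} {y:g}" for x, y in ring)
    return f"POLYGON (({coords}))"


print(bbox_to_polygon_wkt("-122.0,37.1,-121.8,37.4"))
```

The printed rectangle is the right-hand side of the filter shown in the trace; what is missing there is the left-hand side, the geometry column.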

With the geometry column as null, no spatial predicate is applied, which means all rows are returned and then the bounds and the result set are filtered programmatically.

Am I missing something?

I can probably rework the filter to use the default geometry column but this seems a bit messy.
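
The rework mentioned above could look roughly like the following. This is only a sketch in plain Python, not GeoTools or DB2DataStore code: `SpatialFilter`, `with_default_geometry` and the column name `GEOM` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SpatialFilter:
    # Hypothetical stand-in for a geometry filter:
    # attribute name (may be missing), operator, and a WKT literal.
    attribute: Optional[str]
    operator: str
    geometry_wkt: str


def with_default_geometry(f: SpatialFilter, default_geometry: str) -> SpatialFilter:
    """Return the filter unchanged, or a copy with the null attribute filled in."""
    if f.attribute is None:
        return replace(f, attribute=default_geometry)
    return f


incoming = SpatialFilter(None, "intersects", "POLYGON ((-122 37.1, -122 37.4, -121.8 37.4, -121.8 37.1, -122 37.1))")
fixed = with_default_geometry(incoming, "GEOM")  # "GEOM" is an assumed column name
print(fixed.attribute, fixed.operator)
```

A filter that already names a column passes through untouched, so the substitution only affects requests such as the BBOX one above.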

David,

I would guess that the DB2 datastore is not setting the default
geometry column. This would cause a BBOX http-get request (which
doesn't specify a column) to fail in the manner you're seeing.

Just a guess,
dave

On 1/23/07, David Adler <dadler@anonymised.com> wrote:

> I'm not sure where the problem is arising, but a getBounds request is
> being issued with the geometry column name in the filter set to null
> which is not helpful. Also, the BBOX request is turned into an
> 'intersects' filter which is not as efficient.

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Where does this need to be set or where is GeoServer looking for this?

If it had the default geometry column would it use a BBOX filter instead of an intersects filter?

FeatureSource.getSchema().getDefaultGeometry() returns the default geometry column.

An unrelated problem is that when a FeatureType is defined in the GeoServer GUI, the SRS comes up as 0 although this is available via the DB2DataStore.
In the GUI, the geometry column is appearing as a recognized type such as 'pointProperty'.

David Adler wrote:

> Where does this need to be set or where is GeoServer looking for this?
>
> If it had the default geometry column would it use a BBOX filter instead of an intersects filter?
>
> FeatureSource.getSchema().getDefaultGeometry() returns the default
> geometry column.

A geometry column is needed because otherwise we don't know what to
use in order to build the geometry filter.
But you're right, for some strange reason, Geoserver is building an
intersect filter instead of a bbox one (KvpRequestReader line 136...).
Does anyone know a reason for this? If no one steps up, I'll change
this into a bbox filter.

> An unrelated problem is that when a FeatureType is defined in the GeoServer GUI, the SRS comes up as 0 although this is available via the DB2DataStore.
> In the GUI, the geometry column is appearing as a recognized type such as 'pointProperty'.

Yeah, there's a reason: GEOS-844 is not fixed yet :-)
Generally speaking, we do get a CRS from the datastore, but this may
or may not contain an SRS number (and sometimes we don't get a CRS at all).
To convert it to an SRS number we have to try to extract an SRS
number from it (if possible and available) and make sure the CRS
with that number is the same as the one stored in the EPSG database (the
WKT may come from a .prj file and it may lie). If we don't get a number, we need to look one up by parameter comparison.
The issue here is basically that whilst the OGC specs do cite the EPSG
numbers as a reference, many GIS systems and users don't know about them
or simply disregard them.

Cheers
Andrea

Andrea Aime wrote:

> A geometry column is needed because otherwise we don't know what to
> use in order to build the geometry filter.
> But you're right, for some strange reason, Geoserver is building an
> intersect filter instead of a bbox one (KvpRequestReader line 136...).

Sorry, that was org.vfny.geoserver.wfs.requests.WfsKvpRequestReader, line 136.

Cheers
Andrea

Andrea, thank you for the pointer into the GeoServer code. I see where it always uses GEOMETRY_INTERSECTS as the filter when the HTTP request is for BBOX. It would certainly be better for us if this were changed to a GEOMETRY_BBOX filter although I don't know if it breaks anyone else's code.

In looking at the code, I noticed a comment that the left expression is left as null because the PostGISDataSource will put in the default geom. I don't know whether it is documented that, if the left expression of a filter is null, the handling GeoTools code should use the default geom. This is actually what I had already modified the DB2 code to do. It isn't quite as safe to unconditionally convert INTERSECTS filters to BBOX.

In regards to the SRS and EPSG numbers, we have the same consideration in DB2. We have an 'srid' number specific to the geometry constructors and a coordinate system identifier that generally corresponds to the EPSG numbers. The DB2DataStore maintains both for each geometry column but it isn't clear what class/method should be providing the value that GeoServer is interested in.

David Adler wrote:

> Andrea, thank you for the pointer into the GeoServer code. I see where it always uses GEOMETRY_INTERSECTS as the filter when the HTTP request is for BBOX. It would certainly be better for us if this were changed to a GEOMETRY_BBOX filter although I don't know if it breaks anyone else's code.

I'm going to wait a little longer to allow other developers to step
forward and provide a reason for INTERSECTION, otherwise I'll change
that to bbox.

> In looking at the code, I noticed a comment that the left expression is left as null because the PostGISDataSource will put in the default geom.

Yeah, that surprised me too the first time. Apparently that's an unwritten contract in data stores; most of them do behave like that.

> In regards to the SRS and EPSG numbers, we have the same consideration in DB2. We have an 'srid' number specific to the geometry constructors and a coordinate system identifier that generally corresponds to the EPSG numbers. The DB2DataStore maintains both for each geometry column but it isn't clear what class/method should be providing the value that GeoServer is interested in.

Because there's none. And that "generally" is what really breaks things: either I can write code that trusts the EPSG number you provide because it is in the EPSG database, or I can't. In general, I cannot
trust that (lots of broken WKT around), so I have to fall back on:
* gathering an EPSG number from your wkt, if you provided it, and check
   the official EPSG database has it and it matches the definition you
   provide
* failing the first, look up the EPSG number by comparison, that is,
   try and find some definition with the same parameters
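
The two fallback steps above can be sketched as follows. This is a toy illustration, not GeoServer or GeoTools code: `EPSG_DB` is a made-up, heavily simplified stand-in for the official EPSG database, and `resolve_epsg` is a hypothetical helper.

```python
from typing import Optional

# Toy stand-in for the EPSG database: code -> defining parameters.
# Real definitions carry many more parameters than these two.
EPSG_DB = {
    4326: {"datum": "WGS84", "unit": "degree"},
    4269: {"datum": "NAD83", "unit": "degree"},
}


def resolve_epsg(declared_code: Optional[int], params: dict) -> Optional[int]:
    """Return a code that can be trusted for the given definition, or None."""
    # Step 1: accept a declared code only if the database has it
    # and its definition matches the one that was provided.
    if declared_code is not None and EPSG_DB.get(declared_code) == params:
        return declared_code
    # Step 2: otherwise look for any entry with the same parameters.
    for code, definition in EPSG_DB.items():
        if definition == params:
            return code
    return None


wgs84 = {"datum": "WGS84", "unit": "degree"}
print(resolve_epsg(4269, wgs84))  # declared code does not match its definition
```

In the usage line the declared code "lies" about the definition, so the lookup falls through to the parameter comparison.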

When GeoServer switches to GeoTools trunk we'll be able to pick
up the work Martin did on the chained EPSG provider, and thus set
up provider chains that include the EPSG codes defined in other sources,
such as user-provided property files, or whatever DB2 extension is
needed (a DB2 EPSG provider would have to be written in that case).

Cheers
Andrea

Andrea/Dave,

I think, in geotools, GEOMETRY_BBOX does a bounding-box only search,
and GEOMETRY_INTERSECTS does a "real" intersection.

I'm pretty sure that the OGC BBOX filter is just syntactical sugar for
GEOMETRY_INTERSECTS so you don't have to type the polygon's GML. So,
if you implement OGC BBOX with a geotools BBOX you'll actually be
getting the wrong answer.

The datastore should be optimizing a "geometry_column intersects
geometry" by doing the index search ("&&" in postgis) plus ensuring
that the features do actually intersect the given bounding box (which
you should be calling a "rectangle represented by a polygon").
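
The two-step evaluation described here (cheap envelope test first, exact test on the survivors) can be sketched in plain Python. This is an illustration under simplifying assumptions, not datastore code: there is no spatial index, features are simple polylines, the envelope comparison stands in for the index search, and all function names are made up for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # minx, miny, maxx, maxy


def envelope(line: List[Point]) -> Box:
    xs = [p[0] for p in line]
    ys = [p[1] for p in line]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def segment_intersects_box(p: Point, q: Point, box: Box) -> bool:
    """Liang-Barsky style test: does segment p-q touch the closed rectangle?"""
    minx, miny, maxx, maxy = box
    t0, t1 = 0.0, 1.0  # parameter range of the segment still inside the box
    for d, lo, hi in ((q[0] - p[0], minx - p[0], maxx - p[0]),
                      (q[1] - p[1], miny - p[1], maxy - p[1])):
        if d == 0:
            if lo > 0 or hi < 0:  # parallel to this axis and outside the slab
                return False
        else:
            ta, tb = sorted((lo / d, hi / d))
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True


def line_intersects_box(line: List[Point], box: Box) -> bool:
    return any(segment_intersects_box(a, b, box) for a, b in zip(line, line[1:]))


def bbox_only(features: List[List[Point]], box: Box) -> List[List[Point]]:
    return [f for f in features if envelopes_intersect(envelope(f), box)]


def intersects(features: List[List[Point]], box: Box) -> List[List[Point]]:
    # Envelope prefilter first, then the exact test on the candidates only.
    return [f for f in bbox_only(features, box) if line_intersects_box(f, box)]


box = (0.0, 0.0, 10.0, 10.0)
crossing = [(5.0, 5.0), (15.0, 5.0)]
diagonal = [(8.0, 14.0), (14.0, 8.0)]  # envelope overlaps the box, the line does not
far_away = [(20.0, 20.0), (30.0, 30.0)]
print(len(bbox_only([crossing, diagonal, far_away], box)),
      len(intersects([crossing, diagonal, far_away], box)))
```

The `diagonal` feature is the case where the two filters disagree: it survives the envelope test but is rejected by the exact one.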

I brought up this point before a few times, but the consensus was to
leave it the way it was because fixing it would mean there's no way to
just do quick bounding-box only searches.

I'd give you a reference to the OGC documentation, but their website
is down at the moment.

dave
ps. please check the OGC doc as I don't have a copy of it on this machine.

On 1/24/07, Andrea Aime <aaime@anonymised.com> wrote:

David Adler wrote:
> Andrea, thank you for the pointer into the GeoServer code. I see where
> it always uses GEOMETRY_INTERSECTS as the filter when the HTTP request
> is for BBOX. It would certainly be better for us if this were changed
> to a GEOMETRY_BBOX filter although I don't know if it breaks anyone
> else's code.

I'm going to wait a little more to allow other developers to step
forward and provide a reason for INTERSECTION, otherwise I'll change
that to bbox.


David Blasby wrote:

> Andrea/Dave,
>
> I think, in geotools, GEOMETRY_BBOX does a bounding-box only search,
> and GEOMETRY_INTERSECTS does a "real" intersection.

I think you're indeed right.

Dave,
many thanks for the insight, I'll have a look at the docs.
See ya
Andrea

From my perspective, the OGC specification is a bit ambiguous, although maybe I just can't find the appropriate section.
The WFS implementation spec V1.1 states:
The bounding box parameter, BBOX, is included in this specification for convenience as a
shorthand representation of the very common a bounding box filter which would be expressed
in much longer form using XML and the filter encoding described in [3]. A BBOX applies to
all feature types listed in the request.

I could not find a definition of "bounding box filter". This citation is the only instance of this phrase in the WFS spec.
They could have said that this is a shorthand representation of an intersects filter where the target is a rectangle if
the intent was to do a precise intersection filter.
But I don't know what they were thinking.

The description of a GeoTools BBOX filter at
http://docs.codehaus.org/display/GEOTOOLS/Using+a+bounding+box+Filter
refers to "features returned to those which are not wholy (sic) outside the bounding box".
The DB2 implementation will return a diagonal line outside the bounding box but whose envelope intersects
the bounding box.

It isn't clear whether there is an intention for a bounding box filter to be semantically different from an intersects
filter with a target polygon which is a rectangle.

In DB2, we tend to treat "bounding box filter" as the implementation in our EnvelopesIntersect predicate which returns
all features whose envelope intersects a rectangular window defined by the lower-left and upper-right coordinates.
If a spatial index is defined, we return all rows resulting from a successful index scan (the equivalent of a GeoTools BBOX).

For the intersects predicate, we do first filter using an index scan but then each candidate geometry needs
to be compared precisely against the target geometry which is orders of magnitude more computationally intensive.

I assume that it has been brought up before that each spatial query needs to be done twice, once for the bounding
box at the beginning of the GML and then again to get the actual features.

David Adler wrote:

> From my perspective, the OGC specification is a bit ambiguous, although maybe I just can't find the appropriate section.

The appropriate definition should be the one in the Filter Encoding 1.1 spec:

"The <BBOX> element is defined as a convenient and more compact way of encoding the very common bounding box constraint based on the gml:Envelope geometry. It is equivalent to the spatial operation <Not><Disjoint> … </Disjoint></Not> meaning that the <BBOX> operator should identify all geometries that spatially interact with the box."

> The description of a GeoTools BBOX filter at
> http://docs.codehaus.org/display/GEOTOOLS/Using+a+bounding+box+Filter
> refers to "features returned to those which are not wholy (sic) outside the bounding box".

Same semantics as the spec, it seems.

> The DB2 implementation will return a diagonal line outside the bounding box but whose envelope intersects
> the bounding box.

That's what we call the "loose bbox" filter in the postgis datastore.
If you enable loose bbox in the settings you'll have the same behaviour
as EnvelopesIntersect. This is good for WMS bboxes, but I fear it is not compliant for WFS GetFeature requests.

> It isn't clear whether there is an intention for a bounding box filter to be semantically different from an intersects
> filter with a target polygon which is a rectangle.
>
> In DB2, we tend to treat "bounding box filter" as the implementation in our EnvelopesIntersect predicate which returns
> all features whose envelope intersects a rectangular window defined by the lower-left and upper-right coordinates.
> If a spatial index is defined, we return all rows resulting from a successful index scan. (GeoTools BBOX equiv)
>
> For the intersects predicate, we do first filter using an index scan but then each candidate geometry needs
> to be compared precisely against the target geometry which is orders of magnitude more computationally intensive.

That's what Postgis is doing as well. The spec is the spec, there's not
much we can do about it. One possible option may be to allow the WFS
to handle bbox predicates in a less strict way, that is, "loose bbox"
at the WFS level. It may be a reasonable option for a WFS that is run
in-house, in a controlled environment. I would not say the same for
a public-facing server, where standard compliance is more important.

> I assume that it has been brought up before that each spatial query needs to be done twice, once for the bounding
> box at the beginning of the GML and then again to get the actual features.

In fact I'm not happy about it either, but we cannot stream out results
and do a single query at the same time. By the time we have read all the features, the GML has already been sent to the client.

Yet, it seems to me the GML spec leaves a door open in order to avoid
computing the bbox:

"BoundingShapeType
This property describes the minimum bounding box or rectangle that encloses the entire feature. Its content
model is as follows:
<element name="boundedBy" type="gml:BoundingShapeType"/>
<complexType name="BoundingShapeType">
  <sequence>
    <choice>
      <element ref="gml:Envelope"/>
      <element ref="gml:Null"/>
    </choice>
  </sequence>
</complexType>
The gml:Envelope element is defined in clause 9.1.5. A value of gml:Null may appear if an extent is not applicable or not available for some reason."

Well, the reason for putting a gml:Null in the bbox may be that the admin set a configuration option in the administration panel so that the bounds are not computed :-p
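
As a rough sketch of that idea, the header writer could skip the extra bounds query when such an option is off. This is illustrative Python only, not GeoServer code: `compute_bounds` is a hypothetical configuration flag, the function names are invented, and the choice of `unknown` as the gml:Null value is an assumption.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # minx, miny, maxx, maxy


def collection_bounds(boxes: List[Box], compute_bounds: bool) -> Optional[Box]:
    """Union of the feature envelopes, or None when the option is switched off."""
    if not compute_bounds or not boxes:
        return None
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def bounded_by(box: Optional[Box]) -> str:
    """Render gml:boundedBy with an Envelope, or gml:Null when no bounds exist."""
    if box is None:
        # "unknown" is assumed to be an acceptable gml:Null value here.
        return "<gml:boundedBy><gml:Null>unknown</gml:Null></gml:boundedBy>"
    minx, miny, maxx, maxy = box
    return (
        "<gml:boundedBy><gml:Envelope>"
        f"<gml:lowerCorner>{minx:g} {miny:g}</gml:lowerCorner>"
        f"<gml:upperCorner>{maxx:g} {maxy:g}</gml:upperCorner>"
        "</gml:Envelope></gml:boundedBy>"
    )


boxes = [(-122.0, 37.1, -121.9, 37.2), (-121.95, 37.15, -121.8, 37.4)]
print(bounded_by(collection_bounds(boxes, compute_bounds=True)))
print(bounded_by(collection_bounds(boxes, compute_bounds=False)))
```

With the flag off, the feature query only has to run once, at the cost of clients not getting an extent up front.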

I would like to hear people's opinions on both configuration options.
Cheers
Andrea