[Geoserver-devel] On loose bbox usage

Hi,
a recent user report about bbox treatment in GeoServer
(http://jira.codehaus.org/browse/GEOS-1768)
prompted me to make some checks on how loose bbox
affects performance.

For those that do not know, "loose bbox" is a postgis
datastore parameter that allows the datastore to
relax the bbox semantics. The OGC filter spec says that
BBOX is just a shortcut for intersects, meaning that
the geometry tested must actually intersect the specified
bbox.
Loose bbox relaxes the semantics, allowing for a bbox vs
bbox check, that is, a geometry is included in the results
if its bbox intersects the requested bbox.
Now, it's well known that loose bbox semantics provide
faster filter checks, but I did not know how much faster.
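
The difference between the two semantics can be illustrated with a small, self-contained Java sketch. It uses java.awt.geom shapes as stand-ins for real geometries; the class and method names are made up for this example and are not GeoTools API.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Rectangle2D;

final class BboxSemantics {
    /** OGC semantics: the geometry itself must intersect the query box. */
    static boolean strictBbox(Line2D geom, Rectangle2D query) {
        return geom.intersects(query);
    }

    /** Loose semantics: only the geometry's bounding box is tested. */
    static boolean looseBbox(Line2D geom, Rectangle2D query) {
        return geom.getBounds2D().intersects(query);
    }

    public static void main(String[] args) {
        // Diagonal from (0,0) to (10,10); its bounding box is the square [0,10] x [0,10].
        Line2D diagonal = new Line2D.Double(0, 0, 10, 10);
        // Query box near the lower-right corner: x in 7..9, y in 1..3.
        Rectangle2D query = new Rectangle2D.Double(7, 1, 2, 2);

        System.out.println("loose:  " + looseBbox(diagonal, query));   // true
        System.out.println("strict: " + strictBbox(diagonal, query));  // false
    }
}
```

The diagonal never enters the query box, yet its bounding box does, so the loose check returns a feature that the OGC check would discard.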

So I ran the same WMS test used in Victoria's WMS shootout
presentation to assess how much loose bbox and antialiasing
affect speed (if you did not see the presentation...
can you tell me on what planet you're living? ok, anyways,
here's the link: http://www.foss4g2007.org/presentations/view.php?abstract_id=120)

(I included antialias because I know disabling it provides
a very significant perf boost, but not many people disable
it due to the poorer image quality.)
To recap, the test makes a set of WMS requests
drawing some 1000 features out of a 3 million feature dataset.
Here are the results (times in milliseconds as reported
by JMeter for a single thread requesting maps):

no antialias + loose bbox: 30ms
    antialias + loose bbox: 80ms
no antialias + ogc bbox : 200ms
    antialias + ogc bbox : 250ms

Wow! As you can see loose bbox as an optimization is
quite a bit more significant than disabling antialiasing.
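
To put the figures in perspective, the ratios can be worked out from the four timings in the table above. A tiny Java sketch follows; the class and method names are only for illustration.

```java
import java.util.Locale;

final class BboxSpeedups {
    /** How many times faster the fast timing is compared to the slow one. */
    static double speedup(double slowMs, double fastMs) {
        return slowMs / fastMs;
    }

    public static void main(String[] args) {
        // Timings in milliseconds, copied from the table above.
        double looseNoAa = 30, looseAa = 80, ogcNoAa = 200, ogcAa = 250;

        System.out.printf(Locale.ROOT, "loose bbox, antialias on:   %.2fx%n", speedup(ogcAa, looseAa));
        System.out.printf(Locale.ROOT, "loose bbox, antialias off:  %.2fx%n", speedup(ogcNoAa, looseNoAa));
        System.out.printf(Locale.ROOT, "no antialias, loose bbox:   %.2fx%n", speedup(looseAa, looseNoAa));
        System.out.printf(Locale.ROOT, "no antialias, ogc bbox:     %.2fx%n", speedup(ogcAa, ogcNoAa));
    }
}
```

Switching to loose bbox is worth roughly 3.1x with antialiasing on (250/80) and 6.7x with it off (200/30), while disabling antialiasing is worth about 2.7x with loose bbox (80/30) and only 1.25x with ogc bbox (250/200).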

Now, this means all the poor Oracle and DB2 users (and
maybe ArcSDE ones too) are suffering from quite diminished
performance due to the lack of such a parameter in those
datastores.

The user complaining at
http://jira.codehaus.org/browse/GEOS-1768
has good reason to: he can get correct
WFS behaviour, but he'll have to pay a dear price on the
WMS side.

So, what can we do about it? I have a few proposals:
a) GeoServer wrap-o-rama road: we add fast bbox
    options to the datastores that lack them; when WFS
    is used, we grab the filters and replace each
    BBOX instance with the equivalent Intersects
    instance to keep OGC semantics happy
b) GeoTools hint road: as suggested in
    http://jira.codehaus.org/browse/GEOS-1762
    we turn fast bbox into a datastore hint that
    only the renderer sets
c) GeoTools "we know better than OGC" road:
    we create a new Filter interface, LooseBBox,
    with explicit loose semantics, teach datastores
    about it, and leave the OGC bbox usage unaltered

a) and b) are easier to follow from a political point
of view, but have a hole: when an SLD style or an
extra filter provided on WMS requests uses BBOX,
we'd end up treating it in the loose way too.
Solution c) is not affected by that.
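
As a rough illustration of the filter rewrite in option a), here is a minimal Java sketch. The Filter, Bbox, Intersects and And types below are hypothetical stand-ins written for this example, not the GeoTools or GeoAPI classes; a real implementation would more likely be a filter visitor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BboxRewrite {
    /** Hypothetical stand-ins for the filter types, not the GeoTools classes. */
    interface Filter {}

    static final class Bbox implements Filter {
        final String property;
        final double[] box; // minX, minY, maxX, maxY
        Bbox(String property, double... box) { this.property = property; this.box = box; }
    }

    static final class Intersects implements Filter {
        final String property;
        final double[] box; // the same rectangle, now meant as a polygon literal
        Intersects(String property, double... box) { this.property = property; this.box = box; }
    }

    static final class And implements Filter {
        final List<Filter> children;
        And(List<Filter> children) { this.children = children; }
    }

    /** Replaces every Bbox with the equivalent Intersects, recursing into And. */
    static Filter toStrict(Filter f) {
        if (f instanceof Bbox) {
            Bbox b = (Bbox) f;
            return new Intersects(b.property, b.box);
        }
        if (f instanceof And) {
            List<Filter> rewritten = new ArrayList<>();
            for (Filter child : ((And) f).children) {
                rewritten.add(toStrict(child));
            }
            return new And(rewritten);
        }
        return f; // anything else is left untouched
    }

    public static void main(String[] args) {
        Filter wfsFilter = new And(Arrays.asList(
                new Bbox("the_geom", 0, 0, 10, 10),
                new Filter() {}));
        And strict = (And) toStrict(wfsFilter);
        System.out.println(strict.children.get(0) instanceof Intersects);
    }
}
```

With such a rewrite applied on the WFS path only, the datastore could treat every BBOX it receives as loose while WFS clients still get intersects semantics.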

Comments, opinions, rants?
Cheers
Andrea

Andrea Aime wrote:

> b) GeoTools hint road: as suggested in
>    http://jira.codehaus.org/browse/GEOS-1762
>    we turn fast bbox into a datastore hint that
>    only the renderer sets

By datastore hint I really mean "query hint" here,
like the other one about packed coordinate sequences
that postgis is already using to good effect.
Cheers
Andrea

Yup, a query hint seems the better option to me.
And ArcSDE would be able to use loose bbox, though it is not doing so right now.
I'll create a jira issue.

Thanks for the hint :-)

Gabriel
On Tuesday 11 March 2008 11:34:01 am Andrea Aime wrote:

> By datastore hint I really mean "query hint" here,
> like the other one about packed coordinate sequences
> that postgis is already using to good effect.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel


DB2 users aren't suffering a performance hit because we chose to implement the "loose BBOX" as the default due to the dramatic difference in performance which is similar to what was described in the "shootout".

I think GeoServer and GeoTools should provide an option to specify which semantics to use for BBOX.

Regards,
David


David Adler wrote:

> DB2 users aren't suffering a performance hit because we chose to implement the "loose BBOX" as the default due to the dramatic difference in performance which is similar to what was described in the "shootout".

Ah, I see.
This means a WFS built on top of DB2 cannot be OGC compliant though (ever tried running the CITE tests on it?).

Also, if you're concerned about DB2 performance, have you tried
implementing support for the coordinate sequence hints, or
checking various ways to move geometries from the db to the datastore?
Using WKB + a smart encoding of the binary format proved to be
very valuable to the postgis datastore.

> I think GeoServer and GeoTools should provide an option to specify which semantics to use for BBOX.

Which can be done by using either option b) or option c) of my proposal.
Do you have a preference?
Cheers
Andrea

Just a clarification Andrea; when we do an intersects test are we placing a bbox check in front of that at the SQL level? Basically I would like to check that we are doing the right thing.

For WMS it sounds like it should rewrite the queries to err on the side of more data faster; you may also want to explore the trade off between performing the intersects check in the database vs in java as a post filter. Perhaps doing so would balance the work between two processes; JTS has on occasions like this been known to be faster than GEOS (if only because it can garbage collect in another thread).

Also if you have not done so already the PreparedGeometry test is the *only* way to fly for this Intersects operation; I would not be surprised if a bbox request from the database coupled with a prepared geometry test in Java offered a nice performance balance that could be realized by several data stores.

Cheers
Jody


Jody Garnett wrote:

> Just a clarification Andrea; when we do an intersects test are we placing a bbox check in front of that at the SQL level? Basically I would like to check that we are doing the right thing.

Yes we do, but when the first test succeeds the intersection is
evaluated as well, and that slows down data gathering a lot.

> For WMS it sounds like it should rewrite the queries to err on the side of more data faster; you may also want to explore the trade off between performing the intersects check in the database vs in java as a post filter. Perhaps doing so would balance the work between two processes; JTS has on occasions like this been known to be faster than GEOS (if only because it can garbage collect in another thread).

And how do you suggest we tell the sql encoder to issue an && filter
without keeping the bbox in the mix?

> Also if you have not done so already the PreparedGeometry test is the *only* way to fly for this Intersects operation; I would not be surprised if a bbox request from the database coupled with a prepared geometry test in Java offered a nice performance balance that could be realized by several data stores.

This is something that has to be decided on a datastore per datastore
basis, not everyone is using GEOS and its very slow operations...
Cheers
Andrea

Andrea Aime wrote:

> > Also if you have not done so already the PreparedGeometry test is the *only* way to fly for this Intersects operation; I would not be surprised if a bbox request from the database coupled with a prepared geometry test in Java offered a nice performance balance that could be realized by several data stores.
>
> This is something that has to be decided on a datastore per datastore
> basis, not everyone is using GEOS and its very slow operations...

But this is completely beside the point... for WMS we want loose bbox,
period. We don't need a full intersection evaluation: fast as it is,
it'll add more overhead than trying to draw something that's off
the screen with the java2d api imho (ok, this would need some tests
to make sure, but it's at least quite likely)

Cheers
Andrea

Andrea Aime wrote:

> > Also if you have not done so already the PreparedGeometry test is the *only* way to fly for this Intersects operation; I would not be surprised if a bbox request from the database coupled with a prepared geometry test in Java offered a nice performance balance that could be realized by several data stores.
>
> This is something that has to be decided on a datastore per datastore basis, not everyone is using GEOS and its very slow operations...

Agreed, and yes your earlier suggestion of using a Query hint would be a fine way to explore this space; if the need really is great enough should this optimization be made public in some way? Or can it safely be hidden behind the renderer?

> But this is completely beside the point... for WMS we want loose bbox,
> period. We don't need a full intersection evaluation: fast as it is,
> it'll add more overhead than trying to draw something that's off
> the screen with the java2d api imho (ok, this would need some tests
> to make sure, but it's at least quite likely)

We really are gradually moving towards the JAI model; more hints that make all the difference in the world for performance considerations. I really just wish I could have the "fast vs slow" dial :-)

Jody

Jody Garnett wrote:

> Agreed, and yes your earlier suggestion of using a Query hint would be a fine way to explore this space; if the need really is great enough should this optimization be made public in some way? Or can it safely be hidden behind the renderer?

Well, since datastores must know about it, I guess it should be made public?

> We really are gradually moving towards the JAI model; more hints that make all the difference in the world for performance considerations. I really just wish I could have the "fast vs slow" dial :-)

Eh, but fast comes with a price semantics-wise, that's why we need a
custom hint. Yet a very generic hint like Hints.USE_LOOSE_BBOX will
make it impossible to have proper semantics for OGC filters using
BBOX in the SLD.... I'm not very concerned about that, seems like
a very small corner case... what do you think?
Another option could be something like hints.put(USE_LOOSE_BBOX, myBboxFilter)
to point at a specific one, or
hints.put(USE_LOOSE_BBOX, myBboxFilters)
where myBboxFilters is a Collection<BBOX> to point to a list of
filters that do need the special treatment (more and more complex...)

Maybe a LooseBBOX interface in gt2 that extends BBOX, so that
those that do not know about it can just apply the OGC semantics?
This one would be messy as well... people may think they are giving
an order to use the loose semantics and then a random sql encoder
might just encode it as intersects... sigh, what a mess
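
The three hint shapes mentioned above (a global flag, a single filter, a collection of filters) could be told apart on the datastore side roughly as follows. This is only a sketch under stated assumptions: the string key, the plain Map used as the hints container and the isLoose helper are all hypothetical and not GeoTools API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

final class LooseBboxHint {
    /** Hypothetical hint key, standing in for a Hints.USE_LOOSE_BBOX constant. */
    static final String USE_LOOSE_BBOX = "USE_LOOSE_BBOX";

    /** Decides whether a given bbox filter instance should be encoded loosely. */
    static boolean isLoose(Map<String, Object> hints, Object bboxFilter) {
        Object value = hints.get(USE_LOOSE_BBOX);
        if (value instanceof Boolean) {
            return (Boolean) value;            // global flag: applies to every BBOX
        }
        if (value instanceof Collection) {
            for (Object candidate : (Collection<?>) value) {
                if (candidate == bboxFilter) { // identity: only the listed instances
                    return true;
                }
            }
            return false;
        }
        return value != null && value == bboxFilter; // a single specific filter
    }

    public static void main(String[] args) {
        Object paintAreaBbox = new Object(); // stand-in for the renderer's BBOX
        Object sldBbox = new Object();       // stand-in for a BBOX found in the SLD

        Map<String, Object> hints = new HashMap<>();
        hints.put(USE_LOOSE_BBOX, Arrays.asList(paintAreaBbox));
        System.out.println(isLoose(hints, paintAreaBbox)); // true
        System.out.println(isLoose(hints, sldBbox));       // false
    }
}
```

The identity comparison is deliberate in this sketch: two BBOX filters with equal contents, one from the renderer and one from the SLD, would still be treated differently.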

Cheers
Andrea

Andrea Aime wrote:

> Well, since datastores must know about it, I guess it should be made public?

Sorry for the lack of clarity; you are correct, the Hint needs to be public. I was thinking in terms of a WFS vendor specific extension ... ie what is good for WMS is good for WFS?

> Eh, but fast comes with a price semantics-wise, that's why we need a
> custom hint. Yet a very generic hint like Hints.USE_LOOSE_BBOX will
> make it impossible to have proper semantics for OGC filters using
> BBOX in the SLD.... I'm not very concerned about that, seems like
> a very small corner case... what do you think?

My attempt at humor failed :-( It was an attempt to poke fun at the complexity of using Hints in practice.
As far as I can tell LOOSE_BBOX always >= Intersects (it returns a superset of the features), so for the WMS case it will not matter.

> Another option could be something like hints.put(USE_LOOSE_BBOX, myBboxFilter)
> to point at a specific one, or hints.put(USE_LOOSE_BBOX, myBboxFilters)
> where myBboxFilters is a Collection<BBOX> to point to a list of
> filters that do need the special treatment (more and more complex...)

A simple True/False will be fine for the USE_LOOSE_BBOX hint.

> Maybe a LooseBBOX interface in gt2 that extends BBOX, so that
> those that do not know about it can just apply the OGC semantics?

That is another nice way to fly; the Hint would be applied to the geotools FilterFactory. Nice work I like it.

> This one would be messy as well... people may think they are giving an order to use the loose semantics and then a random sql encoder
> might just encode it as intersects... sigh, what a mess

Still that is the same as a random datastore not respecting the Hint right? Ah but your datastores advertise what they support - I still need to review the code example to understand how you did that.

Jody

Jody Garnett wrote:

> > Maybe a LooseBBOX interface in gt2 that extends BBOX, so that
> > those that do not know about it can just apply the OGC semantics?
>
> That is another nice way to fly; the Hint would be applied to the geotools FilterFactory. Nice work I like it.

So you would use:
FilterFactory ff = CommonFactoryFinder.getFilterFactory(new Hints(Hints.LOOSE_BBOX, true));
Gt2BBOX box = (Gt2BBOX) ff.bbox(...);

where Gt2BBOX would be:

public interface Gt2BBOX extends BBOX {
   public boolean isLoose();
}

and then sql encoders and the like could check for the interface
and the return type in order to decide how to encode the filter?
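
A SQL encoder making that check might look roughly like the sketch below. The BBOX interface here is a simplified stand-in written for this example (not the real GeoAPI interface), and the SQL function names ST_MakeEnvelope and ST_Intersects are assumptions about what a PostGIS encoder would emit; only the && operator and the "bbox check in front of the intersects test" structure come from this thread.

```java
import java.util.Locale;

final class LooseBboxEncoder {
    /** Simplified stand-in for the OGC BBOX filter, not the real GeoAPI interface. */
    interface BBOX {
        String getPropertyName();
        double[] getBounds(); // minX, minY, maxX, maxY
    }

    /** The extension proposed above. */
    interface Gt2BBOX extends BBOX {
        boolean isLoose();
    }

    /** Minimal implementation used only to exercise the encoder. */
    static final class SimpleBbox implements Gt2BBOX {
        private final String property;
        private final boolean loose;
        private final double[] bounds;

        SimpleBbox(String property, boolean loose, double... bounds) {
            this.property = property;
            this.loose = loose;
            this.bounds = bounds;
        }

        public String getPropertyName() { return property; }
        public double[] getBounds() { return bounds; }
        public boolean isLoose() { return loose; }
    }

    /** Emits the envelope test alone when loose, envelope test plus intersects otherwise. */
    static String encode(BBOX filter) {
        double[] b = filter.getBounds();
        String envelope = String.format(Locale.ROOT,
                "ST_MakeEnvelope(%s, %s, %s, %s)", b[0], b[1], b[2], b[3]);
        String column = "\"" + filter.getPropertyName() + "\"";
        String envelopeCheck = column + " && " + envelope;
        if (filter instanceof Gt2BBOX && ((Gt2BBOX) filter).isLoose()) {
            return envelopeCheck;
        }
        // Plain BBOX, or an encoder asked for OGC semantics: keep the exact test too.
        return envelopeCheck + " AND ST_Intersects(" + column + ", " + envelope + ")";
    }

    public static void main(String[] args) {
        System.out.println(encode(new SimpleBbox("the_geom", true, 0, 0, 10, 10)));
        System.out.println(encode(new SimpleBbox("the_geom", false, 0, 0, 10, 10)));
    }
}
```

An encoder that does not know about Gt2BBOX would simply never take the first branch and fall back to the OGC behaviour.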

> > This one would be messy as well... people may think they are giving an order to use the loose semantics and then a random sql encoder
> > might just encode it as intersects... sigh, what a mess
>
> Still that is the same as a random datastore not respecting the Hint right? Ah but your datastores advertise what they support - I still need to review the code example to understand how you did that.

FeatureSource.getSupportedHints() -> Set<RenderingHints.Key>
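
A caller such as the renderer could use that set to decide whether to pass the hint at all. A minimal sketch, assuming a LOOSE_BBOX key that does not exist yet; the HintAware interface below is a stand-in for the one FeatureSource method used here, and BooleanKey is a hypothetical key class built on java.awt.RenderingHints.Key.

```java
import java.awt.RenderingHints;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class SupportedHints {
    /** Hypothetical hint key class accepting only Boolean values. */
    static final class BooleanKey extends RenderingHints.Key {
        BooleanKey(int id) {
            super(id);
        }

        @Override
        public boolean isCompatibleValue(Object value) {
            return value instanceof Boolean;
        }
    }

    /** Hypothetical key; no such constant is defined in this thread's code base yet. */
    static final RenderingHints.Key LOOSE_BBOX = new BooleanKey(1);

    /** Stand-in for the single FeatureSource method used in this sketch. */
    interface HintAware {
        Set<RenderingHints.Key> getSupportedHints();
    }

    /** Builds the query hints, adding LOOSE_BBOX only when the source advertises it. */
    static Map<RenderingHints.Key, Object> queryHints(HintAware source) {
        Map<RenderingHints.Key, Object> hints = new HashMap<>();
        if (source.getSupportedHints().contains(LOOSE_BBOX)) {
            hints.put(LOOSE_BBOX, Boolean.TRUE);
        }
        return hints;
    }

    public static void main(String[] args) {
        HintAware postgisLike = () -> Collections.singleton(LOOSE_BBOX);
        HintAware other = () -> Collections.emptySet();
        System.out.println(queryHints(postgisLike).containsKey(LOOSE_BBOX)); // true
        System.out.println(queryHints(other).isEmpty());                     // true
    }
}
```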

Cheers
Andrea

Andrea Aime wrote:

> So you would use:
> FilterFactory ff = CommonFactoryFinder.getFilterFactory(new Hints(Hints.LOOSE_BBOX, true));
> Gt2BBOX box = (Gt2BBOX) ff.bbox(...);
>
> where Gt2BBOX would be:
>
> public interface Gt2BBOX extends BBOX {
>   public boolean isLoose();
> }
>
> and then sql encoders and the like could check for the interface and the return type in order to decide how to encode the filter?

That works; I was thinking of taking it over to the renderer (since the definition of the style is separate from how we are using it).

FilterFactory ff = CommonFactoryFinder.getFilterFactory(new Hints(Hints.LOOSE_BBOX, true));
DuplicatorFilterVisitor visitor = new DuplicatorFilterVisitor( ff );

And then for each filter used:

filter.accept( visitor );
filter = visitor.getCopy();

Jody


Jody Garnett wrote:

> That works; I was thinking of taking it over to the renderer (since the definition of the style is separate from how we are using it).
>
> FilterFactory ff = CommonFactoryFinder.getFilterFactory(new Hints(Hints.LOOSE_BBOX, true));
> DuplicatorFilterVisitor visitor = new DuplicatorFilterVisitor( ff );
>
> And then for each filter used:
>
> filter.accept( visitor );
> filter = visitor.getCopy();

Ouch no, this would be bad!
If the SLD has a filter that uses BBOX, we want it to behave as
OGC recommends; the above would turn it into a loose one too.
The idea I had was to create the loose bbox filter only
for catching the features in the paint area, and leave whatever
extra filter was specified through SLD or CQL as is.
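
In other words, the renderer would combine a loose paint-area check with the user's filter left untouched. A small self-contained Java sketch of that composition follows, with java.awt.geom shapes standing in for geometries and java.util.function.Predicate standing in for filters; none of these names are GeoTools API.

```java
import java.awt.Shape;
import java.awt.geom.Line2D;
import java.awt.geom.Rectangle2D;
import java.util.function.Predicate;

final class PaintAreaFilter {
    /** Loose check, used only for the paint area: envelope vs envelope. */
    static Predicate<Shape> loosePaintArea(Rectangle2D paintArea) {
        return geom -> geom.getBounds2D().intersects(paintArea);
    }

    /** Strict BBOX, as an SLD or CQL filter expects (OGC semantics). */
    static Predicate<Shape> strictBbox(Rectangle2D box) {
        return geom -> geom.intersects(box);
    }

    /** Renderer filter: loose on the paint area, user filter evaluated as written. */
    static Predicate<Shape> renderFilter(Rectangle2D paintArea, Predicate<Shape> userFilter) {
        return loosePaintArea(paintArea).and(userFilter);
    }

    public static void main(String[] args) {
        Shape diagonal = new Line2D.Double(0, 0, 10, 10);
        Rectangle2D paintArea = new Rectangle2D.Double(7, 1, 2, 2); // the line misses it
        Rectangle2D onTheLine = new Rectangle2D.Double(4, 4, 2, 2); // the line crosses it

        // Passes: the paint area is checked loosely, the user BBOX really intersects.
        System.out.println(renderFilter(paintArea, strictBbox(onTheLine)).test(diagonal));
        // Fails: the user BBOX keeps OGC semantics even though its envelope check would pass.
        System.out.println(renderFilter(paintArea, strictBbox(paintArea)).test(diagonal));
    }
}
```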

Cheers
Andrea

Andrea Aime wrote:

> Ouch no, this would be bad!
> If the SLD has a filter that uses BBOX, we want it to behave as
> OGC recommends; the above would turn it into a loose one too.
> The idea I had was to create the loose bbox filter only
> for catching the features in the paint area, and leave whatever
> extra filter was specified through SLD or CQL as is.

I misunderstood you; I thought you were going to feed your FilterFactory, with the BBOX Hint applied, to the SLD parser.
Jody