[Geoserver-devel] App-Schema performance

Hi Everyone,

I have lately been investigating how to improve the performance of
app-schema. One area of concern is that features are built once to count
them and then rebuilt to encode them (so they are built twice), which
obviously has a performance overhead.

I have been testing possible options and I am at a loss as to how this can
be achieved. I was hoping someone in the community might have a solution.

These are some of the ideas I have tried and why I think they are not
feasible. Please do correct me if I am wrong.

- Serialize the feature collection while it is being counted, so there is
no need to rebuild it on encoding.
* The feature is a complex beast to serialize as it is deeply nested.
To serialize it, every object instance it references must be serializable as
well. I tried JBoss Serialization, which claims that objects are not
required to implement Serializable, but that didn't seem to be the case
when I tested it.

- Ignore the count and build the features (counting at the same time),
then update numberOfFeatures after building has finished.
* From walking through the code, numberOfFeatures is set in Encoder.start,
line 1107: serializer.startElement(uri, local, qName, atts); The features
are streamed and not cached, because caching could exhaust memory. Is it
still possible to change the value after it has been streamed? If so, can
someone point me to where that could be done?

- XStream
* Tried XStream as a means to serialize objects into XML, but XStream died
while attempting the task. It also has a performance overhead, so it is not
the best solution.

- Last resort
* Instead of streaming the result out directly, perhaps I can stream the
results to disk, update numberOfFeatures, then return it? But this will have
an impact on the output strategy.
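
The serialization constraint in the first idea can be reproduced with plain java.io serialization. The sketch below is only an illustration: the Feature and Geometry classes are hypothetical stand-ins, not the GeoTools types, and it uses standard Java serialization rather than JBoss Serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class NestedSerializationDemo {
    // Hypothetical nested value that does NOT implement Serializable.
    static class Geometry {
        double x = 1.0;
        double y = 2.0;
    }

    // Hypothetical feature: Serializable itself, but it references Geometry.
    static class Feature implements Serializable {
        private static final long serialVersionUID = 1L;
        String id = "f1";
        Geometry geometry = new Geometry();
    }

    /** Returns true if the whole object graph can be written with java.io serialization. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // Thrown as soon as any object reachable from o is not Serializable.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Feature serializable: " + canSerialize(new Feature()));
        System.out.println("String serializable: " + canSerialize("plain string"));
    }
}
```

A single non-serializable object anywhere in the graph is enough to make the write fail, which is why a deeply nested feature is hard to serialize this way.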

Please advise, thanks :)
--
View this message in context: http://old.nabble.com/App-Schema-performance-tp30283782p30283782.html
Sent from the GeoServer - Dev mailing list archive at Nabble.com.

"The optional numberOfFeatures attribute is used to indicate the
number of features that are in the response document."

Why not just ignore it by default (maybe with a configuration flag to force it)?

Does anyone know of any clients that rely on it (even though it's optional)?

Rob

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

I think it's useless for normal WFS GetFeature requests.

IMHO it's only useful for resultType=hits requests.

Best regards,
Bart

On Tue, Nov 23, 2010 at 9:58 AM, VT@anonymised.com <victor.tey@anonymised.com> wrote:

- Last resort
* Instead of streaming the result out directly, perhaps I can stream the
results to disk, update numberOfFeatures, then return it? But this will have
an impact on the output strategy.

I think there is a solution that avoids messing with the output strategy.
Take the feature collections and wrap them in a collection that counts
the features and computes the bounds as the encoder streams through it
(so you'll have to wrap both the collection and the feature iterator).
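
A minimal sketch of the wrapping idea, assuming a plain java.util.Iterator instead of the real feature iterator. CountingIterator and its observer callback are hypothetical names, and the {x, y} arrays only stand in for feature geometries.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Wraps an iterator, counting the items the consumer pulls and showing each to an observer. */
public class CountingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final Consumer<? super T> observer;
    private long count;

    public CountingIterator(Iterator<T> delegate, Consumer<? super T> observer) {
        this.delegate = delegate;
        this.observer = observer;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T item = delegate.next();
        count++;                 // counted as a side effect of normal iteration
        observer.accept(item);   // e.g. expand the running bounds
        return item;
    }

    public long getCount() {
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical {x, y} points standing in for feature geometries.
        List<double[]> points = Arrays.asList(
                new double[] {1, 5}, new double[] {-2, 3}, new double[] {4, 0});

        // Running bounds as {minX, minY, maxX, maxY}.
        double[] bounds = {Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY,
                           Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY};

        CountingIterator<double[]> it = new CountingIterator<>(points.iterator(), p -> {
            bounds[0] = Math.min(bounds[0], p[0]);
            bounds[1] = Math.min(bounds[1], p[1]);
            bounds[2] = Math.max(bounds[2], p[0]);
            bounds[3] = Math.max(bounds[3], p[1]);
        });

        // In the real case the encoder would be the one draining the iterator.
        while (it.hasNext()) {
            it.next();
        }
        System.out.println(it.getCount() + " features, bounds " + Arrays.toString(bounds));
    }
}
```

The count and bounds are only complete once the consumer has drained the iterator, which is why the encoded output has to be parked somewhere before the header values can be filled in.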

Then let the encoder write out the xml, but instead of writing it out to
the output stream, write it out to a temporary file.

Then extract the count and bounds and apply an XSLT transformation that
just adds the count attribute and the collection bounds element to
the result, this time piping the output of the transformation to the output
stream. The transformation should be really easy to write, as it mostly
copies all elements, and should be fast, as the only non-copy part
targets very specific elements (which you'll refer to by absolute
XPath).
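
A rough sketch of such a transformation, using the standard javax.xml.transform API. The CountPatcher class, the patch method and the "count" parameter are hypothetical names, the sample document is made up, and only the count attribute is handled here (the bounds element would need a similar template).

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CountPatcher {
    // Identity transform plus one template that targets the root element by absolute path.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + "    xmlns:wfs='http://www.opengis.net/wfs'>"
      + "  <xsl:param name='count'/>"
      + "  <xsl:template match='@*|node()'>"
      + "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "  </xsl:template>"
      + "  <xsl:template match='/wfs:FeatureCollection'>"
      + "    <xsl:copy>"
      + "      <xsl:apply-templates select='@*'/>"
      + "      <xsl:attribute name='numberOfFeatures'>"
      + "<xsl:value-of select='$count'/></xsl:attribute>"
      + "      <xsl:apply-templates select='node()'/>"
      + "    </xsl:copy>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    /** Copies the temporary XML file to out, adding numberOfFeatures to the root element. */
    static void patch(File tempXml, long count, OutputStream out) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setParameter("count", String.valueOf(count));
        transformer.transform(new StreamSource(tempXml), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("wfs-response", ".xml");
        try {
            // Made-up response body standing in for what the encoder wrote to disk.
            String xml = "<wfs:FeatureCollection xmlns:wfs='http://www.opengis.net/wfs'>"
                       + "<member/><member/></wfs:FeatureCollection>";
            Files.write(tmp.toPath(), xml.getBytes(StandardCharsets.UTF_8));

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            patch(tmp, 2, out);
            System.out.println(out.toString("UTF-8"));
        } finally {
            tmp.delete();
        }
    }
}
```

One thing to check before relying on this: JAXP's default XSLT 1.0 processor generally builds the whole input document in memory, so for very large responses a streaming rewrite (for example with StAX) might be needed instead.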

If the admin chose the full buffering output strategy, the net result will
be to write the XML to disk twice, but such an admin does not care
much about performance anyway, so it's no big deal, IMHO.

Cheers
Andrea

-----------------------------------------------------
Ing. Andrea Aime
Senior Software Engineer

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584962313
fax: +39 0584962313

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf

-----------------------------------------------------

On Tue, Nov 23, 2010 at 10:34 AM, Rob Atkinson <robatkinson101@anonymised.com> wrote:

"The optional numberOfFeatures attribute is used to indicate the
number of features that are in the response document."

Why not just ignore it by default (maybe with a configuration flag to force it)?

Does anyone know of any clients that rely on it (even though it's optional)?

I'm also not aware of many clients using it. I think I've seen some
OpenLayers-based applications use the collection bounds, but never the count.

Cheers
Andrea
