[Geoserver-devel] more on improving test times

Hi all,

A while back I sent an email about the state of GeoServer testing, and how the full unit test suite currently takes a long time to run. Here is the email for reference.

http://www.mail-archive.com/geoserver-devel@lists.sourceforge.net/msg17055.html

I have been pushing on some of the ideas there and started a proposal which can be found here:

http://geoserver.org/display/GEOS/GSIP+80+-+Testing+Overhaul

By no means is the proposal complete, but I thought it would be beneficial to have something to focus the discussion around.

Feedback welcome. I hope to bring it up as a topic of discussion at the bi-weekly Skype meeting, but I understand it is pretty short notice for people to review before then.

-Justin


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Hi,
had a quick look, wow, long proposal.

Random remarks:

  • generally speaking it’s looking good
  • even when running tests that do not alter the configuration during their execution
    we often add some extra layers in the one-time setup, so I’m not sure how these
    would fit with the idea of sharing a setup among tests in the same group
  • how would MockTest work, would it just fill an in memory catalog without
    any writing on disk?
  • both JUnit4 and TestNG work for me, I’m happy to use their extras when I need them
    and grumpy that I have to write more code to use them in the normal case
    (having to write the annotations instead of using name-based conventions, add the static import)
    Wondering if we could have the base class extend Assert so that we don’t have to
    statically import it all the time? :-p
  • about tests modifying the data (as opposed to the configuration) un-doing the changes may
    be difficult, we would need some way to tell the mock setup to re-copy the data over and
    then force a “reset” of the resource pool
  • similarly for tests that do modify the configuration, it would be handy to have some way
    to tell the test setup to copy over the original configuration and just force a reload
  • about figuring out the coverage structure without having to open it… seems rather
    hard, wondering if we can get more mileage by asking tests to declare which data
    they do want, since most tests only use a handful of layers?
    There should still be a way to get the current default setup, just not by default

About the tests having to manually undo what they modified I have some concerns
in terms of broken builds, a developer might forget to do some of the rollbacks and
if different test classes end up sharing the same setup we might see OS/file system
dependent failures due to the different order in which the tests are running.
An approach we could consider is to use catalog/configuration/transaction listeners
to automatically mark resources that have been modified and roll them back
to their original state (not sure if it’s 100% feasible, but at least part of it could).
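
The listener idea above could look roughly like the following Java sketch. It is only an illustration under heavy assumptions: ChangeListener, TrackedStore and DirtyTracker are made-up names, not GeoServer classes, and a plain map of strings stands in for the real catalog and data directory. The point is just that every write reports the resource it touches, so a shared tear-down can restore exactly those resources without relying on each test to remember.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical callback; stands in for a catalog/configuration listener. */
interface ChangeListener {
    void modified(String resourceName);
}

/** Toy stand-in for the shared test setup: resource name -> value. */
class TrackedStore {
    private final Map<String, String> resources = new HashMap<>();
    private final ChangeListener listener;

    TrackedStore(Map<String, String> initial, ChangeListener listener) {
        resources.putAll(initial);
        this.listener = listener;
    }

    String get(String name) {
        return resources.get(name);
    }

    /** Every write notifies the listener before touching the resource. */
    void put(String name, String value) {
        listener.modified(name);
        resources.put(name, value);
    }

    void remove(String name) {
        listener.modified(name);
        resources.remove(name);
    }

    /** Used only by the rollback; bypasses the listener on purpose. */
    void restore(String name, String pristineValue) {
        if (pristineValue == null) {
            resources.remove(name); // resource did not exist originally
        } else {
            resources.put(name, pristineValue);
        }
    }
}

/** Remembers which resources were touched and resets only those. */
class DirtyTracker implements ChangeListener {
    private final Map<String, String> pristine;
    private final Set<String> dirty = new LinkedHashSet<>();

    DirtyTracker(Map<String, String> pristine) {
        this.pristine = new HashMap<>(pristine); // private copy of the original state
    }

    @Override
    public void modified(String resourceName) {
        dirty.add(resourceName);
    }

    Set<String> dirty() {
        return dirty;
    }

    /** Meant to be called from a shared tear-down, not by individual tests. */
    void rollback(TrackedStore store) {
        for (String name : dirty) {
            store.restore(name, pristine.get(name));
        }
        dirty.clear();
    }
}
```

In a real setup the restore step would re-copy files and reset the resource pool rather than put strings back in a map, which is where the "not 100% feasible" caveat comes in.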

Btw, you have pretty long build times; on my almost 3-year-old desktop machine I can
run the release config in significantly less time than what you reported:

mvn clean install -Prelease -o
→ [INFO] Total time: 24 minutes 42 seconds
(full build times added at the end of the mail)

Still, that is a long time regardless; the build should be way faster than this. In 2.1.x
the build takes roughly half the time.

Wondering if you have any disk indexing service running that slows down the build?

Cheers
Andrea

PS: here is the full report with build times:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] ------------------------------------------------------------------------
[INFO] GeoServer … SUCCESS [4.652s]
[INFO] Core Platform Module … SUCCESS [3.024s]
[INFO] Open Web Service Module … SUCCESS [6.283s]
[INFO] Main Module … SUCCESS [2:00.991s]
[INFO] Web Feature Service Module … SUCCESS [2:33.199s]
[INFO] Web Coverage Service Module … SUCCESS [2.121s]
[INFO] Web Map Service Module … SUCCESS [2:36.512s]
[INFO] GeoServer Web Modules … SUCCESS [0.058s]
[INFO] Core UI Module … SUCCESS [1:09.153s]
[INFO] Security UI Module … SUCCESS [1:52.482s]
[INFO] GeoServer Security Modules … SUCCESS [0.166s]
[INFO] GeoServer CAS Security Module … SUCCESS [13.768s]
[INFO] GeoServer JDBC Security Module … SUCCESS [2:07.383s]
[INFO] GeoServer LDAP Security Module … SUCCESS [2.674s]
[INFO] Web Coverage Service 1.0 Module … SUCCESS [32.144s]
[INFO] Web Coverage Service 1.1 Module … SUCCESS [34.731s]
[INFO] GeoWebCache (GWC) Module … SUCCESS [34.645s]
[INFO] REST Support Module … SUCCESS [12.459s]
[INFO] REST Configuration Service Module … SUCCESS [2:21.622s]
[INFO] WMS UI Module … SUCCESS [32.537s]
[INFO] GWC UI Module … SUCCESS [34.326s]
[INFO] WFS UI Module … SUCCESS [13.353s]
[INFO] Demoes Module … SUCCESS [24.569s]
[INFO] WCS UI Module … SUCCESS [15.648s]
[INFO] GeoServer Web Application … SUCCESS [30.243s]
[INFO] Community Space … SUCCESS [0.108s]
[INFO] GeoServer Extensions … SUCCESS [0.116s]
[INFO] Application Schema Support … SUCCESS [0.097s]
[INFO] Application Schema Integration Test … SUCCESS [1:22.845s]
[INFO] Sample DataAccess Integration Test … SUCCESS [11.009s]
[INFO] ArcSDE DataStore Extension … SUCCESS [13.157s]
[INFO] GeoSearch Index Module … SUCCESS [13.699s]
[INFO] H2 DataStore Extension … SUCCESS [7.568s]
[INFO] SQL Server DataStore Extension … SUCCESS [0.180s]
[INFO] Oracle DataStore Extension … SUCCESS [0.119s]
[INFO] MySQL DataStore Extension … SUCCESS [0.187s]
[INFO] DB2 DataStore Extension … SUCCESS [0.196s]
[INFO] ImageMap Output Format … SUCCESS [6.331s]
[INFO] ImageI/O-Ext GDAL Coverage Extension … SUCCESS [0.702s]
[INFO] JP2K Coverage Extension … SUCCESS [0.213s]
[INFO] ogr2ogr Output Format … SUCCESS [13.169s]
[INFO] Excel Output Format … SUCCESS [15.366s]
[INFO] Validation Module … SUCCESS [1.686s]
[INFO] Chart external graphics support … SUCCESS [0.293s]
[INFO] Feature Generalization Extension … SUCCESS [1.608s]
[INFO] Image Mosaic JDBC Extension … SUCCESS [0.188s]
[INFO] OWS request flow controller … SUCCESS [9.206s]
[INFO] Web Processing Service parent … SUCCESS [0.062s]
[INFO] Web Processing Service Module … SUCCESS [1:18.758s]
[INFO] Web Processing Service GUI … SUCCESS [17.917s]
[INFO] GeoServer Layer Querying filter functions … SUCCESS [20.420s]
[INFO] Teradata DataStore Extension … SUCCESS [1.853s]
[INFO] GeoServer Release Module … SUCCESS [2.357s]
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24 minutes 42 seconds
[INFO] Finished at: Mon Sep 17 10:00:15 CEST 2012
[INFO] Final Memory: 93M/306M
[INFO] ------------------------------------------------------------------------
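
To see which modules dominate a report like the one above, the bracketed durations can be pulled out and summed. The Java sketch below is a small helper written for this purpose only; ReactorTimes is a made-up class, not part of Maven or GeoServer, and it assumes the "[4.652s]" and "[2:00.991s]" formats seen in this report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading durations out of reactor summary lines. */
class ReactorTimes {
    // Matches a trailing "[4.652s]" or "[2:00.991s]" (optional minutes part).
    private static final Pattern DURATION =
            Pattern.compile("\\[(?:(\\d+):)?(\\d+(?:\\.\\d+)?)s\\]\\s*$");

    /** Returns the duration in seconds, or -1 if the line carries none. */
    static double seconds(String line) {
        Matcher m = DURATION.matcher(line);
        if (!m.find()) {
            return -1;
        }
        double minutes = m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
        return minutes * 60 + Double.parseDouble(m.group(2));
    }

    /** Sums the durations of all lines that have one; other lines are skipped. */
    static double total(String... lines) {
        double sum = 0;
        for (String line : lines) {
            double s = seconds(line);
            if (s >= 0) {
                sum += s;
            }
        }
        return sum;
    }
}
```

For instance, the five modules that take two minutes or more in this report (Main, WFS, WMS, JDBC Security and REST Configuration) add up to about 11 minutes 40 seconds of the 24 minutes 42 seconds total.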

==
Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


On Mon, Sep 17, 2012 at 2:18 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

Hi,
had a quick look, wow, long proposal.

Random remarks:

  • generally speaking it’s looking good
  • even when running tests that do not alter the configuration during their execution
    we often add some extra layers in the one-time setup, so I’m not sure how these
    would fit with the idea of sharing a setup among tests in the same group

Good point. So yeah, to pull this off we would group literally only those tests that make no modifications to the setup whatsoever, which I think is still a pretty big group.

  • how would MockTest work, would it just fill an in memory catalog without
    any writing on disk?

That is the idea: the tests would operate from a purely mocked-up Catalog, GeoServer, etc., with no actual system or resources, meaning no Spring context. The problem is that tests often need live data, which is harder to mock up. So as written now the mock objects do actually create live datastores.

For coverages the problem is worse, because coverage setup is much harder than on the vector side and relies heavily on actually reading the coverage to configure its config object. So I am not sure of the way forward here. At a minimum I would want to have this setup (copying over of data files, creation of datastores) happen lazily, so that test cases that don’t require live data don’t have to pay the price for it.
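
A minimal Java sketch of what "lazily" could mean here, under the assumption that each layer's setup can be expressed as a recipe that runs on first use. LazyTestData and its methods are hypothetical names, and the recipes return strings as placeholders for real datastores or coverages.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical registry of test data sets that are built on first use only. */
class LazyTestData {
    private final Map<String, Supplier<String>> recipes = new HashMap<>();
    private final Map<String, String> built = new HashMap<>();

    /** Registering is cheap: nothing is copied or opened here. */
    void register(String layer, Supplier<String> recipe) {
        recipes.put(layer, recipe);
    }

    /** The expensive recipe runs the first time a test asks for the layer. */
    String get(String layer) {
        Supplier<String> recipe = recipes.get(layer);
        if (recipe == null) {
            throw new IllegalArgumentException("Unknown layer: " + layer);
        }
        // Later calls return the cached result without re-running the recipe.
        return built.computeIfAbsent(layer, name -> recipe.get());
    }

    /** Number of layers whose setup has actually been paid for. */
    int builtCount() {
        return built.size();
    }
}
```

A test that never asks for a coverage would then never trigger the copy and read for it, while a test that does pays the cost once.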

  • both JUnit4 and TestNG work for me, I’m happy to use their extras when I need them
    and grumpy that I have to write more code to use them in the normal case
    (having to write the annotations instead of using name-based conventions, add the static import)
    Wondering if we could have the base class extend Assert so that we don’t have to
    statically import it all the time? :-p

Yeah, I guess it could. Currently, as written, the new test classes are not super focused on maintaining compatibility with the old-style test cases, but that was really just to focus on the new stuff. I would like to add some methods that help make the transition more transparent, and indeed having the base class extend Assert would help.

  • about tests modifying the data (as opposed to the configuration) un-doing the changes may
    be difficult, we would need some way to tell the mock setup to re-copy the data over and
    then force a “reset” of the resource pool

Yeah, revertLayer() does this with the current implementation. And I believe the final save of the feature type should cause the ResourcePool to throw away the cached feature type object?

  • similarly for tests that do modify the configuration, it would be handy to have some way
    to tell the test setup to copy over the original configuration and just force a reload

The idea is that revertLayer will do both config and data to keep calls to it relatively simple rather than provide a more complex interface to it. I have found currently that the overhead of restoring both data and configuration is generally pretty minuscule compared to firing up an entire new test setup.

  • about figuring out the coverage structure without having to open it… seems rather
    hard, wondering if we can get more mileage by asking tests to declare which data
    they do want, since most tests only use a handful of layers?
    There should still be a way to get the current default setup, just not by default

Agreed. And the idea is to do it lazily, at least for the mock tests. Not sure if this is possible for the system tests.

About the tests having to manually undo what they modified I have some concerns
in terms of broken builds, a developer might forget to do some of the rollbacks and
if different test classes end up sharing the same setup we might see OS/file system
dependent failures due to the different order in which the tests are running.
An approach we could consider is to use catalog/configuration/transaction listeners
to automatically mark resources that have been modified and roll them back
to their original state (not sure if it’s 100% feasible, but at least part of it could).

Yeah, and to complicate matters, tests are often run in a different order depending on the environment, for instance Eclipse vs Maven, so this can lead to issues. This is certainly something to be wary of if we are going to push on this.

That said, I think your idea of using listeners to help make this transparent makes sense. A while back I played around with a “transactional” catalog of sorts. Actually not really transactional, more of a “recording” catalog, capable of recording (in order) all the writes to the catalog, with the ability to play the same modifications back in reverse, essentially rolling them back. Might be worth trying to resurrect that.
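
For the sake of discussion, here is a toy Java version of that "recording" idea. It is not the old prototype, just a sketch under simplifying assumptions: RecordingCatalog is a made-up class, the catalog is reduced to a map of layer names to values, and each write pushes its own inverse onto a stack so that a rollback replays them in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

/** Toy "recording" catalog: logs an inverse for every write, in order. */
class RecordingCatalog {
    private final Map<String, String> layers = new TreeMap<>();
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    /** Adds or overwrites a layer. */
    void save(String name, String value) {
        write(name, value);
    }

    void remove(String name) {
        write(name, null);
    }

    private void write(String name, String value) {
        final String previous = layers.get(name);
        // Record how to undo this write before applying it.
        undoLog.push(() -> apply(name, previous));
        apply(name, value);
    }

    private void apply(String name, String value) {
        if (value == null) {
            layers.remove(name);
        } else {
            layers.put(name, value);
        }
    }

    /** Forgets the history; the current state becomes the baseline. */
    void mark() {
        undoLog.clear();
    }

    /** Replays the recorded writes in reverse, most recent first. */
    void rollback() {
        while (!undoLog.isEmpty()) {
            undoLog.pop().run();
        }
    }

    /** Copy of the current state, handy for comparisons in tests. */
    Map<String, String> snapshot() {
        return new TreeMap<>(layers);
    }
}
```

The shared setup would call mark() once it is done and rollback() after each test class, so a forgotten manual revert no longer leaks into whichever test happens to run next.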

Btw, you have pretty long build times; on my almost 3-year-old desktop machine I can
run the release config in significantly less time than what you reported:

mvn clean install -Prelease -o
→ [INFO] Total time: 24 minutes 42 seconds
(full build times added at the end of the mail)

Still, that is a long time regardless; the build should be way faster than this. In 2.1.x
the build takes roughly half the time.

Wondering if you have any disk indexing service running that slows down the build?

Hmmm… no, I disabled the “locate” service and I don’t know of much else. I have noticed issues with my machine (mostly due to overheating) and I suspect the hard drive is the culprit, so that could be a factor. Anyway, good to know others aren’t seeing as extreme a case as mine.


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.