[Geoserver-devel] Twisted on wfsv testing needs

Hi,
I was starting to work on a wfsv testing base class that could
use a pre-cooked data dir but I'm having second thoughts, so I
guess I better discuss this with the other developers.

An alternative solution would be to add to MockData a generic
datastore reference and a generic feature type reference, by
passing connection params to the first, and a map of information
sufficient to build info.xml for the second (things like crs,
bounds, crs handling and so on).

With that one in place, one could conceivably:
* have a property file with the connection params to a dbms
* load it, see if it's possible to connect to the db. If not,
   skip the test
* if yes, execute enough sql calls to setup the db for the
   test
* register the datastore and the feature types in the mock
* run the test as usual
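A rough sketch of what such a base class could start from; the fixture location, file name, and class name here are all invented for illustration, not an existing API:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Hypothetical sketch of the proposed setup flow: load connection
// params from a property file, and treat a missing/unreadable file
// as "skip the test".
public class OnlineTestSetupSketch {

    // Returns the connection params, or null when the test should be skipped.
    static Properties loadFixture(File fixture) {
        if (!fixture.exists()) {
            return null; // no fixture -> skip
        }
        Properties params = new Properties();
        try (FileInputStream in = new FileInputStream(fixture)) {
            params.load(in);
        } catch (IOException e) {
            return null; // unreadable fixture treated as absent
        }
        return params;
    }

    public static void main(String[] args) {
        File fixture = new File(System.getProperty("user.home"),
                ".geoserver/wfsv.properties"); // assumed location
        Properties params = loadFixture(fixture);
        if (params == null) {
            System.out.println("SKIP: no fixture found");
            return;
        }
        // with params in hand, the base class would then:
        // 1. try to connect to the db
        // 2. run enough sql to set up the test tables
        // 3. register the datastore and feature types in the mock
        // 4. run the test as usual
        System.out.println("RUN: fixture loaded");
    }
}
```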

Seems cleaner than having a pre-cooked data dir because the
test is in control, and does not have to copy the data dir
around and parse catalog.xml to grab the info.

Yet, building a data dir programmatically _may_ be quite a bit
of work, whilst a pre-cooked one can be set up using the
GeoServer user interface.

I'm leaning more towards the first option now, but I'd
like to hear other people's opinions.

Cheers
Andrea

Hi Andrea,
I like the first approach, though what about the following inline comments:

On Friday 21 December 2007 10:48:55 am Andrea Aime wrote:

...With that one in place, one could conceivably:
* have a property file with the connection params to a dbms

It'd be nice if the property file with ds connection parameters were
located and looked up like the ones for the GeoTools online tests that
use a "fixture".

* load it, see if it's possible to connect to the db. If not,
   skip the test

So instead: if the fixture exists, try to connect; if you can't connect,
fail, because the fixture existing means you expect the test to pass. If
there's no fixture, skip the test.
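That policy boils down to a three-way decision; a tiny illustrative sketch (all names invented):

```java
// Hypothetical sketch of the fixture policy described above:
// fixture present -> attempt connection, failure is a real failure;
// fixture absent -> skip the test.
public class FixturePolicySketch {
    enum Outcome { RUN, FAIL, SKIP }

    static Outcome decide(boolean fixtureExists, boolean canConnect) {
        if (!fixtureExists) {
            return Outcome.SKIP; // no fixture: the developer opted out
        }
        // a fixture means the developer expects the test to pass, so a
        // connection failure is a test failure, not a silent skip
        return canConnect ? Outcome.RUN : Outcome.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true));   // RUN
        System.out.println(decide(true, false));  // FAIL
        System.out.println(decide(false, false)); // SKIP
    }
}
```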

* if yes, execute enough sql calls to setup the db for the
   test

I'm not so sure about this one, but what about using the plain DataStore
API to populate it? Wouldn't plain SQL be too low level, so that you'd end
up tied to a single db type? After all, the DataStore working correctly is
a prerequisite for the test to run.

* register the datastore and the feature types in the mock

Just that this way the whole thing is no longer a "mock"? I'd be happy if
we called the test "functional" somehow. When the mock config was born, I
used the property datastore just for convenience, but it would have been
the same with a memory datastore. The point was to have a reliable
DataStore to work over, so as not to worry about whether the datastore
works well or not, and to concentrate on testing GeoServer functionality.
So it seemed like a reasonable mock-up of a GeoServer config, except to a
purist. Throwing real stuff into the test config breaks its mock nature
even more, which is fine since the test is functional.

* run the test as usual

My 2c.-

Gabriel

Gabriel Roldán wrote:

Hi Andrea,
I like the first approach, though what about the following inline comments:

On Friday 21 December 2007 10:48:55 am Andrea Aime wrote:

...With that one in place, one could conceivably:
* have a property file with the connection params to a dbms

It'd be nice if the property file with ds connection parameters were located and looked up like the ones for the GeoTools online tests that use a "fixture".

* load it, see if it's possible to connect to the db. If not,
   skip the test

So instead: if the fixture exists, try to connect; if you can't connect, fail, because the fixture existing means you expect the test to pass. If there's no fixture, skip the test.

Hum, the whole fixture concept seems to be either badly implemented
or badly documented in GeoTools, or else I never found docs about it.
In the postgis datastore, for example, the property files are looked up
in the test module itself, and I've always modified them locally, and
it's still working. So the fixture is always there, but you may not be
able to connect using it.

* if yes, execute enough sql calls to setup the db for the
   test

I'm not so sure about this one, but what about using the plain DataStore API to populate it? Wouldn't plain SQL be too low level, so that you'd end up tied to a single db type? After all, the DataStore working correctly is a prerequisite for the test to run.

Because createSchema is badly implemented (when implemented at all)
in most datastores. So if I used it, I would end up debugging feature
type creation code instead of doing what I'm supposed to do (test wfsv).
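For illustration, the plain-SQL route could look something like the following; the table layout and DDL are invented, PostGIS-flavoured examples (deliberately db-specific, since createSchema cannot be trusted), not anything from the codebase:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the kind of plain-SQL setup a wfsv test could
// run once a connection is obtained. Table and column names invented.
public class SqlSetupSketch {

    // Statements to prepare a fresh test table before each run.
    static List<String> setupStatements(String table) {
        return Arrays.asList(
            "DROP TABLE IF EXISTS " + table,
            "CREATE TABLE " + table + " (fid serial PRIMARY KEY, name varchar)",
            "SELECT AddGeometryColumn('" + table + "', 'geom', 4326, 'POINT', 2)",
            "INSERT INTO " + table + " (name) VALUES ('test feature')"
        );
    }

    public static void main(String[] args) {
        for (String sql : setupStatements("wfsv_test")) {
            System.out.println(sql + ";");
        }
        // with a live java.sql.Connection conn, each statement would run as:
        //   try (Statement st = conn.createStatement()) { st.execute(sql); }
    }
}
```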

* register the datastore and the feature types in the mock

Just that this way the whole thing is no longer a "mock"? I'd be happy if we called the test "functional" somehow. When the mock config was born, I used the property datastore just for convenience, but it would have been the same with a memory datastore. The point was to have a reliable DataStore to work over, so as not to worry about whether the datastore works well or not, and to concentrate on testing GeoServer functionality. So it seemed like a reasonable mock-up of a GeoServer config, except to a purist. Throwing real stuff into the test config breaks its mock nature even more, which is fine since the test is functional.

Hem... we're splitting hairs here imho.
Yes, you're right, MockData is not really a mock, it's a tool to create
a full data directory.
Yes, you're right, one should use a minimal datastore for testing, but
at the moment there is no such thing for versioning (one would have to
write a versioning property datastore), so I need the real thing.
So in the end, in order to concentrate on testing GeoServer, I need the
versioning postgis datastore anyway, or I'll end up doing no tests
whatsoever. Can you suggest alternate routes (with reasonable
development times attached to them)?

Cheers
Andrea

On Friday 21 December 2007 05:31:30 pm Andrea Aime wrote:

Gabriel Roldán wrote:
> Hi Andrea,
> I like the first approach, though what about the following inline
> comments:
>
> On Friday 21 December 2007 10:48:55 am Andrea Aime wrote:
>> ...With that one in place, one could conceivably:
>> * have a property file with the connection params to a dbms
>
> It'd be nice if the property file with ds connection parameters were
> located and looked up like the ones for the GeoTools online tests that
> use a "fixture".
>
>> * load it, see if it's possible to connect to the db. If not,
>> skip the test
>
> So instead: if the fixture exists, try to connect; if you can't connect,
> fail, because the fixture existing means you expect the test to pass. If
> there's no fixture, skip the test.

Hum, the whole fixture concept seems to be either badly implemented
or badly documented in GeoTools, or else I never found docs about it.
In the postgis datastore, for example, the property files are looked up
in the test module itself, and I've always modified them locally, and
it's still working. So the fixture is always there, but you may not be
able to connect using it.

Well, I seem to recall making use of fixtures and having them work, though
the memory is vague and things may have changed...

>> * if yes, execute enough sql calls to setup the db for the
>> test
>
> I'm not so sure about this one, but what about using the plain DataStore
> API to populate it? Wouldn't plain SQL be too low level, so that you'd
> end up tied to a single db type? After all, the DataStore working
> correctly is a prerequisite for the test to run.

Because createSchema is badly implemented (when implemented at all)
in most datastores. So if I used it, I would end up debugging feature
type creation code instead of doing what I'm supposed to do (test wfsv).

aha, fair enough, I hadn't thought of that

>> * register the datastore and the feature types in the mock
>
> Just that this way the whole thing is no longer a "mock"? I'd be happy
> if we called the test "functional" somehow. When the mock config was
> born, I used the property datastore just for convenience, but it would
> have been the same with a memory datastore. The point was to have a
> reliable DataStore to work over, so as not to worry about whether the
> datastore works well or not, and to concentrate on testing GeoServer
> functionality. So it seemed like a reasonable mock-up of a GeoServer
> config, except to a purist. Throwing real stuff into the test config
> breaks its mock nature even more, which is fine since the test is
> functional.

Hem... we're splitting hairs here imho.
Yes, you're right, MockData is not really a mock, it's a tool to create
a full data directory.
Yes, you're right, one should use a minimal datastore for testing, but
at the moment there is no such thing for versioning (one would have to
write a versioning property datastore), so I need the real thing.
So in the end, in order to concentrate on testing GeoServer, I need the
versioning postgis datastore anyway, or I'll end up doing no tests
whatsoever. Can you suggest alternate routes (with reasonable
development times attached to them)?

Yes, actually I was just trying to propose the simplest change possible,
a naming one, just so we don't get things confused. I'd say go for it as
you proposed, but change Mock to something else.
MockData -> FunctionalTestData?
I don't want to spend a lot of time on it either, unless we have a
mandate, aka working hours to dedicate.

Cheers,

Gabriel



Just so I am clear, Andrea: does this mean that the regular "mock data"
feature types and datastores would be loaded as well, and then the test
case would just call a method passing in the datastore params?

Not sure how I feel about this... indeed it seems like a simpler
change than the previous proposal... however I liked the idea of
factoring out a "GeoServerTestData" base class and adding subclasses:

"MockGeoServerTestData"
"LiveGeoServerTestData"
"ProgrammaticGeoServerTestData"

etc... allowing the test to choose which implementation is used. I think
it has the ability to lead to a lot less test setup... as the tests
that run "live" don't need to build all the mock types.
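A minimal sketch of that hierarchy; only the class names and the isAvailable() idea come from this thread, the rest (method signatures, the data-dir check) is guesswork:

```java
import java.io.File;

// Sketch of the proposed base-class factoring. Nested classes keep the
// example self-contained; a real version would be top-level classes.
public class TestDataHierarchySketch {

    abstract static class GeoServerTestData {
        abstract boolean isAvailable();          // can the tests run here?
        abstract File setUp() throws Exception;  // yields the data dir
        void tearDown() throws Exception { }     // optional cleanup
    }

    // "live" data: points at an existing, shipped data directory
    static class LiveGeoServerTestData extends GeoServerTestData {
        final File dataDir;
        LiveGeoServerTestData(File dataDir) { this.dataDir = dataDir; }
        boolean isAvailable() { return dataDir.isDirectory(); }
        File setUp() { return dataDir; }
    }

    public static void main(String[] args) {
        GeoServerTestData data =
            new LiveGeoServerTestData(new File("no/such/data/dir"));
        // the directory does not exist, so the tests would be skipped
        System.out.println(data.isAvailable() ? "run" : "skip");
    }
}
```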

Another thing I like about the "live" data idea, as opposed to the
"programmatic" one, is that it actually gives us a way to test the data
directories that actually get shipped, as opposed to just the ones we
cook up ourselves.

My 2c

-Justin

Andrea Aime wrote:

Hi,
I was starting to work on a wfsv testing base class that could
use a pre-cooked data dir but I'm having second thoughts, so I
guess I better discuss this with the other developers.

An alternative solution would be to each MockData to add a generic
datastore reference and a generic feature type reference, by
passing connection params to the first, and a map of informations
sufficient to build info.xml for the second (things like crs,
bounds, crs handling and so on).

With that one in place, one could conceivably:
* have a property file with the connection params to a dbms
* load it, see if it's possible to connect to the db. If not,
   skip the test
* if yes, execute enough sql calls to setup the db for the
   test
* register the datastore and the feature types in the mock
* run the test as usual

Seems cleaner than having a pre cooked data dir because the
test is in control, does not have to copy the data dir
around and parse catalog.xml to grab the infos.

Yet, building a data dir programmaticaly _may_ be quite a bit
of work whilst a pre-cooked one can be setup using the
geoserver user interface.

I'm lending more towards the first option now, but I'd
like to hear other people opinions.

Cheers
Andrea

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel


--
Justin Deoliveira
The Open Planning Project
http://topp.openplans.org

Justin Deoliveira wrote:

Just so I am clear, Andrea: does this mean that the regular "mock data"
feature types and datastores would be loaded as well, and then the test
case would just call a method passing in the datastore params?

No, it does not mean that. There is a method called populateDataDirectory
that can be overridden by subclasses. By default it adds all the well
known types used in wfs, but in wcs it does not, and adds my testing
coverages instead. So you can have whatever you want.
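For illustration, the override pattern could look like this; populateDataDirectory is the method named above, while the class bodies and type names are invented:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the populateDataDirectory hook: the base class registers
// the well-known wfs types, subclasses replace them with their own.
public class PopulateSketch {

    static class MockData {
        final List<String> typeNames = new ArrayList<>();

        void addWellKnownTypes() { // default wfs set; names invented
            typeNames.add("cite:Buildings");
            typeNames.add("cite:Bridges");
        }

        // subclasses override this to register only what they need
        protected void populateDataDirectory() {
            addWellKnownTypes();
        }
    }

    static class WcsMockData extends MockData {
        @Override
        protected void populateDataDirectory() {
            typeNames.add("wcs:TestCoverage"); // coverages, no wfs types
        }
    }

    public static void main(String[] args) {
        MockData wcs = new WcsMockData();
        wcs.populateDataDirectory();
        System.out.println(wcs.typeNames); // only the coverage is registered
    }
}
```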

Not sure how I feel about this... indeed it seems like a simpler
change than the previous proposal... however I liked the idea of
factoring out a "GeoServerTestData" base class and adding subclasses:

"MockGeoServerTestData"
"LiveGeoServerTestData"
"ProgrammaticGeoServerTestData"

etc... allowing the test to choose which implementation is used. I think
it has the ability to lead to a lot less test setup... as the tests
that run "live" don't need to build all the mock types.

Another thing I like about the "live" data idea, as opposed to the
"programmatic" one, is that it actually gives us a way to test the data
directories that actually get shipped, as opposed to just the ones we
cook up ourselves.

Hum, why do you want to run unit tests against them?
I don't know, it's just that using a live one makes it harder to know
whether you are in a position to run the tests or not.

Cheers
Andrea

Another thing I like about the "live" data idea, as opposed to the
"programmatic" one, is that it actually gives us a way to test the data
directories that actually get shipped, as opposed to just the ones we
cook up ourselves.

Hum, why do you want to run unit tests against them?
I don't know, it's just that using a live one makes it harder to know
whether you are in a position to run the tests or not.

To test against the data we ship GeoServer with. One good case I can
think of is testing all the sample requests so that we know when they
break. Or perhaps testing out the georss and gmaps demos... just an
idea, perhaps not a good one.

About knowing whether the test should run or not... I think it would be
pretty easy to turn these tests off. We could add a method to the
"GeoServerTestData" base class called "isAvailable()", and if it returned
false, skip the tests. In the "live" case it would simply test whether
the data directory set actually exists.




Justin Deoliveira wrote:
...

About knowing whether the test should run or not... I think it would be
pretty easy to turn these tests off. We could add a method to the
"GeoServerTestData" base class called "isAvailable()", and if it returned
false, skip the tests. In the "live" case it would simply test whether
the data directory set actually exists.

Yeah, that's exactly where I'm losing you. So far I've seen two ideas
of a live data dir:
* the one I proposed, using a data dir available as a resource in
   the classpath. It's always there, but you may not have the external
   data stores to connect to
* the one you propose, the release config, which again is always
   there (if you check out from the root at least)

It seems to me the isAvailable method should be coded against the
availability of datastores, not against the availability of data dirs.
What am I missing?
Cheers
Andrea

Yeah, that's exactly where I'm losing you. So far I've seen two ideas
of a live data dir:
* the one I proposed, using a data dir available as a resource in
  the classpath. It's always there, but you may not have the external
  data stores to connect to
* the one you propose, the release config, which again is always
  there (if you check out from the root at least)

It seems to me the isAvailable method should be coded against the
availability of datastores, not against the availability of data dirs.
What am I missing?
Cheers
Andrea

Ok, I am lost now too... but sure, isAvailable() could walk through the
datastores and check if they are all available. As for always being
there, I guess I envisioned this being something that we want to easily
turn off and probably only run on the build server. Maybe not though...
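A sketch of that walk; the store ids are invented and "connects" stands in for actually trying each registered datastore:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: isAvailable() walks the registered datastores,
// and the whole test data set is unavailable if any one of them is
// unreachable.
public class AvailabilitySketch {

    static boolean isAvailable(List<String> storeIds, Predicate<String> connects) {
        for (String id : storeIds) {
            if (!connects.test(id)) {
                return false; // one unreachable store disables the tests
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> stores = Arrays.asList("postgis-versioned", "shapefiles");
        // pretend the postgis store is down: the whole suite is skipped
        System.out.println(isAvailable(stores, id -> !id.startsWith("postgis"))); // false
    }
}
```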



Justin Deoliveira wrote:

Yeah, that's exactly where I'm losing you. So far I've seen two ideas
of a live data dir:
* the one I proposed, using a data dir available as a resource in
  the classpath. It's always there, but you may not have the external
  data stores to connect to
* the one you propose, the release config, which again is always
  there (if you check out from the root at least)

It seems to me the isAvailable method should be coded against the
availability of datastores, not against the availability of data dirs.
What am I missing?
Cheers
Andrea

Ok, I am lost now too... but sure, isAvailable() could walk through the
datastores and check if they are all available.

Another detail I was a little twisted about was how to handle
data dir modifications:
* do we assume these live tests do not alter the data or the config,
   and thus run them against the data dir as is, or
* do we deep copy whatever is in there into a temp location before
   starting (that might be quite slow; the release configuration is 12MB
   along with the .svn dirs, so probably 6MB if we avoid them)?
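The deep-copy option could be sketched like this, skipping .svn directories so the subversion metadata is not dragged along; everything here is illustrative:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Sketch of "deep copy the data dir to a temp location", skipping
// .svn directories to roughly halve the amount of data copied.
public class DataDirCopySketch {

    static void copyDataDir(File src, File dest) throws IOException {
        if (src.isDirectory()) {
            if (".svn".equals(src.getName())) {
                return; // don't copy subversion metadata
            }
            dest.mkdirs();
            File[] children = src.listFiles();
            if (children != null) {
                for (File child : children) {
                    copyDataDir(child, new File(dest, child.getName()));
                }
            }
        } else {
            Files.copy(src.toPath(), dest.toPath(),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        File src = Files.createTempDirectory("datadir").toFile();
        new File(src, ".svn").mkdirs();
        Files.writeString(new File(src, "catalog.xml").toPath(), "<catalog/>");
        File dest = Files.createTempDirectory("copy").toFile();
        copyDataDir(src, dest);
        System.out.println(new File(dest, "catalog.xml").exists()); // true
        System.out.println(new File(dest, ".svn").exists());        // false
    }
}
```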

As for always being there, I guess I envisioned this being something
that we want to easily turn off and probably only run on the build
server. Maybe not though...

You mean these kinds of tests should be run only when a special "extra tests" flag is raised?
Cheers
Andrea

Another detail I was a little twisted about was how to handle
data dir modifications:
* do we assume these live tests do not alter the data or the config,
  and thus run them against the data dir as is, or
* do we deep copy whatever is in there into a temp location before
  starting (that might be quite slow; the release configuration is 12MB
  along with the .svn dirs, so probably 6MB if we avoid them)?

I guess this is really the same problem that we have now with the
property datastore, and knowing when to rewrite the property files.

I wonder... what about a checksum scheme to detect when files change?
It could be applied to both cases and might give us the performance
boost we need. Some googling yields a library called "jacksum". Should
be easy enough to try out with our current unit tests to see if it's
feasible.
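A sketch of the checksum idea using the JDK's MessageDigest rather than the jacksum library mentioned above; names are invented:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: hash a file's bytes and only rewrite / re-copy
// it when the digest differs from the one recorded earlier.
public class ChecksumSketch {

    static String digest(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e); // never on a standard JRE
        }
    }

    static boolean changed(String knownDigest, byte[] current) {
        return !digest(current).equals(knownDigest);
    }

    public static void main(String[] args) {
        byte[] original = "the property file".getBytes(StandardCharsets.UTF_8);
        String known = digest(original);
        System.out.println(changed(known, original)); // false: leave it alone
        byte[] edited = "the edited file".getBytes(StandardCharsets.UTF_8);
        System.out.println(changed(known, edited));   // true: rewrite it
    }
}
```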

As for always being there, I guess I envisioned this being something
that we want to easily turn off and probably only run on the build
server. Maybe not though...

You mean these kinds of tests should be run only when a special "extra
tests" flag is raised?

That is what I was thinking... sort of how "online" or "stress" tests
work in geotools.




Justin Deoliveira wrote:

Another detail I was a little twisted about was how to handle
data dir modifications:
* do we assume these live tests do not alter the data or the config,
  and thus run them against the data dir as is, or
* do we deep copy whatever is in there into a temp location before
  starting (that might be quite slow; the release configuration is 12MB
  along with the .svn dirs, so probably 6MB if we avoid them)?

I guess this is really the same problem that we have now with the
property datastore, and knowing when to rewrite the property files.

I wonder... what about a checksum scheme to detect when files change?
It could be applied to both cases and might give us the performance
boost we need. Some googling yields a library called "jacksum". Should
be easy enough to try out with our current unit tests to see if it's
feasible.

Hmmm... that's not really what I meant. I was speaking about avoiding
the deep copy to start with, that is, working with the original version
of the data dir.
A checksum scheme assumes you've already done a deep copy and have to
decide whether to keep it or replace it with the original one (totally
or partially).

My point was that working against a copy of the data dir requires a deep
copy of it, and that takes quite some time to start with.

Cheers
Andrea