[GRASS-dev] G7: turning manual page examples into test cases

Hi,

in the recent past I added a series of examples to various manual
pages. Most of them might qualify for (basic) testing of the
respective command.
Since it is a bit time-consuming to write these standard tests
manually, would there be a chance to develop a test case generator,
e.g. driven by a template?

Markus

On Wed, Nov 26, 2014 at 10:44 AM, Markus Neteler <neteler@osgeo.org> wrote:

Hi,

in the recent past I added a series of examples to various manual
pages. Most of them might qualify for (basic) testing of the
respective command.

I totally agree. I have had the same idea in my mind for some time and I
would really like to see it used, but I had no time to implement it.

Basic functionality can be implemented quite simply since the testing
framework supports bash/sh files. You extract the code from the manual page
into a .sh file, put all these files into proper directories (there are
several options for what proper means), and then you can run the
test-running script on this directory structure instead of on the source
code. However, this approach will not work on MS Windows, but there are
some possible adjustments (for example, different language tabs and/or a
(limited) automatic conversion [1]).
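
A minimal sketch of such an extraction, assuming the manual pages keep
their code in <div class="code"><pre> blocks (the class name, command
line interface and output path are illustrative, not an existing tool):

{{{
#!/usr/bin/env python
"""Extract <div class="code"><pre> blocks from a manual page into a
.sh file. A rough sketch only: it assumes the GRASS manual page layout
and makes no attempt to separate commands from their printed output."""
import sys
from html.parser import HTMLParser


class CodeBlockExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_code_div = False  # inside <div class="code">
        self.in_pre = False       # inside the nested <pre>
        self.blocks = []          # collected code snippets

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and "code" in (attrs.get("class") or "").split():
            self.in_code_div = True
        elif tag == "pre" and self.in_code_div:
            self.in_pre = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "pre":
            self.in_pre = False
        elif tag == "div":
            self.in_code_div = False

    def handle_data(self, data):
        if self.in_pre:
            self.blocks[-1] += data


if __name__ == "__main__":
    # Usage: extract_examples.py r.slope.aspect.html r_slope_aspect.sh
    extractor = CodeBlockExtractor()
    with open(sys.argv[1]) as html_file:
        extractor.feed(html_file.read())
    with open(sys.argv[2], "w") as sh_file:
        sh_file.write("#!/bin/sh\n")
        for block in extractor.blocks:
            sh_file.write(block.strip() + "\n")
}}}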

What is challenging and requires some thinking is how to test the
results and what to do with d.* commands. Not testing the results and
removing the d.* commands seems like a good option for a start.

This would be a nice analogy to Python doctest, which also points us
towards the fact that the main purpose of this (and of doctest) should be
testing of documentation rather than using documentation for testing (or
instead of testing). In other words, I think that the purpose of this is
to find out whether the examples are still valid (modules and options
exist and map names are not wrong). As a bonus this will test whether the
module starts, and in some cases it might test whether the expected maps
were created.
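
For comparison, a minimal Python doctest, where the example in the
docstring doubles as a test of the documentation itself (the function
is made up for illustration):

{{{
def slope_percent(rise, run):
    """Return slope in percent.

    The example below is executed by doctest, so it fails as soon as
    the documented behavior drifts away from the implementation:

    >>> slope_percent(5, 100)
    5.0
    """
    return 100.0 * rise / run


if __name__ == "__main__":
    import doctest
    doctest.testmod()
}}}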

Since it is a bit time-consuming to write these standard tests
manually, would there be a chance to develop a test case generator,
e.g. driven by a template?

Can you tell us more about your idea? It seems much more advanced than
the basic approach I'm suggesting.

Thanks for bringing this up,
Vashek

[1] http://fatra.cnr.ncsu.edu/temporal-grass-workshop/


This almost sounds like the PostGIS Garden Test: http://trac.osgeo.org/postgis/wiki/DevWikiGardenTest

Robert Moskovitz
California Geological Survey
Seismic Hazards Zonation Program


Hi,
that's a nice idea indeed.

However, such auto-generated tests can only verify whether the called
modules can be executed and whether they return 0.
They cannot replace tests that verify the result of the processing or
the correct error handling.

The next problem is how to distinguish between a module/shell command
and the generated output in the manual page. I have plenty of examples
in the temporal module manual pages that provide the stdout/stderr
output of the called module in the code section.

Hence, it would be meaningful to mark the code sections in the manual
page that can be used for testing purposes.
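
To illustrate the problem, a rough sketch of one possible heuristic,
assuming command lines can be recognized by a GRASS module name prefix;
this is only an illustration, not a proposed rule:

{{{
# A code section from a manual page often mixes commands and output:
SECTION = """\
g.region rast=elevation -p
projection: 99 (Lambert Conformal Conic)
datum:      nad83
r.info -g map=elevation
north=228500
south=215000
"""

# Assumption: output lines never start with one of these prefixes.
MODULE_PREFIXES = ("g.", "r.", "v.", "i.", "d.", "t.", "db.")


def split_commands_and_output(text):
    """Separate probable command lines from probable output lines."""
    commands, output = [], []
    for line in text.splitlines():
        if line.startswith(MODULE_PREFIXES):
            commands.append(line)
        else:
            output.append(line)
    return commands, output


commands, output = split_commands_and_output(SECTION)
print("commands:", commands)
}}}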

Best regards
Soeren


On Thu, Nov 27, 2014 at 8:19 AM, Sören Gebbert <soerengebbert@googlemail.com> wrote:

The next problem is how to distinguish between a module/shell command
and the generated output in the manual page. I have plenty of examples
in the temporal module manual pages that provide the stdout/stderr
output of the called module in the code section.

Hence, it would be meaningful to mark the code sections in the manual
page that can be used for testing purposes.

I agree with more markup in manual pages for these purposes, either a
different class or some HTML5 tag.

On Wed, Nov 26, 2014 at 12:30 PM, Moskovitz, Bob@DOC <Bob.Moskovitz@conservation.ca.gov> wrote:

This almost sounds like the PostGIS Garden Test:
http://trac.osgeo.org/postgis/wiki/DevWikiGardenTest

Thanks for the link. This is also a good idea, although I'm not sure
how applicable it is to GRASS GIS modules.

As they say, it is (close to) monkey testing [1], where you input random
data into your algorithms and look at whether they fail. They get the
information about the algorithms from the documentation. Speaking about
modules, we actually have better mechanisms for getting the algorithms and
their interfaces (listing the modules on the path or in the GUI toolbox).
However, this would apply to the C and Python interfaces.
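
As an aside, a sketch of what such an interface-driven check could look
like: list the GRASS modules found on the PATH and ask each one for its
interface description (--interface-description is a standard flag of
GRASS modules; everything else, including running inside a GRASS
session, is an assumption):

{{{
import os
import subprocess

# Prefixes that identify GRASS modules among executables on the PATH.
PREFIXES = ("g.", "r.", "v.", "i.", "d.", "t.", "db.")


def find_grass_modules():
    """Yield names of GRASS modules found on the PATH."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if not os.path.isdir(directory):
            continue
        for name in os.listdir(directory):
            if name.startswith(PREFIXES):
                yield name


def module_starts(name):
    """Return True if the module at least describes its own interface."""
    result = subprocess.run([name, "--interface-description"],
                            capture_output=True, timeout=30)
    return result.returncode == 0


for module in sorted(set(find_grass_modules())):
    print(module, "OK" if module_starts(module) else "FAILED")
}}}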

Now, I would focus more on the current topic: the tests extracted from
examples in the documentation. Examples use the basic functionality and
have nice data, so it should not be hard to pass the tests.

[1] http://en.wikipedia.org/wiki/Monkey_test

Hi,

I've created a Python script which extracts code from an HTML file and puts it into a Bash script. It works well for our course websites, but even with some fixes it does not work for the GRASS HTML manual pages. It might work well for some modules, so you can try it (both attached). Now, I have the following ideas in case you want to dive into it on your own.

The script should be turned into a library function and generalized to work with different HTML styles or even different markups (I think that something like the template method design pattern should work).

However, it is necessary to distinguish code from other things in the HTML. We should look at HTML5 at that point to make the change right once we are doing it. Basically, we need to make the HTML more semantic. Perhaps some additional classes or attributes will also be needed to mark code which should not be executed.
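
For example, the extraction could then skip blocks carrying such a
marker; the "noexec" class below is hypothetical, only the need for
some marker is what the thread agrees on:

{{{
from html.parser import HTMLParser


class RunnableCodeExtractor(HTMLParser):
    """Collect <div class="code"> blocks, skipping marked ones."""

    def __init__(self):
        super().__init__()
        self.collecting = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "code" in classes:
            # Only collect blocks not explicitly marked as noexec.
            self.collecting = "noexec" not in classes
            if self.collecting:
                self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "div":
            self.collecting = False

    def handle_data(self, data):
        if self.collecting:
            self.blocks[-1] += data
}}}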

Taking this idea further, we might have different languages on the page, such as Bash, Python and some platform-neutral command line. Pretty much (visually) what we have here:

http://fatra.cnr.ncsu.edu/temporal-grass-workshop/

This would also help with solving the issue of running Bash/Shell on MS Windows, where just the Python and platform-neutral code would be executed (which is better than nothing). Soeren's shell interpreter (or shell-to-Python translator, http://grasswiki.osgeo.org/wiki/Test_Suite) is of course always an option if Bash will have to be used everywhere for lack of other tests and examples. But even this requires keeping some rules for writing examples; consider the following diff, for example ("make the bash command copy-paste-able to the command line"):

--- vector/v.net/v.net.html     (revision 63739)
+++ vector/v.net/v.net.html     (working copy)
@@ -170,10 +170,10 @@
 <div class="code"><pre>
 v.net points=geodetic_swwake_pts output=geodetic_swwake_pts_net \
        operation=arcs file=- &lt;&lt; EOF
-> 1 28000 28005
-> 2 27945 27958
-> 3 27886 27897
-> EOF
+1 28000 28005
+2 27945 27958
+3 27886 27897
+EOF
 </pre></div>

Currently it seems to me that the way to go is running the tests generated from documentation only when requested. Requested in this case means writing a test file which would use some special API which would generate and execute the documentation-based test. This is the way it works for Python doctest, too. The implementation would be an on-the-fly generated bash script executed from the Python test. An alternative is each example turned into one test method, but this wouldn't work because the order of method execution is not preserved (by Python unittest). So, one HTML file must become one bash script.
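
A sketch of what such an opt-in test file could look like; the imports
follow the gunittest layout, but run_documentation_examples is a
hypothetical API, not an existing gunittest function:

{{{
from grass.gunittest.case import TestCase
from grass.gunittest.main import test

# Hypothetical helper that extracts the examples from the manual page,
# writes them into one bash script, and executes it.
from grass.gunittest.doc_test import run_documentation_examples


class TestManualExamples(TestCase):

    def test_v_net_examples(self):
        # Fail if any command in the generated script exits nonzero.
        run_documentation_examples(self, page="v.net.html")


if __name__ == "__main__":
    test()
}}}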

The approach above allows module-specific filtering of the examples to test and clean integration of these tests into the module's testsuite. The obvious disadvantage of course is that one has to provide all the test files for each module, but the alternative would actually be a large number of failing tests due to strange (not necessarily wrong) documentation, and a lot of failing tests by default is always wrong.

The approach with manual addition of tests actually has one great feature: a test can specify which maps should be checked against which references, which can be provided (and checked) in the same way as for all the other scripts.

As this would not work for all modules unless explicitly requested, there is still some place for some kind of monkey testing which would be based just on the interface of the module. However, I would prefer not to have these tests, which might be failing by default, together with the standard hand-written tests. Anyway, sometimes the number of errors might be lowered just by requiring the modules to fail properly with an error message (rather than segfaulting). Another interesting approach is to define the interface of modules so well that the test data can be generated in a way which in fact fulfills all the advanced requirements the module can have (e.g. raster values in the range 0-255). This of course goes far into data provenance and software metadata topics.

Vaclav

(attachments)

script_to_extract_tests_from_course.py (4.95 KB)

Hi Vaclav,
many thanks for taking the initiative here. You are pointing out a
nice and very interesting approach.

I fully agree with you that using the Python doctest approach in the
GRASS HTML module documentation is a great idea.
However, I am not a big fan of bash test scripts anymore, since we
have the beautiful gunittest suite. Hence, I would like to suggest
using a specific syntax in the documentation to define tests that
will then be translated into the corresponding gunittest Python code.

Tests should be marked as such in the HTML code using a comment in the
division block that has the keyword TEST in it:

{{{
<div class"code">
<-- usable as TEST required LOCATION nc -->
<pre>
...
</pre>
</div>
}}}

Modules that should be tested are marked as such by bash comments
using 3 hash tags, and pre-/post-processing modules using 2 hash tags.
Here is the example code of an r.slope.aspect call in the div block,
including pre- and post-processing as well as a key-value output check:

{{{
## Some pre-processing
g.region rast=elevation

### Here we call the module
r.slope.aspect elevation=elevation aspect=aspect_new slope=slope_new

## Let's have a look at the created raster map by generating key-value output
r.info -g map=aspect_new
north=228500
south=215000
east=645000
west=630000
nsres=10
ewres=10
rows=1350
cols=1500
cells=2025000
datatype=FCELL
ncats=360

}}}

The shell code can still be copied and pasted into a shell for execution.

The HTML parser will be able to detect module calls by analyzing the
comments before a command. It is required that all module options are
written as fully qualified names without abbreviation.
This will allow us to use the gunittest framework to create the Python
test code:

The parser detects the module name and one option (this should be very
simple to implement) and creates Python code such as:
{{{
import grass.gunittest as gunit
module = gunit.gmodules.SimpleModule("g.region")
module.inputs["raster"].value = "elevation"
gunit.TestCase.runModule(module)
}}}

The same will be done for r.slope.aspect, but this module should
actually be tested, since it is introduced by 3 hash tags:
{{{
import grass.gunittest as gunit
module = gunit.gmodules.SimpleModule("r.slope.aspect")
module.inputs["elevation"].value = "elevation"
module.outputs["aspect"].value = "aspect_new"
module.outputs["slope"].value = "slope_new"
gunit.TestCase.assertModule(module)
}}}

The module r.info with the key-value check:
{{{
import grass.gunittest as gunit
module = gunit.gmodules.SimpleModule("r.info")
module.inputs["map"].value = "aspect_new"
module.flags["g"].value = True
gunit.TestCase.assertModuleKeyValue(module, reference=content)
}}}

The parser will find the key-value content right after the r.info call
(like in Python doctest) and will know that this is the output of
r.info in key-value format, since the keyword "key-value" is present
in the comment section of the command.
Hence the output of r.info is checked against this content using the
gunittest key-value test method.
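
A sketch of how a parser could classify the lines of such a block by
the preceding hash-tag comments (the two/three hash tag convention is
from the proposal above; the output heuristic is an assumption):

{{{
def classify_example(lines):
    """Classify shell lines as tested call, helper calls, or output.

    '###' marks the module under test, '##' marks pre-/post-processing;
    lines that do not look like commands are treated as expected output
    of the previous command.
    """
    mode = "helper"
    tested, helpers, expected_output = [], [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("###"):
            mode = "test"
        elif line.startswith("##"):
            mode = "helper"
        elif "=" in line.split()[0]:
            # Crude heuristic: lines like 'north=228500' are output.
            expected_output.append(line)
        elif mode == "test":
            tested.append(line)
            mode = "helper"  # only the next command is the tested one
        else:
            helpers.append(line)
    return tested, helpers, expected_output
}}}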

IMHO this concept can be used to provide several gunittest functions
in the bash-style examples. We can specify several keywords that can
appear in the module comment section to specify ASCII output,
key-value output and so on. In addition, these kinds of tests should
also run on Windows.

This approach will indeed limit the usage of bash commands in the test
creation, but I think this is acceptable and it keeps the examples
more readable.

What do you think?

Best regards
Soeren
