[GRASS-dev] Upcoming 7.2.0: review which addons to move to core

On Tue, Oct 4, 2016 at 4:22 PM, Markus Metz
<markus.metz.giswork@gmail.com> wrote:

On Tue, Oct 4, 2016 at 5:42 PM, Sören Gebbert
<soerengebbert@googlemail.com> wrote:

Hi,

>
> You are very welcome to write the missing tests for core modules.
>
> However, I don't understand the argument that because many core modules
> have no tests, therefore new modules don't need them. If developers of
> addon modules are serious about the attempt to make their modules usable
> and maintainable for others, then they have to implement tests. It's an
> integral part of the development process, and GRASS has a beautiful test
> environment that makes writing tests easy. Tests and documentation are part
> of coding and not something special. I don't think this is too hard a
> requirement.
>
> There is a nice statement that is not far from the truth: Untested code
> is broken code.

these gunittests only test if a module output stays the same. This

This is simply wrong, please read the gunittest documentation.

but then why does

The gunittest for the v.stream.order addon is an example of how it's done:
https://trac.osgeo.org/grass/browser/grass-addons/grass7/vector/v.stream.order/testsuite/test_stream_order.py

assume certain order numbers for features 4 and 7? What if these order
numbers are wrong?

Recently I fixed bugs in r.stream.order, related to stream length
calculations which are in turn used to determine stream orders. The
gunittest did not pick up 1) the bugs, 2) the bug fixes.

You can write gunittests that will test every flag, every option, their
combinations, and any output of a module. I have implemented plenty of tests
that check for correct error handling. Writing tests is effort, but you have
to do it anyway. Why not implement a gunittest for every feature while
developing the module?
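
For illustration, a minimal gunittest sketch in that spirit (the module name
"r.some.module", the options and the flag below are hypothetical, chosen only
to show the pattern):

from grass.gunittest.case import TestCase
from grass.gunittest.main import test

class TestSomeModule(TestCase):
    """Sketch: check error handling and a flag/option combination."""

    def test_missing_required_option_fails(self):
        # the module must fail cleanly if a required option is missing
        self.assertModuleFail("r.some.module", output="result")

    def test_flag_and_option_combination(self):
        # the module must succeed with a valid flag/option combination
        self.assertModule("r.some.module", input="elevation",
                          output="result", threshold=100, flags="m")

if __name__ == "__main__":
    test()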

My guess for the r.stream.* modules is at least 40 man hours of
testing to make sure they work correctly. That includes evaluation of
float usage, handling of NULL data, comparison of results with and
without the -m flag. Testing should be done with both high-res (LIDAR)
and low-res (e.g. SRTM) DEMs.

Tests can be performed on artificial data that test all aspects of the
algorithm. Tests that show the correctness of the algorithm for specific
small cases should be preferred. However, large data should not be an
obstacle to writing a test.

I agree, for tests during development, not for gunittests.

From the examples I read, gunittests expect a specific output. If the
expected output (obtained with an assumed correct version of the
module) is wrong, the gunittest is bogus. gunittests are ok to make
sure the output does not change, but not ok to make sure the output is
correct. Two random examples are r.stream.order and r.univar.

I am not sure why we are discussing this; it's pretty obvious that
gunittests can serve to a) test inputs/outputs, b) catch changes in
results (whether correct or incorrect), and c) test correctness of results.
It just depends on how you write them, and yes, for some modules c) is
more difficult to implement than for others.

Anna

Markus M

Best regards
Soeren

my2c

Markus M

>
> Best
> Sören
>
>>
>> One thing we could think about is activating the toolbox idea a bit
>> more
>> and creating a specific OBIA toolbox in addons.
>>
>>> Identified candidates could be added to core once they fulfill the
>>> requirements above. Would that happen only in minor releases or would
>>> that also be possible in point releases?
>>
>>
>> Adding modules to core is not an API change, so I don't see why they
>> can't
>> be added at any time. But then again, having a series of new modules
>> can be
>> sufficient to justify a new minor release ;-)
>>
>>> Or is that already too much formality and if someone wishes to see an
>>> addon in core that is simply discussed on ML?
>>
>>
>> Generally, I would think that discussion on ML is the best way to
>> handle
>> this.
>>
>> Moritz
>>

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

2016-10-04 22:22 GMT+02:00 Markus Metz <markus.metz.giswork@gmail.com>:

On Tue, Oct 4, 2016 at 5:42 PM, Sören Gebbert
<soerengebbert@googlemail.com> wrote:
> Hi,
>>
>>
>> >
>> > You are very welcome to write the missing tests for core modules.
>> >
>> > However, I don't understand the argument that because many core modules
>> > have no tests, therefore new modules don't need them. If developers of
>> > addon modules are serious about the attempt to make their modules usable
>> > and maintainable for others, then they have to implement tests. It's an
>> > integral part of the development process, and GRASS has a beautiful test
>> > environment that makes writing tests easy. Tests and documentation are
>> > part of coding and not something special. I don't think this is too hard
>> > a requirement.
>> >
>> > There is a nice statement that is not far from the truth: Untested code
>> > is broken code.
>>
>> these gunittests only test if a module output stays the same. This
>
>
> This is simply wrong, please read the gunittest documentation.

but then why does
>
> The gunittest for the v.stream.order addon is an example of how it's done:
> https://trac.osgeo.org/grass/browser/grass-addons/grass7/vector/v.stream.order/testsuite/test_stream_order.py

assume certain order numbers for features 4 and 7? What if these order
numbers are wrong?

The checked order numbers are validated by hand. The test example is based
on artificial data that I have created, for which I know what the correct
order numbers are. Hence I can test whether certain features have specific
order numbers, since I know the correct solution.
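
Roughly sketched, the idea looks like this (this is not the actual testsuite
linked above; the map name, column name and the category/order pairs are made
up for illustration):

from grass.gunittest.case import TestCase
from grass.gunittest.main import test
from grass.script import read_command

class TestKnownOrderValues(TestCase):

    def test_strahler_order_of_selected_features(self):
        # expected orders were determined by hand on the artificial input data
        expected = {4: 2, 7: 3}  # hypothetical category -> Strahler order
        for cat, order in expected.items():
            value = read_command("v.db.select", map="stream_order_result",
                                 columns="strahler", flags="c",
                                 where="cat = %d" % cat).strip()
            self.assertEqual(int(value), order)

if __name__ == "__main__":
    test()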

Recently I fixed bugs in r.stream.order, related to stream length
calculations which are in turn used to determine stream orders. The
gunittest did not pick up 1) the bugs, 2) the bug fixes.

Then better test implementations are required that check for correct
output. If a bug was found, a test should be written to check the bugfix.
Have a look at this commit that adds two new tests to validate the provided
bugfix:

https://trac.osgeo.org/grass/changeset/69586

A one-line bugfix and 50 lines of test code. :-)

>
> You can write gunittests that will test every flag, every option, their
> combinations, and any output of a module. I have implemented plenty of tests
> that check for correct error handling. Writing tests is effort, but you have
> to do it anyway. Why not implement a gunittest for every feature while
> developing the module?
>>
>>
>> My guess for the r.stream.* modules is at least 40 man hours of
>> testing to make sure they work correctly. That includes evaluation of
>> float usage, handling of NULL data, comparison of results with and
>> without the -m flag. Testing should be done with both high-res (LIDAR)
>> and low-res (e.g. SRTM) DEMs.
>
>
> Tests can be performed on artificial data that test all aspects of the
> algorithm. Tests that show the correctness of the algorithm for specific
> small cases should be preferred. However, large data should not be an
> obstacle to writing a test.

I agree, for tests during development, not for gunittests.

From the examples I read, gunittests expect a specific output. If the
expected output (obtained with an assumed correct version of the
module) is wrong, the gunittest is bogus. gunittests are ok to make
sure the output does not change, but not ok to make sure the output is
correct. Two random examples are r.stream.order and r.univar.

I don't understand your argument here, or I have a fundamental problem in
understanding the testing topic.

You have to implement a test that checks for correct output; this is the
point of a test. You have to design a test scenario for which you know
what the correct solution is, and then you test for that solution.
What about r.univar? Create a test map with a specific number of cells
with specific values and test whether r.univar is able to compute the correct
values that you have validated by hand.

-- what is the mean, min and max of 10 cells each with value 5? It's 5! --

The simplest check for that is the raster range check in gunittest. If
you know what the range of the resulting raster map has to be, then you can
test for it. If this is not enough, then you can check against the
univariate statistics output of the raster map, since you know for sure
what the result is for min, mean, median, max and so on. If this is not
sufficient, use r.out.ascii and check against the correct solution that you
have created beforehand. If even that is not sufficient, then use PyGRASS and
investigate each raster cell of the resulting output map.
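
As a sketch of the range and univariate-statistics checks described above
(the map name "const5" and the tiny temporary region are made up; the map is
the 10-cells-with-value-5 example, and r.out.ascii or a per-cell PyGRASS
check would follow the same pattern):

from grass.gunittest.case import TestCase
from grass.gunittest.main import test

class TestConstantMap(TestCase):

    @classmethod
    def setUpClass(cls):
        cls.use_temp_region()
        # 10 cells in a single row, each with value 5
        cls.runModule("g.region", n=1, s=0, e=10, w=0, res=1)
        cls.runModule("r.mapcalc", expression="const5 = 5", overwrite=True)

    @classmethod
    def tearDownClass(cls):
        cls.runModule("g.remove", type="raster", name="const5", flags="f")
        cls.del_temp_region()

    def test_range(self):
        # simplest check: the raster range must be exactly 5..5
        self.assertRasterMinMax(map="const5", refmin=5, refmax=5)

    def test_univar(self):
        # stricter check: univariate statistics must match the known values
        self.assertRasterFitsUnivar(raster="const5",
                                    reference=dict(n=10, min=5, max=5, mean=5),
                                    precision=1e-6)

if __name__ == "__main__":
    test()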

Best regards
Soeren

On Tue, Oct 4, 2016 at 11:02 PM, Sören Gebbert
<soerengebbert@googlemail.com> wrote:

2016-10-04 22:22 GMT+02:00 Markus Metz <markus.metz.giswork@gmail.com>:

Recently I fixed bugs in r.stream.order, related to stream length
calculations which are in turn used to determine stream orders. The
gunittest did not pick up 1) the bugs, 2) the bug fixes.

sorry for the confusion, r.stream.order does not have any testsuite,
only v.stream.order has.

I agree, for tests during development, not for gunittests.

From the examples I read, gunittests expect a specific output. If the
expected output (obtained with an assumed correct version of the
module) is wrong, the gunittest is bogus. gunittests are ok to make
sure the output does not change, but not ok to make sure the output is
correct. Two random examples are r.stream.order and r.univar.

I don't understand your argument here, or I have a fundamental problem in
understanding the testing topic.

You have to implement a test that checks for correct output; this is the
point of a test.

Exactly. During development, however, you need to run many more tests
until you are confident that the output is correct. Then you submit
the changes. My point is that modules need to be tested thoroughly
during development (which is not always the case), and a testsuite to
make sure that the output matches expectations is nice to have. In
most cases,

<module_name> <args>
if [ $? -ne 0 ] ; then
  echo "ERROR: Module <module_name> failed"
  exit 1
fi

should do the job

no offence ;-)

You have to design a test scenario for which you know
what the correct solution is, and then you test for that solution.
What about r.univar? Create a test map with a specific number of cells
with specific values and test whether r.univar is able to compute the correct
values that you have validated by hand.

-- what is the mean, min and max of 10 cells each with value 5? It's 5! --

what is the correct standard deviation? sqrt((1/n) * SUM((x - mu)^2)) or
sqrt((1/(n - 1)) * SUM((x - mu)^2))?

r.univar uses sqrt((1/n) * SUM((x - mu)^2)), but sqrt((1/(n - 1)) * SUM((x -
mu)^2)) might be more appropriate because you could argue that raster
maps are always a sample. Apart from that, r.univar uses a one-pass
method to calculate stddev, which is debatable.
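
To make the difference concrete (a small example, not taken from r.univar
itself): for three cells with the values 2, 4 and 6 the two definitions give
different results:

import math

values = [2, 4, 6]
n = len(values)
mu = sum(values) / float(n)              # mean = 4
ss = sum((x - mu) ** 2 for x in values)  # sum of squared deviations = 8
print(math.sqrt(ss / n))        # population stddev, 1/n:      ~1.633
print(math.sqrt(ss / (n - 1)))  # sample stddev, 1/(n - 1):     2.0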

Markus M

2016-10-06 21:26 GMT+02:00 Markus Metz <markus.metz.giswork@gmail.com>:

On Tue, Oct 4, 2016 at 11:02 PM, Sören Gebbert
<soerengebbert@googlemail.com> wrote:
>
>
> 2016-10-04 22:22 GMT+02:00 Markus Metz <markus.metz.giswork@gmail.com>:
>>
>> Recently I fixed bugs in r.stream.order, related to stream length
>> calculations which are in turn used to determine stream orders. The
>> gunittest did not pick up 1) the bugs, 2) the bug fixes.

sorry for the confusion, r.stream.order does not have any testsuite,
only v.stream.order has.

>>
>> I agree, for tests during development, not for gunittests.
>>
>> From the examples I read, gunittests expect a specific output. If the
>> expected output (obtained with an assumed correct version of the
>> module) is wrong, the gunittest is bogus. gunittests are ok to make
>> sure the output does not change, but not ok to make sure the output is
>> correct. Two random examples are r.stream.order and r.univar.
>
>
> I don't understand your argument here, or I have a fundamental problem in
> understanding the testing topic.
>
> You have to implement a test that checks for correct output; this is the
> point of a test.

Exactly. During development, however, you need to run many more tests
until you are confident that the output is correct. Then you submit
the changes. My point is that modules need to be tested thoroughly
during development (which is not always the case), and a testsuite to
make sure that the output matches expectations is nice to have. In
most cases,

Implement all tests while developing a module as gunittests and you will
have a testsuite in the end. You have to implement the tests anyway, so why
not use gunittests from the beginning, as part of the development process?

If you implement a Python library, then use doctests to document and check
functions and classes while developing them. These doctests are part of the
documentation and excellent examples of how to use a specific function or
class. And by pure magic, you will have a testsuite in the end.
Have a look at PyGRASS: tons of doctests that are code examples and
validation tests at the same time.
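
For illustration only, a doctest in that spirit (the function below is
hypothetical and not part of PyGRASS; it just encodes the usual Strahler
confluence rule):

def strahler_confluence(order_a, order_b):
    """Return the Strahler order downstream of a confluence.

    Two streams of equal order produce the next higher order,
    otherwise the higher of the two orders is kept:

    >>> strahler_confluence(2, 2)
    3
    >>> strahler_confluence(3, 1)
    3
    """
    if order_a == order_b:
        return order_a + 1
    return max(order_a, order_b)

if __name__ == "__main__":
    import doctest
    doctest.testmod()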

<module_name> <args>
if [ $? -ne 0 ] ; then
  echo "ERROR: Module <module_name> failed"
  exit 1
fi

should do the job

Nope.

no offence ;-)

> You have to design a test scenario for which you know
> what the correct solution is, and then you test for that solution.
> What about r.univar? Create a test map with a specific number of cells
> with specific values and test whether r.univar is able to compute the correct
> values that you have validated by hand.
>
> -- what is the mean, min and max of 10 cells each with value 5? It's 5! --

what is the correct standard deviation? sqrt((1/n) * SUM((x - mu)^2)) or
sqrt((1/(n - 1)) * SUM((x - mu)^2))?

If you decide to use the first version, then implement tests for the first
version.
If you decide to use the second version, then ... .
If you decide to support both versions, then implement tests for both
versions.

r.univar uses sqrt((1/n) * SUM((x - mu)^2)), but sqrt((1/(n - 1)) * SUM((x -
mu)^2)) might be more appropriate because you could argue that raster
maps are always a sample. Apart from that, r.univar uses a one-pass
method to calculate stddev, which is debatable.

If you decide to implement a specific version of stddev, then write a test
for it.
Debating which version is more appropriate has nothing to do with the
actual software development process.
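
Such a test could look roughly like this (a sketch only; the map "cells246"
with the three values 2, 4 and 6 from the example above is hypothetical, and
the expected value assumes the 1/n definition):

from grass.gunittest.case import TestCase
from grass.gunittest.main import test

class TestStddevDefinition(TestCase):

    def test_population_stddev(self):
        # for the values 2, 4 and 6: sqrt(8/3) ~= 1.63299 with the 1/n definition
        self.assertRasterFitsUnivar(raster="cells246",
                                    reference=dict(stddev=1.63299),
                                    precision=1e-4)

if __name__ == "__main__":
    test()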

Best regards
Soeren
