[GRASS-dev] On post-processing a raster map's mapcalc related history string

A mapcalc expression, involving somewhat long(er) temporary map
names, stored in the output raster map's history, looks like:

Comments:
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 1
 && tmp.32493.17.recreation_opportunity == 1), 1,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 1
 && tmp.32493.17.recreation_opportunity == 2), 2,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 1
 && tmp.32493.17.recreation_opportunity == 3), 3,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 2
 && tmp.32493.17.recreation_opportunity == 1), 4,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 2
 && tmp.32493.17.recreation_opportunity == 2), 5,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 2
 && tmp.32493.17.recreation_opportunity == 3), 6,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 3
 && tmp.32493.17.recreation_opportunity == 1), 7,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 3
 && tmp.32493.17.recreation_opportunity == 2), 8,
 if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 3
 && tmp.32493.17.recreation_opportunity == 3), 9)))))))))

Any smart ideas on how to post-process this to get rid of all 'tmp.32493.16.tmp.32493.11.tmp.32493.6.' parts?

Standard string manipulation utilities?

Or, rather, build a "fake" expression with the original map names just
for the sake of having a cleaner history?

Thank you, Nikos

On 20/08/18 17:59, Nikos Alexandris wrote:

Any smart ideas on how to post-process this to get rid of all
'tmp.32493.16.tmp.32493.11.tmp.32493.6.' parts?

Standard string manipulation utilities?

I would guess so.
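
A minimal sketch of such string post-processing, assuming Python's `re` module and assuming every temporary prefix has the shape `tmp.<digits>.<digits>.` seen in the names above. The function name `strip_tmp_prefixes` is made up for illustration:

```python
import re

# Assumed prefix shape, taken from the names in the history string:
# "tmp.<digits>.<digits>." repeated one or more times before the map name.
TMP_PREFIX = re.compile(r"\b(?:tmp\.\d+\.\d+\.)+")


def strip_tmp_prefixes(expression):
    """Return the expression with all chained temporary prefixes removed."""
    return TMP_PREFIX.sub("", expression)


if __name__ == "__main__":
    history = (
        "if((tmp.32493.16.tmp.32493.11.tmp.32493.6.recreation_potential == 1\n"
        " && tmp.32493.17.recreation_opportunity == 1), 1)"
    )
    print(strip_tmp_prefixes(history))
```

The word boundary `\b` keeps the pattern from matching inside a longer name that merely ends in `tmp`. Prefixes of a different shape would need a different pattern.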

Or, rather, build a "fake" expression with the original map names just
for the sake of having a cleaner history?

Using r.support ?

Looking at your expression, I'm mostly wondering why you don't just use r.cross to get the same result ... :wink:

Moritz

* Moritz Lennert <mlennert@club.worldonline.be> [2018-08-20 18:08:18 +0200]:

On 20/08/18 17:59, Nikos Alexandris wrote:

Any smart ideas on how to post-process this to get rid of all
'tmp.32493.16.tmp.32493.11.tmp.32493.6.' parts?

Standard string manipulation utilities?

I would guess so.

Or, rather, build a "fake" expression with the original map names just
for the sake of having a cleaner history?

Using r.support ?

The above "history" string is actually passed using `r.support`.

Looking at your expression, I'm mostly wondering why you don't just use r.cross to get the same result ... :wink:

In my in-code comments, ever since I started writing this script:

    - Why not use `r.cross`?

1) I just "translated" someone else's code and had fun using `eval`.
2) Looking back some time, r.cross has had problems with zeroes and NULLs.

As usual, your attention to detail is much appreciated, Moritz.

Nikos

* Nikos Alexandris <nik@nikosalexandris.net> [2018-08-20 20:11:57 +0200]:

* Moritz Lennert <mlennert@club.worldonline.be> [2018-08-20 18:08:18 +0200]:

On 20/08/18 17:59, Nikos Alexandris wrote:

Any smart ideas on how to post-process this to get rid of all
'tmp.32493.16.tmp.32493.11.tmp.32493.6.' parts?

Standard string manipulation utilities?

I would guess so.

Or, rather, build a "fake" expression with the original map names just
for the sake of having a cleaner history?

Using r.support ?

The above "history" string is actually passed using `r.support`.

Obviously, reading the "Comment:" string above, that is a False
statement of mine (after hours of working).

Rather, the intention is to use `r.support`. It's just another "version"
of the script.
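
A hedged sketch of what passing the expression through `r.support` could look like from Python. It assumes a running GRASS session for the actual call and uses the `map` and `history` options of `r.support`. The helper names, and the choice to collapse the expression to a single line, are illustrative assumptions and not taken from the script discussed here:

```python
def support_history_args(raster, expression):
    """Build keyword arguments for an `r.support` call that appends
    `expression` to the history of `raster`."""
    # Collapse line breaks and repeated blanks so the expression is
    # passed as one line of text (an illustrative choice).
    one_line = " ".join(expression.split())
    return {"map": raster, "history": one_line}


def write_history(raster, expression):
    """Append the expression to the raster map's history via r.support."""
    # Imported here so the helper above also works outside a GRASS session.
    import grass.script as gs

    gs.run_command("r.support", **support_history_args(raster, expression))
```

Only `write_history` needs GRASS. The argument builder can be checked on its own.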

Nikos

* Nikos Alexandris <nik@nikosalexandris.net> [2018-08-20 20:11:57 +0200]:

* Moritz Lennert <mlennert@club.worldonline.be> [2018-08-20 18:08:18 +0200]:

On 20/08/18 17:59, Nikos Alexandris wrote:

Any smart ideas on how to post-process this to get rid of all
'tmp.32493.16.tmp.32493.11.tmp.32493.6.' parts?

Standard string manipulation utilities?

I would guess so.

Or, rather, build a "fake" expression with the original map names just
for the sake of having a cleaner history?

Using r.support ?

The above "history" string is actually passed using `r.support`.

Looking at your expression, I'm mostly wondering why you don't just use r.cross to get the same result ... :wink:

In my in-code comments, ever since I started writing this script:

  - Why not use `r.cross`?

1) I just "translated" someone else's code and had fun using `eval`.

Double that false statement. `eval` is actually not related here.
It comes from another part of the same script, which counts 3000+
lines. Fatigue...

2) Looking back some time, r.cross has had problems with zeroes and NULLs.

At least this one is True!, see also:

https://trac.osgeo.org/grass/ticket/3080
https://lists.osgeo.org/pipermail/grass-dev/2017-September/086090.html
and this thread, finally:
https://lists.osgeo.org/pipermail/grass-user/2018-February/077931.html

Nikos

On 21/08/18 01:50, Nikos Alexandris wrote:

* Nikos Alexandris <nik@nikosalexandris.net> [2018-08-20 20:11:57 +0200]:

2) Looking back some time, r.cross has had problems with zeroes and NULLs.

At least this one is True!, see also:

https://trac.osgeo.org/grass/ticket/3080
https://lists.osgeo.org/pipermail/grass-dev/2017-September/086090.html
and this thread, finally:
https://lists.osgeo.org/pipermail/grass-user/2018-February/077931.html

But this seems to have been solved now, hasn't it?

Moritz

* Moritz Lennert <mlennert@club.worldonline.be> [2018-08-21 09:53:21 +0200]:

But this seems to have been solved now, hasn't it?

Right, need to revisit this part of the script.

Nikos

* Nikos Alexandris <nik@nikosalexandris.net> [2018-08-21 12:55:49 +0200]:

* Moritz Lennert <mlennert@club.worldonline.be> [2018-08-21 09:53:21 +0200]:

But this seems to have been solved now, hasn't it?

Right, need to revisit this part of the script.

There might be a problem with using `r.cross` in scripting:

Reclassified maps derived from a "cross" map (in which case the "cross"
map is the "base" map) have to be removed before the "cross" map itself
can be removed. Keeping the result will cost an extra `g.copy` of the
`reclassed` map.

I.e., the last step of the following will fail:

a x b = crossmap
r.stats.zonal -r in=crossmap out=reclassed
g.remove raster name=crossmap

The `crossmap` serves only as an intermediate map. It is meant to be removed.
Removal of the `crossmap` will fail because the `reclassed` map
depends on it. Removing the `reclassed` map is not desired, since it is an
output.

An extra `g.copy` will solve this, i.e.

g.copy raster=reclassed,output

then force removal of both:

g.remove -f -b type=raster name=crossmap
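
The steps described above can be replayed from Python roughly as follows. This is only a sketch of the sequence as written in this message and does not check what each step leaves behind. The `run` argument stands for something like `grass.script.run_command`. The `cover` and `method` parameters are assumptions, since the `r.stats.zonal` line above does not spell them out, and the function name is made up:

```python
def cross_reclass_cleanup(run, first, second, cover, method, output,
                          crossmap="crossmap", reclassed="reclassed"):
    """Replay: cross two maps, derive a reclass map, copy it, remove the base."""
    # a x b = crossmap
    run("r.cross", input=f"{first},{second}", output=crossmap)
    # -r creates a reclass map that depends on crossmap (its base map)
    run("r.stats.zonal", flags="r", base=crossmap, cover=cover,
        method=method, output=reclassed)
    # the extra copy of the reclassed map to the requested output name
    run("g.copy", raster=f"{reclassed},{output}")
    # forced removal, including the -b flag as used above
    run("g.remove", flags="fb", type="raster", name=crossmap)
```

Inside a GRASS session this could be called as `cross_reclass_cleanup(gs.run_command, "a", "b", cover="c", method="average", output="out")`, where all map names are placeholders.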

This will still leave, on my system, a trace of the `reclassed` map:

g.list raster

will still list the `reclassed` map, although `r.info
reclassed` will fail.

Is there an elegant alternative?

Nikos

On Thu, Aug 23, 2018 at 5:40 AM Nikos Alexandris <nik@nikosalexandris.net> wrote:

Is there an elegant alternative?

As for your script design/coding approach:

You were concerned about nice history at one point. You can wrap it, not just in a plain script, but in a module. Then you can hide any particulars of the implementation and write proper history in the module. This avoids changing history which would lead to history not being representative of what actually happened (provenance is the keyword here).

Whether you actually publish the module or not is a different question. Your script/code is needed to reproduce the data anyway, since you want to change the history; additionally, it would be reasonable to use your script/code to actually reproduce the data. So the only difference is the number of files and the interface of the scripts (some custom solution versus a module).

Splitting your code would also make sense because you are saying it is quite long.

Vaclav

* Vaclav Petras <wenzeslaus@gmail.com> [2018-08-23 19:09:47 -0400]:

On Thu, Aug 23, 2018 at 5:40 AM Nikos Alexandris <nik@nikosalexandris.net>
wrote:

Is there an elegant alternative?

As for your script design/coding approach:

You were concerned about nice history at one point. You can wrap it, not
just in a plain script, but in a module.

I have three things in mind, in order:

provenance (as you name it below),
then re-usability/reproducibility,
and lastly legibility.

Then you can hide any particulars
of the implementation and write proper history in the module. This
avoids changing history which would lead to history not being
representative of what actually happened (provenance is the keyword
here).

Changing, or hiding, history is not wanted. As elsewhere, also in this
case, the names of intermediate maps of some processing get a prefix
using `script.core.tempfile()` [0], so that they can easily be removed
at exit.

They are removed unless the user requests them as output, which is an
option. If the user requests them as output, they are renamed to the
requested output name, so as not to be picked up by the
`atexit.register(cleanup)` call that does the clean-up.

[0] https://grass.osgeo.org/grass75/manuals/libpython/script.html?highlight=tempfile#script.core.tempfile
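
A minimal sketch of this naming and clean-up scheme, not the actual script. The prefix is built here from `os.getpid()` and a counter as a stand-in for whatever `script.core.tempfile()` returns. The class name and the `remove`/`rename` callbacks are hypothetical; in GRASS they would wrap `g.remove` and `g.rename`:

```python
import atexit
import os


class TemporaryMaps:
    """Track prefixed temporary map names and remove the leftovers at exit."""

    def __init__(self, remove, rename):
        self._remove = remove    # callable(name): delete a map
        self._rename = rename    # callable(old, new): rename a map
        self._names = []
        self._counter = 0

    def new(self, name):
        """Return a prefixed temporary name and remember it for clean-up."""
        self._counter += 1
        tmp = f"tmp.{os.getpid()}.{self._counter}.{name}"
        self._names.append(tmp)
        return tmp

    def keep(self, tmp, output):
        """Rename a temporary map to a requested output; skip it at clean-up."""
        self._rename(tmp, output)
        self._names.remove(tmp)

    def cleanup(self):
        """Remove every temporary map that was not kept."""
        while self._names:
            self._remove(self._names.pop())


def install(maps):
    """Register the clean-up so it runs when the interpreter exits."""
    atexit.register(maps.cleanup)
    return maps
```

Deriving every temporary name from the original name once, instead of from an already prefixed name, would avoid chained prefixes of the kind seen in the history string.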

Specifically,

if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 1
&& tmp.22599.19.opportunity == 1), 1,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 1
&& tmp.22599.19.opportunity == 2), 2,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 1
&& tmp.22599.19.opportunity == 3), 3,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 2
&& tmp.22599.19.opportunity == 1), 4,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 2
&& tmp.22599.19.opportunity == 2), 5,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 2
&& tmp.22599.19.opportunity == 3), 6,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 3
&& tmp.22599.19.opportunity == 1), 7,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 3
&& tmp.22599.19.opportunity == 2), 8,
if((tmp.22599.18.tmp.22599.11.tmp.22599.6.recreation_potential == 3
&& tmp.22599.19.opportunity == 3), 9)))))))))

will be best represented as:

if((potential == 1 && opportunity == 1), 1,
if((potential == 1 && opportunity == 2), 2,
if((potential == 1 && opportunity == 3), 3,
if((potential == 2 && opportunity == 1), 4,
if((potential == 2 && opportunity == 2), 5,
if((potential == 2 && opportunity == 3), 6,
if((potential == 3 && opportunity == 1), 7,
if((potential == 3 && opportunity == 2), 8,
if((potential == 3 && opportunity == 3), 9)))))))))

Making it more meaningful and useful is the point.
By the way, I am still not sure what the combinations produced by
`r.cross` will look like. The order of the end categories [1, 9]
matters. This is still to be tested; it is perhaps easy to solve with
an `r.category` call and a proper set of rules.
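
As an aside, the category numbering of the cleaner expression above can be generated rather than typed. The following is a sketch under the assumption that both maps hold the classes 1, 2 and 3; the function name is made up. For these classes the numbering equals `3 * (potential - 1) + opportunity`:

```python
def cross_expression(first, second, classes=(1, 2, 3)):
    """Return a nested mapcalc-style if() expression that numbers every
    (first, second) class combination row by row, starting at 1."""
    lines = []
    category = 0
    for a in classes:
        for b in classes:
            category += 1
            lines.append(
                f"if(({first} == {a} && {second} == {b}), {category},"
            )
    # The innermost if() has no else-branch: drop its trailing comma and
    # close every if() that was opened.
    lines[-1] = lines[-1].rstrip(",") + ")" * len(lines)
    return "\n".join(lines)


if __name__ == "__main__":
    print(cross_expression("potential", "opportunity"))
```

Writing the numbering down explicitly like this makes the order of the end categories visible, which is the part that would need checking against whatever `r.cross` produces.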

Some 'tmp' parts, in the names above, are prototyping leftovers,
and may be simply unnecessary. I have to correct/fix.

The question, anyhow still valid, is: what is the cost for "provenance"
if these prefixes are squashed? If the answer is "none" or "almost
none", then why not allow this squashing?

Are the design choices (program) made so far the right ones? Perhaps
they can be improved/corrected, and questions like the above can be
avoided altogether. I will look into this.

Further: what does it mean for "reproducibility"? Will it actually help
to understand better what is what, trace errors or support improving an
analysis? Perhaps, improve the algorithm/module itself?

As for legibility, I think it is as important as any other part of
programming. Spend less time in reading, more in thinking what can be
done with this tool.

Whether you actually publish the module or not is a different question.

I, and my client, will publish it. It's mentioned here
https://grasswiki.osgeo.org/wiki/GRASS_GIS_Community_Sprint_Autumn_2017#In_person
and then here
https://grasswiki.osgeo.org/wiki/GRASS_GIS_Community_Sprint_Bonn_2018#In_person.

It's not yet there. There have been modifications in the underlying
algorithm(s). And there are processes to deal with licensing and
distribution.

We are doing our best.

Your script/code is needed to reproduce the data anyway, since you
want to change the history; additionally, it would be reasonable to use
your script/code to actually reproduce the data. So the only difference
is the number of files and the interface of the scripts (some custom
solution versus a module).

Splitting your code would also make sense because you are saying it is
quite long.

It does, and it is planned. It's been a long prototyping of the implementation.

Vaclav

Thank you Vaclav,
as usual your messages help a lot, in many directions.

Grateful to be (feel) part of this community,
Nikos