[GRASS-dev] [GRASS GIS] #2986: r.mapcalc with same variable on LHS and RHS

#2986: r.mapcalc with same variable on LHS and RHS
-------------------------+-------------------------
Reporter: mankoff | Owner: grass-dev@…
     Type: defect | Status: new
Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Keywords: r.mapcalc | CPU: Unspecified
Platform: Unspecified |
-------------------------+-------------------------
It appears the following is valid syntax:

{{{
r.mapcalc "x = x + 1" --o
}}}

But a GRASS developer has said ([http://gis.stackexchange.com/a/109763/609
source]),

{{{
Note that you shouldn't use r.mapcalc like this (read from and write in
the same map):

r.mapcalc "old = old + 10" --overwrite
}}}

If this is the case, this seems like it **important** and should be
included in the documentation.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by marisn):

Such notice is already present in documentation (introduced in r66548 on
20 Oct 2015). See: https://grass.osgeo.org/grass71/manuals/r.mapcalc.html

{{{
... expression is any legal arithmetic expression involving existing
raster map layers (except result itself) ...
}}}

See also:
#2770;
https://lists.osgeo.org/pipermail/grass-dev/2011-July/054983.html

Either this bug should be closed or it should be turned into a wish - to
not allow same LHS variable to appear on RHS. As I'm not so much into
mapcalc language details, I can not see it as a problem, but Glynn will
shed more light on such option.

Mankoff, if you consider current documentation to be inadequate, provide a
patch with better wording.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:1&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------
Changes (by mankoff):

* Attachment "r.mapcalc.RHS_LHS.diff" added.

mention LHS = RHS issue in KNOWN ISSUES section of docs

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mankoff):

You're right it is in the docs. I think I checked old versions (6.4) and
did not see it, and it seems important enough it should be highlighted.
I've attached a patch adding this to the KNOWN ISSUES section of the docs.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:2&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------
Changes (by marisn):

* status: new => closed
* resolution: => fixed

Comment:

Documentation has been improved and changes have been backported to 7.0
(r68268 and r68269).

Thank you for reporting. Closing as solved.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:3&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by neteler):

Replying to [comment:3 marisn]:
> Documentation has been improved and changes have been backported to 7.0
(r68268 and r68269).

FWIW, also backported to G6.4.svn to r68277

> Thank you for reporting. Closing as solved.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:4&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mankoff):

Just to be clear, {{{LHS = RHS}}} **does** work. I've never actually had
it crash or return NULL or what appears to be invalid data when I've done
this. But perhaps it isn't guaranteed to work...

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:5&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by glynn):

Replying to [comment:5 mankoff]:
> Just to be clear, {{{LHS = RHS}}} **does** work. I've never actually had
it crash or return NULL or what appears to be invalid data when I've done
this. But perhaps it isn't guaranteed to work...
It works by coincidence rather than by design.

The data (cell/fcell) file and null bitmap are written to temporary files,
which are renamed into place when the map is closed. The metadata files
(cellhd, quant, histogram, categoriess, colour table, history, etc), are
stored in memory and written when the map is closed. So the output map
shouldn't change while it's being read.

A specific case which won't quite work is "r.mapcalc 'map = map'", because
that copies the categories, colour table and history from the input to the
output, but closing the output map will delete or replace those before the
copy occurs. If the logic is ever extended to cover more cases, those will
similarly fail.

Actually, I'm not sure that it will work on Windows, because at the point
that the data and null files are renamed, the original files are still
open for read (r.mapcalc never actually closes the input maps). That
should be easy to fix, though (call close_maps() from within execute()
before the loop which closes the output maps).

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:6&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by glynn):

Replying to [comment:6 glynn]:
> Replying to [comment:5 mankoff]:
> > Just to be clear, {{{LHS = RHS}}} **does** work.

Something else which doesn't quite work as expected is that once a name
appears on the LHS of an assignment, it's marked as being a variable
rather than a map. Which means that you can't use any modifiers on it
(whether the neighbourhood modifier, the category modifier or any colour
channel modifiers), as those can only be used on maps, not variables.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:7&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mankoff):

I'm still confused. The specific case of {{{r.mapcalc 'map=map'}}}
mentioned 2 comments above does seem to work for me in v7.0.3 (see code
example below). Also, the last comment appears to say that everything on
the LHS, even if a new unique LHSname, will be a variable and not a map
(I'm not even sure what "variables" are in GRASS). But the documentation
says that, //"result is the name of a raster map layer"//, and I thought
the whole point of {{{r.mapcalc}}} was to produce new raster maps.

Finally, there is no evidence, at least from {{{r.info}}}, of results
changing type. See for example results from,

{{{
grass70 -text ./nc_spm_08_grass7/PERMANENT
r.mapcalc "LHS = 1" --o
r.info LHS > rinfo.1
r.mapcalc "LHS = LHS" --o
r.info LHS > rinfo.2
r.mapcalc "LHS = LHS * 2" --o
r.info LHS > rinfo.3
diff3 rinfo.1 rinfo.2 rinfo.3
}}}

Finally, would it be possible to change this behavior, so that
{{{r.mapcalc 'map = map * 2'}}} is valid syntax? It seems like it would be
incredibly useful to support this standard programming syntax. If it isn't
a major code issue, I'll open a ticket with this feature request.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:8&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by glynn):

Replying to [comment:8 mankoff]:
> I'm still confused. The specific case of {{{r.mapcalc 'map=map'}}}
mentioned 2 comments above does seem to work for me in v7.0.3 (see code
example below).

If the map had categories or quantisation rules, they will be discarded. A
colour table should also be discarded but isn't because
Rast_remove_colors(mapname,"") doesn't work (it compares the mapset
literally with the current mapset without checking for the empty string).

> Also, the last comment appears to say that everything on the LHS, even
if a new unique LHSname, will be a variable and not a map (I'm not even
sure what "variables" are in GRASS).
The result of any expression can be assigned to a variable; that's what
"=" does. If the assignment is a top-level expression (i.e. not a sub-
expression), the value is written to an output map with the same name as
the variable. Assignments other than at the top level (e.g. within eval(),
which exists for this specific purpose) don't generate an output map. E.g.
the i.pansharpen script uses the following r.mapcalc expression:
{{{
eval(k = "$ms1" + "$ms2" + "$ms3")
"$outr" = 1.0 * "$ms3" * "$panmatch3" / k
"$outg" = 1.0 * "$ms2" * "$panmatch2" / k
"$outb" = 1.0 * "$ms1" * "$panmatch1" / k
}}}

Here, k is just a variable; because it occurs as a sub-expression of the
eval() call rather than as a top-level expression, it doesn't result in an
output map named "k". Its value is used in the subsequent expressions,
avoiding the need to repeat the calculation for each one.

Regardless of whether a variable is written to an output map, it's still a
variable (i.e. something which is calculated for each row), not an input
map (thus you cannot use any modifiers on it). If you use a name for both
a variable and an input map, the variable hides the input map for all
expressions following the assignment which creates the variable.

> Finally, would it be possible to change this behavior, so that
{{{r.mapcalc 'map = map * 2'}}} is valid syntax?
No. It's already syntactically valid, the issue is that modifying a map
while it is being read is undefined behaviour which only "sort-of" works
(e.g. it may not work at all if r.external and r.external.out are in use
such that the input and output maps refer to the same data file). That's
an issue with the core GRASS libraries, and not something which r.mapcalc
can realistically work around.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:9&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mankoff):

Would it be possible to support this by making {{{r.mapcalc}}} a wrapper,
which replaces the {{{LHS}}} with something unique (if it is detected on
the {{{RHS}}}, and then as a final step, setting the user {{{result}}} to
the LHS-unique variable?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:10&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by glynn):

Replying to [comment:10 mankoff]:
> Would it be possible to support this by making {{{r.mapcalc}}} a
wrapper, which replaces the {{{LHS}}} with something unique (if it is
detected on the {{{RHS}}}, and then as a final step, setting the user
{{{result}}} to the LHS-unique variable?
It can't realistically be implemented as a wrapper, and I'm not sure that
the behaviour would even be desirable. If change is warranted, it would be
better to just check for the situation where an output map will overwrite
any of the input maps, and report it as an error.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:11&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by wenzeslaus):

Replying to [comment:9 glynn]:
> The result of any expression can be assigned to a variable; that's what
"=" does.

Do I understand it correctly that re-assigning a variable inside sub-
expression like in the following example is OK?

{{{
r.mapcalc "a = eval(x = 1., x = x * 3, x / 2)"
}}}

In other words, can I use the above without any bad consequences?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:12&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by glynn):

Replying to [comment:12 wenzeslaus]:

> Do I understand it correctly that re-assigning a variable inside sub-
expression like in the following example is OK?

I think so.

The parser replaces variable names with a reference to the last binding
for that variable, and bindings are appended to the list after the binding
is parsed (so if the RHS contains a reference to the LHS, it will refer to
the previous binding rather than the one being parsed).

So in that example, you have two distinct variables named "x". The first
sub-expression assigns the first one, the second sub-expression reads the
first and assigns to the second, the third sub-expression reads the
second.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:13&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by wenzeslaus):

In [changeset:"69730" 69730]:
{{{
#!CommitTicketReference repository="" revision="69730"
r.mapcalc: mention the LHS/RHS limits also in Notes and provide workaround
(see #2986, r66548, r68268)
}}}

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:14&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by wenzeslaus):

Replying to [comment:11 glynn]:
> I'm not sure that the behaviour would even be desirable.

Although, in GRASS or GIS in general, we usually create new data under new
names, from general programming perspective `a = a + 1` makes sense. The
following, which is an equivalent of `a = a + 1`, actually works:

{{{
r.neighbors in=test1 out=test1 size=25 --o
}}}

I'm not sure if it is actually legal. We should make it clear. Is it
module dependent? What about `G_check_input_output_name()`?

Doing this with vectors gives an error (''Output vector map <firestations>
is used as input''):

{{{
v.extract in=test2 cats=10,11,12,13 output=test2 --o
}}}

I assume this applies to all vector modules (and that
`Vect_check_input_output_name()` should be used to ensure it).

> If change is warranted, it would be better to just check for the
situation where an output map will overwrite any of the input maps, and
report it as an error.

Warning for start and perhaps an (fatal) error in the future are
appropriate. We should either support it or forbid it. (I don't know how
people interpret "the behavior is undefined" - it is common in C/C++
programming, but I'm not so sure about GIS).

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:15&gt;
GRASS GIS <https://grass.osgeo.org>

#2986: r.mapcalc with same variable on LHS and RHS
--------------------------+-------------------------
  Reporter: mankoff | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone: 7.0.4
Component: Docs | Version: 7.0.3
Resolution: fixed | Keywords: r.mapcalc
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by glynn):

Replying to [comment:15 wenzeslaus]:

> Is it module dependent?

Yes. It relies upon the module having read anything relevant from the
input map before it writes the corresponding data for the output map.

The cell, fcell and null files are written using tempfile-and-rename, i.e.
write the data to a temporary file, rename it into place within
Rast_close(). Support files are written directly, so it's necessary to
order reads and writes correctly (and take into account the fact that
Rast_close() writes or removes some support files).

For GDAL-linked maps (r.external, r.external.out), the behaviour is up to
GDAL, but I wouldn't expect in-place modification to work.

> What about `G_check_input_output_name()`?

Currently, only 5 modules appear to use it: i.pca, r.carve, r.relief,
r.slope.aspect, r.param.scale. If a module has multiple inputs and
multiple outputs, the number of possible combinations which need to be
checked could be large.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2986#comment:16&gt;
GRASS GIS <https://grass.osgeo.org>