[GRASS-dev] [GRASS GIS] #2282: r.external.out: keep GRASS maps and external files in sync

#2282: r.external.out: keep GRASS maps and external files in sync
-------------------------+--------------------------------------------------
Reporter: sbl | Owner: grass-dev@…
     Type: enhancement | Status: new
Priority: normal | Milestone: 7.1.0
Component: Default | Version: unspecified
Keywords: | Platform: Unspecified
      Cpu: Unspecified |
-------------------------+--------------------------------------------------
At the moment, GRASS maps and external files produced using r.external.out
are not fully kept in sync.

This applies both when maps are renamed and when they are removed. The
names of external files are not changed by g.rename, and the files are not
deleted by g.remove or g.mremove.

If possible, it would be nice if GRASS maps and external files could be
kept in sync, both to keep the relations between them clear and to avoid
accumulating unused data (map files a user might think were deleted when
running g.remove). This would be nice at least for maps produced with
r.external.out (though maybe not for maps linked with r.external...).

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2282>
GRASS GIS <http://grass.osgeo.org>

Comment(by glynn):

Replying to [ticket:2282 sbl]:

> At the moment, GRASS maps and external files produced using
> r.external.out are not fully kept in sync.
>
> This applies both when maps are renamed or removed. Names of external
> files are not changed by g.rename and files are not deleted by g.remove or
> g.mremove.

This behaviour is intentional; well, maybe "not entirely unintentional" is
more accurate.

If that's to change, it would need to be e.g. an option set by
r.external.out (for the default) and r.external (to allow it to be changed
for individual maps).

Certainly, it needs to be possible to delete the "link" which allows GRASS
to treat the file as a raster map without deleting the file itself. And
given that the existing g.remove behaviour is to remove the link but leave
the actual data, I'm not sure that should change (i.e. removing both the
link and the file needs a new flag, so no-one loses data unexpectedly).

Essentially, GDAL-linked maps are somewhat like a symlink to a file. Using
"rm" on a symlink deletes the symlink, not the file it refers to.
Similarly for "mv".
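
The symlink analogy can be tried out directly. The following is only a
minimal sketch in Python, not GRASS code: the file "map_a.tif" and the link
names "map_a" and "map_b" are made up for illustration, and an ordinary
symlink stands in for the GRASS map "link".

```python
import os
import tempfile


def demo_link_semantics():
    """Rename and remove a symlink, then report what happened to its target."""
    with tempfile.TemporaryDirectory() as tmp:
        data = os.path.join(tmp, "map_a.tif")  # stands in for the external file
        link = os.path.join(tmp, "map_a")      # stands in for the GRASS map link
        with open(data, "w") as f:
            f.write("raster data")
        os.symlink(data, link)

        # Like "mv" on the symlink: only the link gets a new name.
        renamed = os.path.join(tmp, "map_b")
        os.rename(link, renamed)
        after_rename = (
            os.path.exists(data),          # target file still there
            os.path.islink(renamed),       # link exists under the new name
            os.readlink(renamed) == data,  # and still points at the old file name
        )

        # Like "rm" on the symlink: the link goes away, the file stays.
        os.remove(renamed)
        after_remove = (os.path.exists(data), os.path.lexists(renamed))
        return after_rename, after_remove
```

After the rename the link still refers to "map_a.tif", which is the same
situation as a renamed GRASS map whose external file keeps its old name.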

The GRASS tools (g.remove, g.rename, g.copy etc) need to be able to
manipulate the links. They don't necessarily have to be able to manipulate
the files; the OS' own file-management tools can handle that. Although, in
the case of renaming and copying, we should provide a convenient way to
update/copy the links; running r.external requires providing all of the
parameters again.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2282#comment:1>
GRASS GIS <http://grass.osgeo.org>

Comment(by sbl):

Replying to [comment:1 glynn]:
> If that's to change, it would need to be e.g. an option set by
> r.external.out (for the default) and r.external (to allow it to be changed
> for individual maps).

Granted, I have no idea how much work it would be to implement this, but
it sounds like a very nice solution to me.

> And given that the existing g.remove behaviour is to remove the link but
> leave the actual data, I'm not sure that should change (i.e. removing both
> the link and the file needs a new flag, so no-one loses data
> unexpectedly).

I see that this is a bit tricky, as it might be necessary to distinguish
between two cases: maps produced by other people or software and merely
linked into GRASS using r.external (which should not be meddled with), and
maps/files produced from within GRASS. Furthermore, users may choose map
names that differ from the file names when they link files into GRASS
(r.external). However, an optional flag for keeping maps and files in sync
would be nice, if that is feasible.

Two other things to consider in this context are (maybe):

1) Let's say a user sets the external output path to ~/output for GeoTIFFs
with the extension ".tif". Then the user produces a map named "map_a".
Later, "map_a" is renamed to "map_b". The file "~/output/map_a.tif" will
then be linked to "map_b", right? What happens in that case if another map
named "map_a" is produced? The map name no longer exists in GRASS, but the
file "~/output/map_a.tif" still exists on the file system. Is there a
check that "~/output/map_a.tif" is not overwritten in such a case?

2) Let's assume one wants to use maps produced in GRASS with
r.external.out in other software. During map production the user finds out
that a different name for the map would have been more suitable. After
running g.rename, the new, more appropriate name is used in GRASS, while
other software still has to work with the old file name. The way to sync
that manually would be to remove the map (link) instead of renaming it,
rename the underlying file (mv), and then link that file into GRASS again
(r.external). What happens to the map history in such cases?
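
The manual workaround in 2) can be written down as a command sequence. The
Python sketch below only builds the commands, it does not run them. The
helper name manual_resync_commands is hypothetical, the g.remove options
follow the GRASS 7 release syntax (older versions use a different form),
and the path and extension are just the example values from 1).

```python
import os


def manual_resync_commands(old_name, new_name, directory, extension):
    """Return the three commands of the manual rename workaround (not executed)."""
    old_file = os.path.join(directory, old_name + extension)
    new_file = os.path.join(directory, new_name + extension)
    return [
        # 1. Remove the GRASS map (the link); the external file is left alone.
        #    Assumption: GRASS 7 release syntax for g.remove.
        ["g.remove", "-f", "type=raster", "name=" + old_name],
        # 2. Rename the underlying file with the OS tools.
        ["mv", old_file, new_file],
        # 3. Link the renamed file into GRASS under the new map name.
        ["r.external", "input=" + new_file, "output=" + new_name],
    ]


if __name__ == "__main__":
    for cmd in manual_resync_commands("map_a", "map_b", "/home/user/output", ".tif"):
        print(" ".join(cmd))
```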

In general, I think having an option for keeping maps and files in sync
automatically would make it much easier to maintain a tidy and clean data
storage with external files...

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2282#comment:2>
GRASS GIS <http://grass.osgeo.org>

Comment(by glynn):

Replying to [comment:2 sbl]:

> 1) Let's say a user sets the external output path to ~/output for
> GeoTiffs with extension ".tif". Then the user produces a map with the name
> "map_a". Later this "map_a" is renamed to "map_b". The file
> "~/output/map_a.tif" will then be linked to "map_b", right? What happens
> in such a case, if another map with the name "map_a" is produced? The map
> name does not exist anymore in GRASS, but the file "~/output/map_a.tif"
> does, on the file system. Is it checked that "~/output/map_a.tif" is not
> overwritten in such a case?

I don't think so. Not unless GDALCreate() performs such a check.

It's awkward for GRASS to do it, because the "filename" (i.e. dataset
name) might not actually correspond to a file (although I'm not sure that
the code correctly handles non-file datasets at present).
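
For the simple case where the dataset name really is a file path, the check
being asked about could look roughly like the Python sketch below. It is
not what GRASS or GDAL does. Both function names are hypothetical, and the
sketch assumes the output file name is just the directory, the map name and
the extension joined together. It says nothing about non-file datasets.

```python
import os


def external_target(directory, map_name, extension):
    """Build the file path an output map would be written to (assumed layout)."""
    return os.path.join(os.path.expanduser(directory), map_name + extension)


def would_overwrite(directory, map_name, extension):
    """True if a file already exists where the new map would be written."""
    return os.path.exists(external_target(directory, map_name, extension))
```

In the "map_a"/"map_b" example, would_overwrite("~/output", "map_a",
".tif") would return True after the rename, even though GRASS no longer has
a map called "map_a".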

> 2) Let's assume one wants to use maps produced in GRASS with
> r.external.out in other software. During map production the user finds
> out, that a different name for the map would have been more suitable.
> After running g.rename the new, more appropriate name is used in GRASS
> while other software will still have to work with the old filename. The
> way to sync that manually would be to remove the map (link) (instead of
> renaming it), rename the underlying file (mv), and then link that file
> again into GRASS (r.external). What happens to map history in such cases?

Removing the GRASS map (with g.remove) will remove any metadata which
isn't part of the underlying GDAL dataset (e.g. image file).

Currently, the only metadata GRASS stores in the file is the region
(bounds and resolution). The code is there to store the projection (SRS),
but that information isn't available at present (we'd need to move
GPJ_grass_to_wkt() etc from libgproj into libgis, which would make GDAL a
direct dependency of libgis).

> In general, I think having an option for keeping maps and files in sync
> automatically would make it much easier to maintain a tidy and clean
> data storage with external files...

Probably. The main issue is whether it's feasible to modify the existing
management tools (g.rename etc) to handle this, or whether we need
specific tools related to GDAL-linked (r.external) maps. I suspect that
most of these issues would also apply to OGR-linked (v.external) vector
maps.

BTW, there would also be the issue of how to handle filenames which aren't
valid map names (as per G_legal_filename()), e.g. those containing spaces.
Or map names which aren't valid filenames (GRASS allows :?|<> in map
names, but Windows doesn't allow those characters in filenames).
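
To make the mismatch concrete, here is a small Python sketch of the two
checks. It does not reimplement G_legal_filename(). The helper names are
hypothetical, the reserved-character set is the usual list for Windows file
names, and the second check only covers the whitespace case mentioned
above.

```python
# Characters that Windows does not allow in file names.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def windows_unsafe_chars(map_name):
    """Return the characters in a map name that Windows would reject in a file name."""
    return sorted(set(map_name) & WINDOWS_RESERVED)


def has_whitespace(file_stem):
    """True if a file name (without extension) contains spaces or other whitespace."""
    return any(ch.isspace() for ch in file_stem)
```

A sync option would need a rule for both directions, e.g. refusing the
operation or substituting characters, whenever either check fires.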

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2282#comment:3>
GRASS GIS <http://grass.osgeo.org>