[GRASS-user] Global Hydrological modeling with r.watershed, r.stream.extract, r.stream.basins

Dear GRASS Team

I am running a global analysis in which I use “tiles” as computational units. Within each tile I run the following three commands:

r.watershed -b elevation=elv depression=dep accumulation=flow drainage=dir_rw flow=pixel_area memory=100000 --o --verbose
r.stream.extract elevation=elv accumulation=flow depression=dep threshold=0.05 direction=dir_rs stream_raster=stream memory=100000 --o --verbose
r.stream.basins -l stream_rast=stream direction=dir_rs basins=lbasin memory=100000 --o --verbose

Basins that are not completely within a tile (broken basins) have been removed (see the three tiles in Figs. 1, 2 and 3 below, which include only entire basins),
and I am now merging all the tiles, each of which contains only complete basins.
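
The filtering step described above could be sketched as follows. This is a minimal pure-Python illustration, not the workflow actually used here: it assumes a basin counts as broken when any of its cells lies on the outer row or column of the tile, and the function name `drop_edge_basins`, the toy grid, and the use of `None` as NoData are hypothetical choices made only for the example.

```python
def drop_edge_basins(basins, nodata=None):
    """Return a copy of the tile keeping only basins that never touch the tile edge.

    Assumption: a basin is "broken" if at least one of its cells lies on the
    first/last row or first/last column of the tile.
    """
    rows, cols = len(basins), len(basins[0])
    edge_ids = set()
    for r in range(rows):
        for c in range(cols):
            on_edge = r in (0, rows - 1) or c in (0, cols - 1)
            if on_edge and basins[r][c] != nodata:
                edge_ids.add(basins[r][c])
    # Replace every cell of an edge-touching basin with NoData.
    return [[nodata if v in edge_ids else v for v in row] for row in basins]


# Toy tile of basin IDs (hypothetical data): basins 1, 2 and 5 touch the edge.
tile = [
    [1, 1, 2, 2, 2],
    [1, 3, 3, 2, 2],
    [1, 3, 3, 4, 2],
    [1, 1, 5, 5, 5],
]
complete = drop_edge_basins(tile)
```

In this toy tile only basins 3 and 4 survive; every cell of basins 1, 2 and 5 becomes NoData.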

When I merge the tiles (Fig. 4), some basin borders do not match perfectly, and some areas have NoData (see Figs. 5 and 6) or carry the basin ID of the basin below (Fig. 7).
I noticed that this happens only when I merge tiles that have very large broken basins, which cannot be included in the tile due to RAM limitations.

My guess is that r.stream.basins needs the full extent of two adjacent basins to delineate their shared border without gaps and without a potentially arbitrary assignment.
Is there any part of the r.stream.basins code that I could check and possibly modify to avoid this problem?

Apart from this, all the RAM limitations and other problems have been solved. Soon we will have a global stream network and basin delineation performed 100% in GRASS!!!

Thank you
Best Regards
Giuseppe

Fig. 1. Left tile
Fig. 2. Center tile
Fig. 3. Right tile
Fig. 4. All tiles merged
Fig. 5. Gap (small white area)
Fig. 6. Gap (small white area)
Fig. 7. Basin border inconsistency among tiles

Giuseppe Amatulli, Ph.D.

Research scientist at
School of Forestry & Environmental Studies
Center for Research Computing
Yale University
New Haven, CT, USA
06511

Teaching: http://spatial-ecology.net
Work: https://environment.yale.edu/profile/giuseppe-amatulli/


Great job.

On Tue, May 5, 2020 at 11:56 AM Giuseppe Amatulli <giuseppe.amatulli@gmail.com> wrote:


grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Hi Giuseppe,

I've successfully run with 4.5 billion cells. How many cells do you have? I notice you are not using the "-m" flag, which tells it to use disk swap instead of keeping everything in memory. Maybe that would help?

  -k.

Thanks Ken,
I tried that option, but it created extremely large output files (e.g. flow accumulation); moreover, it does not support the -b option.
Markus, any suggestions?
Thanks
Giuseppe

On Tue, May 12, 2020 at 12:03 AM Giuseppe Amatulli
<giuseppe.amatulli@gmail.com> wrote:

> Thanks Ken,
> I tried that option, but it created extremely large output files (e.g. flow accumulation); moreover, it does not support the -b option.
> Markus, any suggestions?

Out of curiosity: which GRASS GIS version do you use on which operating system?
And which raster compression do you use, the default? (r.compress -p ... will tell you)

Best,
markusN

Hi Markus,

I use GRASS 7.6.0 on an HPC running
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 7.7 (Maipo)
Release: 7.7
Codename: Maipo

The compression is:

GRASS 7.6.0 (nc_spm_08_grass7):~ > r.compress -p map=flow
<flow> is compressed (method 5: ZSTD). Data type: DCELL
<flow> has a compressed NULL file
[Raster MASK present].

I chose to run r.watershed without -m mainly so that I can use the -b option, and so that each tile computation finishes in 6 hours, versus several days (probably weeks) for a full-globe (or per-continent) run in segmentation mode, which would make debugging and re-running quite hard.

Do you have any suggestions for the mismatching of the borders?

Best Giuseppe

Hi Giuseppe,

On Wed, May 13, 2020 at 2:16 AM Giuseppe Amatulli
<giuseppe.amatulli@gmail.com> wrote:

> Do you have any suggestions for the mismatching of the borders?

I have no idea, but it is recommended to update to GRASS GIS 7.8.3.

Best,
markusN

Dear all,
after several tests and a bit of reverse engineering, Longzhu and I have understood the reason for these small differences in the delineation of the basin polygons.

All the processes behind r.watershed (e.g. slope, flow accumulation, etc.) are computed in a 3x3 moving window. Therefore, if we change the extent of the study area, we change the starting point of the moving window, and of course the values of the flow accumulation change a bit. These changes in flow accumulation cause variations in the delineation of the basin borders of adjacent tiles.
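
The edge effect described above can be illustrated with a toy example. This is only a sketch under stated assumptions, not the algorithm r.watershed implements: the helper `lowest_neighbor` and the tiny elevation grid are hypothetical, and the 3x3 rule used here (pick the lowest cell in the window) merely stands in for any 3x3 moving-window computation whose window is truncated at the edge of the processed extent.

```python
def lowest_neighbor(elev, r, c):
    """Return (row, col) of the lowest cell in the 3x3 window around (r, c).

    Only cells that exist inside the grid are considered, so the window is
    truncated along the edges of whatever extent is passed in.
    """
    rows, cols = len(elev), len(elev[0])
    best = (r, c)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            if inside and elev[rr][cc] < elev[best[0]][best[1]]:
                best = (rr, cc)
    return best


# Hypothetical elevations for the full study area.
full = [
    [9, 8, 7, 6],
    [9, 7, 5, 3],
    [9, 8, 6, 4],
]
# Left tile: same data, but the last column is cut off.
tile = [row[:3] for row in full]

# Cell (1, 2) sits on the cut edge of the tile.
edge_in_full = lowest_neighbor(full, 1, 2)
edge_in_tile = lowest_neighbor(tile, 1, 2)

# Cell (1, 1) is interior in both grids and sees the same 3x3 window.
interior_in_full = lowest_neighbor(full, 1, 1)
interior_in_tile = lowest_neighbor(tile, 1, 1)
```

Here the cell on the cut edge points to (1, 3) in the full grid but finds no lower neighbor inside the tile, while the interior cell gets the same answer in both. Anything accumulated downstream of such an edge cell would then differ between the two extents.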

The reason is a bit obvious, but it did not come to mind immediately.

Now I’m trying to figure out a clever way to merge these flow accumulations that come from different tiles and that are slightly different.

Ciao
Giuseppe
