[GRASS-user] Overlapping pixels among DEM tiles to compute the LS factor for RUSLE

A billion-pixel scaled DEM is the main input to compute the slope length
and steepness (LS) factor for RUSLE (`r.watershed`), only.

Tiling this DEM into tiles of 5K^2 pixels (`r.tile`) appears to be a reasonable
approach to parallelise this process.

I have no idea if an overlap among the different tiles is required (to
avoid border effects!?) and, if yes, how many pixels it should be.

Are there practical guidelines? Do I need to study the LS factor algorithm?
Is it something that analysts with experience in the domain can figure
out empirically?

Thank you, Nikos
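
For reference, a minimal sketch of what "tiles with an overlap" means in terms of pixel windows. It is plain Python and independent of GRASS; the names `Window` and `tile_windows`, the raster size and the 50-pixel overlap are hypothetical, chosen only for illustration. `r.tile` has an `overlap` option that serves the same purpose. Note that no fixed overlap is guaranteed to be sufficient for flow routing, because the upslope area draining to a cell can extend well beyond any tile border.

```python
from typing import Iterator, NamedTuple, Tuple


class Window(NamedTuple):
    """Half-open pixel window: rows [row0, row1), columns [col0, col1)."""
    row0: int
    row1: int
    col0: int
    col1: int


def tile_windows(rows: int, cols: int, tile: int,
                 overlap: int) -> Iterator[Tuple[Window, Window]]:
    """Yield (core, padded) windows covering a rows x cols raster.

    Core windows partition the raster without overlap. Padded windows
    extend each core by `overlap` pixels on every side, clipped to the
    raster extent. Process the padded window, keep only the core.
    """
    if tile <= 0 or overlap < 0:
        raise ValueError("tile must be positive and overlap non-negative")
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            core = Window(r, min(r + tile, rows), c, min(c + tile, cols))
            padded = Window(max(core.row0 - overlap, 0),
                            min(core.row1 + overlap, rows),
                            max(core.col0 - overlap, 0),
                            min(core.col1 + overlap, cols))
            yield core, padded


# Hypothetical raster of 12000 x 9000 pixels, 5000-pixel tiles, 50-pixel overlap.
windows = list(tile_windows(rows=12000, cols=9000, tile=5000, overlap=50))
print(len(windows), "tiles")
print("first padded window:", windows[0][1])
```

Keeping the core and padded windows separate makes it possible to discard the overlap after processing, so that the mosaicked result has no duplicated pixels.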

Hi Nikos,

On 2019-01-25 at 07:18 -0800, Nikos Alexandris <nik@nikosalexandris.net> wrote...

>A billion-pixel scaled DEM is the main input to compute the slope length
>and steepness (LS) factor for RUSLE (`r.watershed`), only.
>
>Tiling this DEM in tiles of 5K^2 pixels (`r.tile`), appears to be a
>reasonable approach to parallelise this process.

Do you need to parallelise it? I just ran r.watershed on a 4.5 billion-pixel DEM without a problem on my laptop. I think it took ~6 hours, but I'm not sure; it might have been closer to 12. I have 32 GB of RAM and gave half to the process, but told it to use disk instead of memory via the -m flag:

r.watershed -s -m elevation=phi threshold=1000000 drainage=dir memory=16384 --v --o

  -k.
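
The command above writes only the drainage direction map. For the LS factor this thread is about, `r.watershed` also provides the `length_slope` output (and `slope_steepness` for the S factor alone). Below is a minimal sketch that assembles such a call as an argument list. The helper `watershed_ls_command` and the map names `dem` and `ls_factor` are hypothetical; the flags and the `threshold` and `memory` values simply mirror the command above and are not a recommendation.

```python
import shlex
from typing import List


def watershed_ls_command(elevation: str, length_slope: str, threshold: int,
                         memory_mb: int, use_disk: bool = True,
                         single_flow: bool = True) -> List[str]:
    """Build an r.watershed argument list that requests the LS factor map."""
    cmd = ["r.watershed"]
    if single_flow:
        cmd.append("-s")  # single flow direction (D8) instead of the default MFD
    if use_disk:
        cmd.append("-m")  # disk swap mode; `memory` caps the RAM used, in MB
    cmd += [
        f"elevation={elevation}",
        f"threshold={threshold}",
        f"length_slope={length_slope}",
        f"memory={memory_mb}",
        "--verbose",
        "--overwrite",
    ]
    return cmd


# Hypothetical map names; the numeric values are copied from the command above.
cmd = watershed_ls_command("dem", "ls_factor", threshold=1000000, memory_mb=16384)
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)` from within an active GRASS session.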

* Ken Mankoff <mankoff@gmail.com> [2019-01-25 11:50:29 -0800]:

>Hi Nikos,
>
>On 2019-01-25 at 07:18 -0800, Nikos Alexandris <nik@nikosalexandris.net> wrote...
>>A billion-pixel scaled DEM is the main input to compute the slope length
>>and steepness (LS) factor for RUSLE (`r.watershed`), only.
>>
>>Tiling this DEM in tiles of 5K^2 pixels (`r.tile`), appears to be a
>>reasonable approach to parallelise this process.

As I learn more, it seems that it's completely wrong to tile this in
squares and that catchment/basin borders need to be essentially
respected.

>Do you need to parallelise it? I just ran r.watershed on a 4.5 billion-pixel DEM w/o a problem on my laptop and I think it took ~6 hours, but I'm not sure. It might have been closer to 12. I have 32 GB of ram and gave half to the process, but told it to use disk instead of memory via the -m flag:
>
>r.watershed -s -m elevation=phi threshold=1000000 drainage=dir memory=16384 --v --o

I just launched an all-in-one go, ~9e8 pixels finally. RAM is not an
issue here. Let's see the timing.

Thank you dear Ken, Nikos

* Nikos Alexandris <nik@nikosalexandris.net> [2019-01-26 00:13:38 +0100]:

>I just launched an all-in-one go, ~9e8 pixels finally. RAM is not an
>issue here. Let's see the timing.

So, it worked out in 40 minutes!

If I recall correctly, my initial attempt on my laptop (with 8 GB of RAM
and without checking further options to use other available space)
wouldn't start the process. And, in my rush, I read the pixel count as
something like 8960000000 instead of 896000000. I am happy to have it
done in 40 minutes now, and even about misreading the number of pixels,
since I have now read a few more things on the subject. "We" also have
this related tutorial:
https://ncsu-geoforall-lab.github.io/erosion-modeling-tutorial/index.html

Nikos
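
A factor-of-ten misreading like the one above is easy to make with long digit strings. As a small aid, the sketch below prints a pixel count with digit grouping, in scientific notation, and as the number of 5000 x 5000 tiles it would correspond to. It is plain Python; the helper name `describe_pixel_count` is hypothetical.

```python
import math


def describe_pixel_count(pixels: int, tile_side: int = 5000) -> str:
    """Format a pixel count so that its order of magnitude is easy to read."""
    # Number of tile_side x tile_side tiles needed to hold that many pixels.
    tiles = math.ceil(pixels / tile_side ** 2)
    return (f"{pixels:,} pixels ({pixels:.1e}), "
            f"about {tiles} tiles of {tile_side} x {tile_side}")


print(describe_pixel_count(896_000_000))
print(describe_pixel_count(8_960_000_000))
```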

Hi all,

I would like to step into the conversation, even if it was kind of
"solved" already.

Of course, using watersheds instead of rectangular tiles is the way to go
for this kind of problem, even if rectangular tiles with a reasonable
overlap would probably do the job just as well. Thus the answer to "do I
need to know the algorithm" is definitely yes.

In this case, you were able to run the code in a reasonable time, so why
bother with parallelization? I actually believe that this is just about
the right time for you to experiment with parallelization: if you got to
the point of asking yourself whether you need a parallel code, it is
because the waiting time is already too long for you, even if the code
can actually be run in one shot. It is very likely that you will need to
run your code with many different input parameters, and running many
instances of the model in parallel, each with different parameters, may
not be good enough, since your next choice of parameters may be guided by
the previous run.

Moreover, it is also very likely that your next project will require
simulating a much larger area, and having a parallel code ready, instead
of figuring out on the spot how to do that, is key to getting results
when you need them.

Massi

On Sat, 26 Jan 2019 at 08:08, Nikos Alexandris
<nik@nikosalexandris.net> wrote:

>So, it worked out in 40 minutes!
>
>If I recall correctly, my initial attempt on my laptop (with 8 GB of RAM
>and without checking further options to use other available space)
>wouldn't start the process. And, in my rush, I read the pixel count as
>something like 8960000000 instead of 896000000. I am happy to have it
>done in 40 minutes now, and even about misreading the number of pixels,
>since I have now read a few more things on the subject. "We" also have
>this related tutorial:
>https://ncsu-geoforall-lab.github.io/erosion-modeling-tutorial/index.html
>
>Nikos
_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user