[GRASS-user] Recommendations for v.surf.rst parameters for huge region

Hello list,

I am trying to interpolate raster values for a single massive dataset that
represents dozens of multibeam bathymetry surveys over a few decades.

The region is pretty big: all of Eastern Canada; at 100m resolution there are
~ 464M cells.

The raster data has a wide range of completeness; in some areas, there is
near 100% coverage - these areas were surveyed with modern sonars and 100%
overlap between survey lines. In other areas, the data is very sparse, with
~1km or more between tracklines. These areas would represent surveys from the
70's to 90's using singlebeam echosounders.

Firstly, would v.surf.rst be the best module for a massive interpolation job
like this? If not, could you recommend what would be the optimal method?

If v.surf.rst is the right module to use, I was wondering if anyone could help
with what parameters to use for an area this size, at least as a starting
point. I have read the manual several times, but I still don't have a good
intuition for how parameters like npmin, segmax, dmin, dmax, smooth all work
together.

At the moment, I have a script written that accepts a user-supplied number of
random positions all over the input raster. For each random point, I obtain
the east and north coordinates with v.to.db, feed these to g.region, and grow
the region around the point in all four cardinal directions by some value like
10,000m to create an analysis window around each point. I create a polygon
vector of this region with v.in.region, and use this polygon to clip my
vectorized raster bathymetry with v.clip, and then do v.surf.rst on this
clipped vector.

I have no other way of knowing what v.surf.rst parameters to use other than
trial and error, so I have a 4-level nested 'for' loop written to basically
traverse through all permutations of the parameters within the ranges I
chosen. So for example, I am exploring all combinations of parameters within
these ranges:

Tension: 10, 20, 40, 80, 160
npmin: 100, 200, 400, 800, 1000
smooth: 1, 2, 4, 8, 16, 32

With 7 random points selected around the full region, this script will produce
a few hundred maps per point, and quite a long time to run. And even more time
to look at the results. I know this isn't the best approach.

I am looking for help to find some workable set of parameters to use for the
entire dataset from other users who have more experience using this module.

Thanks,

--
Eric Patton
Marine Geoscience Technologist
Geological Survey of Canada (Atlantic)
Natural Resources Canada
Bedford Institute of Oceanography
1 Challenger Drive, Dartmouth NS, Canada

Hi,

I don’t have an answer but couple notes I can now think of:

  • Variable density of points may be a problem, I sometimes had that issue with gaps in lidar when it creates visible segments. In that case you need to increase npmin, but that substantially increases processing time. The default npmin is 300, but that is typically unnecessary, npmin 150 is usually ok and runs faster. The segments can be mitigated as well by higher tension. I usually use tension between 20-40. I haven’t used smooth value larger than 5 I think. But again, your data are probably quite different.

  • v.surf.rst is parallelized, you need to compile GRASS with openMP to take advantage of that

  • Have you looked at r.fill.stats? I used it for lidar, first r.in.lidar and then fill the rest with r.fill.stats, possibly multiple passes to fill larger holes. It uses IDW, and it will be significantly faster, although v.surf.rst would probably give you better results in the end.

Best,
Anna

On Fri, Oct 16, 2020 at 3:05 PM Eric Patton <eric.r.patton@protonmail.com> wrote:

Hello list,

I am trying to interpolate raster values for a single massive dataset that
represents dozens of multibeam bathymetry surveys over a few decades.

The region is pretty big: all of Eastern Canada; at 100m resolution there are
~ 464M cells.

The raster data has a wide range of completeness; in some areas, there is
near 100% coverage - these areas were surveyed with modern sonars and 100%
overlap between survey lines. In other areas, the data is very sparse, with
~1km or more between tracklines. These areas would represent surveys from the
70’s to 90’s using singlebeam echosounders.

Firstly, would v.surf.rst be the best module for a massive interpolation job
like this? If not, could you recommend what would be the optimal method?

If v.surf.rst is the right module to use, I was wondering if anyone could help
with what parameters to use for an area this size, at least as a starting
point. I have read the manual several times, but I still don’t have a good
intuition for how parameters like npmin, segmax, dmin, dmax, smooth all work
together.

At the moment, I have a script written that accepts a user-supplied number of
random positions all over the input raster. For each random point, I obtain
the east and north coordinates with v.to.db, feed these to g.region, and grow
the region around the point in all four cardinal directions by some value like
10,000m to create an analysis window around each point. I create a polygon
vector of this region with v.in.region, and use this polygon to clip my
vectorized raster bathymetry with v.clip, and then do v.surf.rst on this
clipped vector.

I have no other way of knowing what v.surf.rst parameters to use other than
trial and error, so I have a 4-level nested ‘for’ loop written to basically
traverse through all permutations of the parameters within the ranges I
chosen. So for example, I am exploring all combinations of parameters within
these ranges:

Tension: 10, 20, 40, 80, 160
npmin: 100, 200, 400, 800, 1000
smooth: 1, 2, 4, 8, 16, 32

With 7 random points selected around the full region, this script will produce
a few hundred maps per point, and quite a long time to run. And even more time
to look at the results. I know this isn’t the best approach.

I am looking for help to find some workable set of parameters to use for the
entire dataset from other users who have more experience using this module.

Thanks,


Eric Patton
Marine Geoscience Technologist
Geological Survey of Canada (Atlantic)
Natural Resources Canada
Bedford Institute of Oceanography
1 Challenger Drive, Dartmouth NS, Canada


grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Just to add my 2¢ on one aspect: IMHO, unless you have some aesthetic or other reasons to do otherwise, smooth should reflect the error in your data. As it determines how far the interpolated surface can be from the observed values, the margin of error that you think your data has should be an adequate value.

I therefore wouldn’t really tune on smooth, but just set it to one reasonable value.

Moritz

Am 21. Oktober 2020 04:12:53 MESZ schrieb “Anna Petrášová” kratochanna@gmail.com:

Hi,

I don’t have an answer but couple notes I can now think of:

  • Variable density of points may be a problem, I sometimes had that issue with gaps in lidar when it creates visible segments. In that case you need to increase npmin, but that substantially increases processing time. The default npmin is 300, but that is typically unnecessary, npmin 150 is usually ok and runs faster. The segments can be mitigated as well by higher tension. I usually use tension between 20-40. I haven’t used smooth value larger than 5 I think. But again, your data are probably quite different.

  • v.surf.rst is parallelized, you need to compile GRASS with openMP to take advantage of that

  • Have you looked at r.fill.stats? I used it for lidar, first r.in.lidar and then fill the rest with r.fill.stats, possibly multiple passes to fill larger holes. It uses IDW, and it will be significantly faster, although v.surf.rst would probably give you better results in the end.

Best,
Anna

On Fri, Oct 16, 2020 at 3:05 PM Eric Patton <eric.r.patton@protonmail.com> wrote:

Hello list,

I am trying to interpolate raster values for a single massive dataset that
represents dozens of multibeam bathymetry surveys over a few decades.

The region is pretty big: all of Eastern Canada; at 100m resolution there are
~ 464M cells.

The raster data has a wide range of completeness; in some areas, there is
near 100% coverage - these areas were surveyed with modern sonars and 100%
overlap between survey lines. In other areas, the data is very sparse, with
~1km or more between tracklines. These areas would represent surveys from the
70’s to 90’s using singlebeam echosounders.

Firstly, would v.surf.rst be the best module for a massive interpolation job
like this? If not, could you recommend what would be the optimal method?

If v.surf.rst is the right module to use, I was wondering if anyone could help
with what parameters to use for an area this size, at least as a starting
point. I have read the manual several times, but I still don’t have a good
intuition for how parameters like npmin, segmax, dmin, dmax, smooth all work
together.

At the moment, I have a script written that accepts a user-supplied number of
random positions all over the input raster. For each random point, I obtain
the east and north coordinates with v.to.db, feed these to g.region, and grow
the region around the point in all four cardinal directions by some value like
10,000m to create an analysis window around each point. I create a polygon
vector of this region with v.in.region, and use this polygon to clip my
vectorized raster bathymetry with v.clip, and then do v.surf.rst on this
clipped vector.

I have no other way of knowing what v.surf.rst parameters to use other than
trial and error, so I have a 4-level nested ‘for’ loop written to basically
traverse through all permutations of the parameters within the ranges I
chosen. So for example, I am exploring all combinations of parameters within
these ranges:

Tension: 10, 20, 40, 80, 160
npmin: 100, 200, 400, 800, 1000
smooth: 1, 2, 4, 8, 16, 32

With 7 random points selected around the full region, this script will produce
a few hundred maps per point, and quite a long time to run. And even more time
to look at the results. I know this isn’t the best approach.

I am looking for help to find some workable set of parameters to use for the
entire dataset from other users who have more experience using this module.

Thanks,


Eric Patton
Marine Geoscience Technologist
Geological Survey of Canada (Atlantic)
Natural Resources Canada
Bedford Institute of Oceanography
1 Challenger Drive, Dartmouth NS, Canada


grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

On 20/10/20 10:12PM, Anna Petrášová wrote:

Hi,

I don't have an answer but couple notes I can now think of:

- Variable density of points may be a problem, I sometimes had that issue with
gaps in lidar when it creates visible segments. In that case you need to
increase npmin, but that substantially increases processing time. The default
npmin is 300, but that is typically unnecessary, npmin 150 is usually ok and
runs faster. The segments can be mitigated as well by higher tension. I usually
use tension between 20-40. I haven't used smooth value larger than 5 I think.
But again, your data are probably quite different.

Hi Anna, thanks for your comments. My experimentation agrees with the range of
values you mention here.

- v.surf.rst is parallelized, you need to compile GRASS with openMP to take
advantage of that

Yep, done.

- Have you looked at r.fill.stats? I used it for lidar, first r.in.lidar and
then fill the rest with r.fill.stats, possibly multiple passes to fill larger
holes. It uses IDW, and it will be *significantly* faster, although v.surf.rst
would probably give you better results in the end.

I have not tried this module yet, but I will. I am somewhat confused about which
raster fill/interpolation module to use in which circumstance, as there are so
many (there are 6 r.resamp.* modules, 2 r.fill* modules...plus r.neighbours,
r.mapcalc, r.surf.{nnbathy,rst,idw}, r.mblend, etc.)

I have been running r.mblend for 10 days now, with no sign or ending in sight.
It is stuck on 'Buffering areas' for the last two days with no progress percentage
written to the terminal, so I am going to have to kill it.
  
Cheers,

~ Eric.

On Mon, Oct 26, 2020 at 12:38 PM Eric Patton <eric.r.patton@protonmail.com> wrote:

On 20/10/20 10:12PM, Anna Petrášová wrote:

Hi,

I don’t have an answer but couple notes I can now think of:

  • Variable density of points may be a problem, I sometimes had that issue with
    gaps in lidar when it creates visible segments. In that case you need to
    increase npmin, but that substantially increases processing time. The default
    npmin is 300, but that is typically unnecessary, npmin 150 is usually ok and
    runs faster. The segments can be mitigated as well by higher tension. I usually
    use tension between 20-40. I haven’t used smooth value larger than 5 I think.
    But again, your data are probably quite different.

Hi Anna, thanks for your comments. My experimentation agrees with the range of
values you mention here.

  • v.surf.rst is parallelized, you need to compile GRASS with openMP to take
    advantage of that

Yep, done.

  • Have you looked at r.fill.stats? I used it for lidar, first r.in.lidar and
    then fill the rest with r.fill.stats, possibly multiple passes to fill larger
    holes. It uses IDW, and it will be significantly faster, although v.surf.rst
    would probably give you better results in the end.

I have not tried this module yet, but I will. I am somewhat confused about which
raster fill/interpolation module to use in which circumstance, as there are so
many (there are 6 r.resamp.* modules, 2 r.fill* modules…plus r.neighbours,
r.mapcalc, r.surf.{nnbathy,rst,idw}, r.mblend, etc.)

Unfortunately, you need to read the manual, I agree it can be confusing. I have been mostly using v.surf.rst for interpolation and r.fill.stats for gap filling.

I have been running r.mblend for 10 days now, with no sign or ending in sight.
It is stuck on ‘Buffering areas’ for the last two days with no progress percentage
written to the terminal, so I am going to have to kill it.

I think r.mblend is for seamless combining higher and lower-resolution data, to e.g. update lidar with UAS, but your surfaces need to be already interpolated. Addon r.patch.smooth does something similar, but it’s raster based, so it might run faster, not sure if it is relevant for your use case though.

Anna

Cheers,

~ Eric.