[GRASS-user] integration of sample.c algorithms into r.resample

Hi,

I suppose that this is more for the developers...

I was thinking about integrating the 3 re-sampling algorithms (defined in
lib/gis/sample.c) into the current r.resample command. It is my understanding
that r.resample will currently only use the nearest neighbor algorithm.

Would anyone else be interested in the addition of bilinear and cubic
convolution resampling to r.resample?

PS: all that this will require is a bit of re-tooling in r.resample/main.c
such that it calls G_get_raster_sample() at each cell in the input map.

PPS: thanks to Markus for moving these algorithms out of v.sample and into a
library-style format.

Cheers,

--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341

On Wednesday 09 August 2006 11:47, Dylan Beaudette wrote:


A quick follow-up:

In the source for r.resample, it looks like the resampling is done in this
loop:

for (row = 0; row < nrows; row++)
  {
    if (verbose)
      G_percent (row, nrows, 2);
    if (G_get_raster_row (infd, rast, row, data_type) < 0)
      exit(1);
    if (G_put_raster_row (outfd, rast, out_type) < 0)
      exit(1);
    G_mark_raster_cats (rast, ncols, &cats, data_type);
  }

...via the NN-style resampling associated with converting between regions of
differing resolution (?)

Is this a correct interpretation?

thanks,


Dylan Beaudette wrote:

> Is this a correct interpretation?

Yes. G_get_raster_row() etc automatically resample the raster data
(using nearest-neighbour) to match the current region resolution,
while G_put_raster_row() creates a map which matches the current
region.

If you want any other conversion, you have to keep switching regions,
so that input maps are read at their native resolution, while output
maps are written at the desired resolution.

For this reason, you may want to keep your new resampling program
separate from r.resample, as it will require a fundamentally different
structure.

--
Glynn Clements <glynn@gclements.plus.com>

On Wednesday 09 August 2006 12:42, Glynn Clements wrote:


> Yes. G_get_raster_row() etc automatically resample the raster data
> (using nearest-neighbour) to match the current region resolution,
> while G_put_raster_row() creates a map which matches the current
> region.

Thanks for the quick reply Glynn.

> If you want any other conversion, you have to keep switching regions,
> so that input maps are read at their native resolution, while output
> maps are written at the desired resolution.

Yikes. That does complicate things quite a bit. I wonder if G_get_raster_row()
could be modified such that it can perform these three resampling strategies
as well, i.e. so that if one wanted a different resampling approach in any
function that uses G_get_raster_row(), it would be possible to specify
bilinear or cubic convolution... although this might be a monumental task.

> For this reason, you may want to keep your new resampling program
> separate from r.resample, as it will require a fundamentally different
> structure.

Sure. Perhaps a look at r.bilinear would be helpful. What I am envisioning is
something like r.resample / r.bilinear that would integrate the three
methods, as there doesn't appear to be an r.resamp.cubic solution.

Cheers,


On Wednesday 09 August 2006 13:12, Dylan Beaudette wrote:


replying to myself again...

Ok, it looks like it won't be _too_ difficult to get the generalized sampling
routines into a derivative of r.bilinear!

Quickly hacking the r.bilinear source to include support for the
above-mentioned routines:

/* old method */
new = (1.0 - t) * (1.0 - u) * c1 +
      t * (1.0 - u) * c2 +
      t * u * c4 +
      u * (1.0 - t) * c3;

outbuf[col] = (new + .5);

/* generalized sampling */
new_sample = G_get_raster_sample(infile, &w, NULL, north, east, 0, BILINEAR);

/* new method */
outbuf[col] = new_sample;

However, this does not quite work, as the _new_ method ignores issues
associated with edge effects in moving-window calculations. For example,

u and t are set as follows:

/* on edges? */
if (north >= (mapw.north - mapw.ns_res / 2.)) {
    maprow1 = 0;
    maprow2 = 1;
    u = 0.0;
}
else if (north <= (mapw.south + mapw.ns_res / 2.)) {
    maprow2 = mapw.rows - 1;
    maprow1 = maprow2 - 1;
    u = 1.0;
}
... else ...
    u = (mapnorth - north) / mapw.ns_res;

variable t is set in a similar fashion.

Perhaps a bit more work with this approach will yield a workable solution...

see attached figures for a graphical description of the edge effects
introduced:

1. comparison 1: note the main point cloud and 1:1 line, where the two methods
are quite close; also note the second cloud, suggesting edge effects

2. comparison 2: simple display of residuals based on:
lm(sample.c::bilinear ~ r.bilinear)

Dylan


(attachments)

sample_comparison_1.png
sample_comparison_2.png

Dylan Beaudette wrote:

> Yikes. That does complicate things quite a bit. I wonder if G_get_raster_row()
> could be modified such that it can perform these three resampling strategies
> as well, i.e. so that if one wanted a different resampling approach in any
> function that uses G_get_raster_row(), it would be possible to specify
> bilinear or cubic convolution... although this might be a monumental task.

I don't know about "monumental", but it would certainly complicate
matters. Apart from anything else, interpolation requires keeping
multiple rows in memory.

Aside from implementation issues, there are semantic issues, e.g. how
you define the interpolation algorithm near the boundaries of the map
or if adjoining cells are null.

Also, interpolation is only meaningful for scalar quantities, not for
discrete categories, and there's no automatic way to distinguish the
two (at least for CELL maps; floating-point maps should normally
represent scalar quantities, although they could be category maps
which have been processed by a module which uses DCELL throughout for
convenience).


On Wednesday 09 August 2006 16:51, Glynn Clements wrote:


Good points. G_get_raster_row() should probably be left alone then.

In the meantime, I was hoping to put together a general-purpose resampling
module, mainly for access to a cubic convolution interpolation.

In modifying r.bilinear, I have found that:

1. the output is not quite correct when the region is not aligned to the input
raster, including edge effects
2. the original r.bilinear algorithm is an order of magnitude faster than the
algorithm in sample.c

Touching on issue #1: the output from sample.c, as implemented in this
modified version of r.bilinear, produces results that are very close to the
original r.bilinear when the region is aligned to the input raster. I have
attached some additional figures demonstrating this case.

Perhaps then, a module based on a generic raster module framework which calls
sample.c would do the trick. However, I am not quite sure how to account for
edge effects, or a region that is not aligned with the input raster-- the
code in r.bilinear is a bit too terse for me to grasp.

Any thoughts / insight would be very helpful.

Cheers,

Dylan


(attachments)

sample_comparison_3.png
sample_comparison_4.png

Dylan Beaudette wrote:

> Good points. G_get_raster_row() should probably be left alone then.
>
> In the meantime, I was hoping to put together a general-purpose resampling
> module, mainly for access to a cubic convolution interpolation.
>
> In modifying r.bilinear, I have found that:
>
> 1. the output is not quite correct when the region is not aligned to the
> input raster, including edge effects
> 2. the original r.bilinear algorithm is an order of magnitude faster than
> the algorithm in sample.c

The first thing I notice is that sample.c calls G_get_d_raster_row()
1, 2 or 4 times for each call, which will be a massive performance
hit.

The raster library caches the current row of *raw* data. This
eliminates the read() and decompression, but the rescaling, CELL->DCELL
conversion, embedding nulls etc. will be performed on each call.
Moreover, the raster library only caches a single row, so the bilinear
and bicubic methods will actually perform the read and decompression
steps every time.

Any program which resamples an entire map should be caching the
processed row data, either using the rowio library or doing it itself.

> Touching on issue #1: the output from sample.c, as implemented in this
> modified version of r.bilinear, produces results that are very close to the
> original r.bilinear when the region is aligned to the input raster. I have
> attached some additional figures demonstrating this case.
>
> Perhaps then, a module based on a generic raster module framework which calls
> sample.c would do the trick. However, I am not quite sure how to account for
> edge effects, or a region that is not aligned with the input raster-- the
> code in r.bilinear is a bit too terse for me to grasp.
>
> Any thoughts / insight would be very helpful.

The main point is: don't use sample.c.

Second, read the map at its native resolution, with a slightly
expanded region to allow for the neighbouring cells
(G_get_raster_row() etc will automatically set cells outside the
raster to null). That way, you don't have to write separate cases to
deal with "edge" cells; the handling of null cells will cover it.

Then, it's just a case of iterating over the output grid, translating
the centre of each output cell to a map coordinate which is translated
back to cell coordinates relative to the input map. E.g. for bilinear:

  for (dst_row = 0; dst_row < dst_wind.rows; dst_row++)
    for (dst_col = 0; dst_col < dst_wind.cols; dst_col++)
    {
      double geo_x = dst_wind.west + (dst_col + 0.5) * dst_wind.ew_res;
      double geo_y = dst_wind.north - (dst_row + 0.5) * dst_wind.ns_res;

      double src_u = (geo_x - src_wind.west ) / src_wind.ew_res - 0.5;
      double src_v = (src_wind.north - geo_y) / src_wind.ns_res - 0.5;

      int col0 = (int) floor(src_u);
      int row0 = (int) floor(src_v);
      int col1 = col0 + 1;
      int row1 = row0 + 1;
      double u_frac = src_u - col0;
      double v_frac = src_v - row0;

      DCELL c00 = GET_SRC_CELL(row0, col0);
      DCELL c01 = GET_SRC_CELL(row0, col1);
      DCELL c10 = GET_SRC_CELL(row1, col0);
      DCELL c11 = GET_SRC_CELL(row1, col1);

      DCELL c0 = c00 * (1 - u_frac) + c01 * u_frac;
      DCELL c1 = c10 * (1 - u_frac) + c11 * u_frac;

      DCELL c = c0 * (1 - v_frac) + c1 * v_frac;

      PUT_DST_CELL(dst_row, dst_col, c);
    }

Note that if src_wind == dst_wind, the dst->geo and geo->src
transformations are exact inverses, and the above reduces to a simple
copy (col0 == dst_col, row0 == dst_row, u_frac == v_frac == 0).

In practice, the code will be slightly complicated by the need to test
for null cells, setting the result to null if any of the four input
cells are null.


On Wednesday 09 August 2006 19:26, Glynn Clements wrote:

> The first thing I notice is that sample.c calls G_get_d_raster_row()
> 1, 2 or 4 times for each call, which will be a massive performance
> hit.
>
> The raster library caches the current row of *raw* data. This
> eliminates the read() and decompression, but the rescaling, CELL->DCELL
> conversion, embedding nulls etc. will be performed on each call.
> Moreover, the raster library only caches a single row, so the bilinear
> and bicubic methods will actually perform the read and decompression
> steps every time.
>
> Any program which resamples an entire map should be caching the
> processed row data, either using the rowio library or doing it itself.

I see. This is starting to get over my head in terms of both C programming and
the GRASS C API. Are there any _documented_ examples of this sort of caching?

> The main point is: don't use sample.c.

OK.

For what I am trying to do, and for the benefit of other modules using an
interpolator, would it be worthwhile to re-invent what sample.c does, but
optimized for an entire raster -- or would it be better to somehow re-write
sample.c to make it more efficient?

I was hoping for two (perhaps conflicting?) goals:

1. a library function which can perform NN, bilinear, or cubic interpolation /
resampling based on an easting/northing pair. This would be useful both for
sampling at vector points _and_, hopefully, for re-sampling an entire raster
to a new resolution.

2. a raster function which will perform one of the three interpolation /
resampling methods on an input raster, saving to an output raster, ideally
being as fast as possible.

In doing all of this: reduce the amount of code redundancy and bloat, while
increasing the amount of in-line documentation.

> Second, read the map at its native resolution, with a slightly
> expanded region to allow for the neighbouring cells
> (G_get_raster_row() etc will automatically set cells outside the
> raster to null). That way, you don't have to write separate cases to
> deal with "edge" cells; the handling of null cells will cover it.

helpful...

> Then, it's just a case of iterating over the output grid, translating
> the centre of each output cell to a map coordinate which is translated
> back to cell coordinates relative to the input map. E.g. for bilinear:

OK, this sounds like a dramatically different approach to what exists in
r.resample and r.bilinear. Should I just leave these alone and try to come
up with an r.cubic? Or would it be a good time to replace the cruft of
r.bilinear with an optimized (?) version of the algorithm (like your
example), with a cubic method tacked on for good measure, into a new module?


Question: will the above code run as-is? Some of these functions, like
PUT_DST_CELL(), seem different from the others I have seen in the source...?

> Note that if src_wind == dst_wind, the dst->geo and geo->src
> transformations are exact inverses, and the above reduces to a simple
> copy (col0 == dst_col, row0 == dst_row, u_frac == v_frac == 0).

A good test to add to the first section of a new module?

> In practice, the code will be slightly complicated by the need to test
> for null cells, setting the result to null if any of the four input
> cells are null.

Sure.

Thanks for the feedback. I am interested in getting this moving, but only if
it will be of benefit to others as well. A hacked version of an old module
just for a cubic convolution filter is not worth much to me or the community.

Further pointers would be welcome, as my lack of C expertise is starting to
show! :-)

Cheers,


Dylan Beaudette wrote:

> I see. This is starting to get over my head in terms of both C programming
> and the GRASS C API. Are there any _documented_ examples of this sort of
> caching?

Source code is probably easier to understand. r.mapcalc (map.c; search
for "rowio") uses the rowio library, while r.grow maintains its own
row cache (search for "in_rows").

> > The main point is: don't use sample.c.
>
> OK.
>
> For what I am trying to do, and for the benefit of other modules using an
> interpolator, would it be worthwhile to re-invent what sample.c does, but
> optimized for an entire raster -- or would it be better to somehow re-write
> sample.c to make it more efficient?

The performance deficiencies are largely inherent in the
G_get_raster_sample() API. You could potentially refactor the code to
provide both the existing API and a more efficient alternative using
the same interpolation algorithms.

> I was hoping for two (perhaps conflicting?) goals:
>
> 1. a library function which can perform NN, bilinear, or cubic interpolation /
> resampling based on an easting/northing pair. This would be useful both for
> sampling at vector points _and_, hopefully, for re-sampling an entire raster
> to a new resolution.
>
> 2. a raster function which will perform one of the three interpolation /
> resampling methods on an input raster, saving to an output raster, ideally
> being as fast as possible.
>
> In doing all of this: reduce the amount of code redundancy and bloat, while
> increasing the amount of in-line documentation.

Anything you do with rasters will be more efficient if you can process
the raster data sequentially in a top-to-bottom sweep.

E.g. if you want interpolated values at vector points, the raster
access will be more efficient if you can sort the points into
descending order of their Y coordinates (i.e. north-to-south order).

[OTOH, the vector access may be more efficient if you process the
vertices in the order in which you get them. "Combination" solutions
would include storing the smaller of the two data sets in RAM (if you
have enough of it) and processing the larger in its most efficient
order, or processing large chunks of the larger data set at a time,
making multiple passes through the smaller data set. The relative
performance of an "ORDER BY ycoord" clause for the specific database
backend would also have an effect upon the optimal solution.]

In any case, if you are processing the raster top-to-bottom, you want
to take advantage of the fact that you will be making repeated access
to any given row (or group of adjacent rows) before moving on to the
next (i.e. caching rows rather than calling G_get_raster_row() for
every cell).

> > Second, read the map at its native resolution, with a slightly
> > expanded region to allow for the neighbouring cells
> > (G_get_raster_row() etc will automatically set cells outside the
> > raster to null). That way, you don't have to write separate cases to
> > deal with "edge" cells; the handling of null cells will cover it.
>
> helpful...
>
> > Then, it's just a case of iterating over the output grid, translating
> > the centre of each output cell to a map coordinate which is translated
> > back to cell coordinates relative to the input map. E.g. for bilinear:
>
> OK, this sounds like a dramatically different approach to what exists in
> r.resample and r.bilinear.

r.resample is a special case, as it's just a trivial wrapper on top of
the nearest-neighbour resampling which is built in to the libgis
raster I/O code.

AFAICT, the approach I mentioned is essentially what r.bilinear is
doing at present, except for:

1. The dst->geo and geo->src transformations are performed by
G_col_to_easting() and G_easting_to_col() (similarly for the vertical
direction).

2. r.bilinear has special-case code for the top and bottom rows, and
for the leftmost and rightmost columns. This is marginally more
efficient than expanding the window (it saves reading two rows which
are known to be null) at the expense of clarity. If you were to extend
it from bilinear to bicubic (2 rows to 4 rows), this part would get a
lot more complex, and probably wouldn't be worth it.

3. It rolls its own version of floor(). Actually, it uses rounding;
I'm not sure if this results in a half-cell shift, or whether this is
compensating for an opposing half-cell shift elsewhere.

> Should I just leave these alone and try to come up with an r.cubic? Or
> would it be a good time to replace the cruft of r.bilinear with an
> optimized (?) version of the algorithm (like your example), with a cubic
> method tacked on for good measure, into a new module?

The bilinear and cubic methods should go into the same module; you may
as well include a nearest-neighbour option as well.

If you start from r.bilinear, the main change would be that r.bilinear
has a hard-coded 2-row cache (inbuf1 and inbuf2), while you need at
least four rows for bicubic.

> for (dst_row = 0; dst_row < dst_wind.rows; dst_row++)
> for (dst_col = 0; dst_col < dst_wind.cols; dst_col++)
> {
> double geo_x = dst_wind.west + (dst_col + 0.5) * dst_wind.ew_res;
> double geo_y = dst_wind.north - (dst_row + 0.5) * dst_wind.ns_res;
>
> double src_u = (geo_x - src_wind.west ) / src_wind.ew_res - 0.5;
> double src_v = (src_wind.north - geo_y) / src_wind.ns_res - 0.5;
>
> int col0 = (int) floor(src_u);
> int row0 = (int) floor(src_v);
> int col1 = col0 + 1;
> int row1 = row0 + 1;
> double u_frac = src_u - col0;
> double v_frac = src_v - row0;
>
> DCELL c00 = GET_SRC_CELL(row0, col0);
> DCELL c01 = GET_SRC_CELL(row0, col1);
> DCELL c10 = GET_SRC_CELL(row1, col0);
> DCELL c11 = GET_SRC_CELL(row1, col1);
>
> DCELL c0 = c00 * (1 - u_frac) + c01 * u_frac;
> DCELL c1 = c10 * (1 - u_frac) + c11 * u_frac;
>
> DCELL c = c0 * (1 - v_frac) + c1 * v_frac;
>
> PUT_DST_CELL(dst_row, dst_col, c);
> }

> Question: will the above code run as-is? Some of these functions, like
> PUT_DST_CELL(), seem different from the others I have seen in the
> source.

The above is pseudo-code, intended to demonstrate the algorithm. The
GET/PUT macros above would be replaced by code to read from and write
to the row buffers (e.g. inbuf1[mapcol1] and outbuf[col] in the
r.bilinear code).

You also have to actually read/write the rows (G_{get,put}_raster_row
or similar) at some point.

> Note that if src_wind == dst_wind, the dst->geo and geo->src
> transformations are exact inverses, and the above reduces to a simple
> copy (col0 == dst_col, row0 == dst_row, u_frac == v_frac == 0).

> A good test to add to the first section of a new module?

In what sense? It would be a sensible test case that you hadn't
introduced any off-by-one (or off-by-0.5) errors into the code (i.e.
"g.region rast=foo ; r.bilinear in=foo out=bar" should result in bar
being an exact copy of foo).

--
Glynn Clements <glynn@gclements.plus.com>

Dylan Beaudette wrote:

> 1. the output is not quite correct when the region is not aligned to the input
> raster, including edge effects

Actually r.bilinear is all screwed:
http://intevation.de/rt/webrt?serial_num=2790

There are two issues. Please see my most recent posts and Urs Gruber's
post dated Tue, Feb 14 2006 17:16:21.

It would be great to have r.bilinear fixed and all algorithms merged
into one command. Good luck!

Maciek

Maciej Sieczka wrote:

> > 1. the output is not quite correct when the region is not aligned to the input
> > raster, including edge effects

> Actually r.bilinear is all screwed:
> http://intevation.de/rt/webrt?serial_num=2790

> There are two issues. Please see my most recent posts and Urs Gruber's
> post dated Tue, Feb 14 2006 17:16:21.

First, r.bilinear reads the input map as integer CELL values
(G_get_c_raster_row).

Second, it adds 0.5 to the interpolated result, then writes the output
map as FCELL. I would guess that the 0.5 is because it originally
rounded to the nearest integer.

If the output mostly or wholly contains values with a fractional part
of 0.5, that suggests that the sample coordinates are falling exactly
on the cell centres, so the u and t interpolation fractions are zero,
resulting in the output value being an integer input value plus 0.5.

Changing it to read the input as FCELL or DCELL rather than CELL would
be trivial. I'll commit a patch to do that (read and write DCELL), as
well as remove the +0.5.

--
Glynn Clements <glynn@gclements.plus.com>

Hi Maciek,

On Friday 11 August 2006 16:04, Maciej Sieczka wrote:

> Dylan Beaudette wrote:
> > 1. the output is not quite correct when the region is not aligned to the
> > input raster, including edge effects

> Actually r.bilinear is all screwed:
> http://intevation.de/rt/webrt?serial_num=2790
> There are two issues. Please see my most recent posts and Urs Gruber's
> post dated Tue, Feb 14 2006 17:16:21.

Heh, I was waiting for you to chime in with that bug! Indeed, looking
through r.bilinear there are a lot of (data + 0.5) operations that I don't
quite understand; whether that is my limited ability or small errors in
the code, I'm not completely sure.

> It would be great to have r.bilinear fixed and all algorithms merged
> into one command. Good luck!

Sure. Well, with Glynn's help on the rowio library, I am hoping to fix
sample.c so that it uses row caching, and to verify that NULL support is
handled correctly. Then it would be a matter of calling the sample.c
algorithms at each new grid cell.

I would like to get this done, but it may take a little while. Any help would
be great.

Cheers,

--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341

Glynn Clements wrote:

> > 1. the output is not quite correct when the region is not aligned to the input
> > raster, including edge effects
>
> Actually r.bilinear is all screwed:
> http://intevation.de/rt/webrt?serial_num=2790
>
> There are two issues. Please see my most recent posts and Urs Gruber's
> post dated Tue, Feb 14 2006 17:16:21.

> First, r.bilinear reads the input map as integer CELL values
> (G_get_c_raster_row).

> Second, it adds 0.5 to the interpolated result, then writes the output
> map as FCELL. I would guess that the 0.5 is because it originally
> rounded to the nearest integer.

> If the output mostly or wholly contains values with a fractional part
> of 0.5, that suggests that the sample coordinates are falling exactly
> on the cell centres, so the u and t interpolation fractions are zero,
> resulting in the output value being an integer input value plus 0.5.

> Changing it to read the input as FCELL or DCELL rather than CELL would
> be trivial. I'll commit a patch to do that (read and write DCELL), as
> well as remove the +0.5.

I've done this. A simple test with the output region equal to the
input map's region results in an exact copy (other than the
CELL->DCELL conversion).

--
Glynn Clements <glynn@gclements.plus.com>

Dylan Beaudette wrote:

> > 1. the output is not quite correct when the region is not aligned to the
> > input raster, including edge effects
>
> Actually r.bilinear is all screwed:
> http://intevation.de/rt/webrt?serial_num=2790
> There are two issues. Please see my most recent posts and Urs Gruber's
> post dated Tue, Feb 14 2006 17:16:21.

> Heh, I was waiting for you to chime in with that bug! Indeed, looking
> through r.bilinear there are a lot of (data + 0.5) operations that I don't
> quite understand; whether that is my limited ability or small errors in
> the code, I'm not completely sure.

> It would be great to have r.bilinear fixed and all algorithms merged
> into one command. Good luck!

> Sure. Well, with Glynn's help on the rowio library, I am hoping to fix
> sample.c so that it uses row caching, and to verify that NULL support is
> handled correctly. Then it would be a matter of calling the sample.c
> algorithms at each new grid cell.

> I would like to get this done, but it may take a little while. Any help
> would be great.

I've attached a modified version of r.bilinear which does
nearest-neighbour, bilinear and bicubic interpolation. It's only been
lightly tested, but the case where the source and destination regions
match (exact copy) has been tested. It could use some testing at more
extreme scale factors.

It would be possible to speed it up a bit, at the expense of
simplicity, by caching the horizontal interpolates for each row.
However, even in its present form, it's still likely to be a lot
faster than using the functions in sample.c.

--
Glynn Clements <glynn@gclements.plus.com>

(attachments)

main.c (7.58 KB)

Glynn Clements wrote:

> I've attached a modified version of r.bilinear which does
> nearest-neighbour, bilinear and bicubic interpolation. It's only been
> lightly tested, but the case where the source and destination regions
> match (exact copy) has been tested. It could use some testing at more
> extreme scale factors.

> It would be possible to speed it up a bit, at the expense of
> simplicity, by caching the horizontal interpolates for each row.
> However, even in its present form, it's still likely to be a lot
> faster than using the functions in sample.c.

Glynn,

I have tested and compared your r.bilinear to gdalwarp output. I took
care to run gdalwarp and r.bilinear with identical settings.

In the case of nearest neighbor the output is identical.

In the case of bilinear and bicubic there is a minor difference between the
two programs' output, min=-0.000002 and max=0.000002, exactly the same for
both methods. But this is probably due to the bug that r.in.gdal
forces FCELL on output (http://intevation.de/rt/webrt?serial_num=4897)
while your r.bilinear outputs DCELL. So I guess it is not important.

(A bit OT: what are the GRASS equivalents for GDAL's
Int16,UInt16,UInt32,Int32,Float32,Float64,CInt16,CInt32,CFloat32,CFloat64?
I'd like to put such a note into the r.out.gdal manual; it always puzzles me.)

I was wondering what name should be given to the module. IMO it should
replace r.resample and r.bilinear should be removed. Is it acceptable
for Grass 6.3?

Here's the Spearfish scenario to reproduce my test:

g.region rast=slope -a
g.region res=15 -ap

r.bilinear input=slope output=gc_slope.nn method=nearest
r.bilinear input=slope output=gc_slope.bil method=bilinear
r.bilinear input=slope output=gc_slope.cub method=bicubic

### The gdalwarp part. The -te switch lets us control the output extent
### to duplicate r.bilinear settings.

r.out.gdal input=slope format=GTiff type=Float32 output=slope.tif

gdalwarp -te 589980 4913685 609000 4928040 -tr 15 15 -rn slope.tif
gdal_slope.nn.tif

gdalwarp -te 589980 4913685 609000 4928040 -tr 15 15 -rc slope.tif
gdal_slope.cub.tif

gdalwarp -te 589980 4913685 609000 4928040 -tr 15 15 -rb slope.tif
gdal_slope.bil.tif

### import gdalwarp output into spearfish
r.in.gdal input=/home/shoofi/gdal_slope.nn.tif output=gdal_slope.nn
r.in.gdal input=/home/shoofi/gdal_slope.bil.tif output=gdal_slope.bil
r.in.gdal input=/home/shoofi/gdal_slope.cub.tif output=gdal_slope.cub

### calculate the difference
r.mapcalc 'diff_nn=gdal_slope.nn-gc_slope.nn'
r.mapcalc 'diff_bil=gdal_slope.bil-gc_slope.bil'
r.mapcalc 'diff_cub=gdal_slope.cub-gc_slope.cub'

### see the difference:

r.info -r diff_nn
min=0.000000
max=0.000000

r.info -r diff_bil
min=-0.000002
max=0.000002

r.info -r diff_cub
min=-0.000002
max=0.000002

All looks good (TM).

Maciek

Maciej Sieczka wrote:

>> I've attached a modified version of r.bilinear which does
>> nearest-neighbour, bilinear and bicubic interpolation. It's only been
>> lightly tested, but the case where the source and destination regions
>> match (exact copy) has been tested. It could use some testing at more
>> extreme scale factors.

>> It would be possible to speed it up a bit, at the expense of
>> simplicity, by caching the horizontal interpolates for each row.
>> However, even in its present form, it's still likely to be a lot
>> faster than using the functions in sample.c.

> I have tested and compared your r.bilinear to gdalwarp output. I took
> care to run gdalwarp and r.bilinear with identical settings.

> In the case of nearest neighbor the output is identical.

> In the case of bilinear and bicubic there is a minor difference between the
> two programs' output, min=-0.000002 and max=0.000002, exactly the same for
> both methods. But this is probably due to the bug that r.in.gdal
> forces FCELL on output (http://intevation.de/rt/webrt?serial_num=4897)
> while your r.bilinear outputs DCELL. So I guess it is not important.

For IEEE single precision, FLT_EPSILON is ~1.2e-7, so a value of
around 16-17 would have an absolute error of ~2e-6 just due to
rounding error. "r.info slope" indicates that the range of the data is
between zero and 52.5, so that error seems reasonable.

> (A bit OT: what are the GRASS equivalents for GDAL's
> Int16,UInt16,UInt32,Int32,Float32,Float64,CInt16,CInt32,CFloat32,CFloat64?
> I'd like to put such a note into the r.out.gdal manual; it always puzzles me.)

I'm not sure there are equivalents.

Internally, GRASS has three types: CELL (C "int", typically 32-bit
signed integer), FCELL (C "float", typically 32-bit IEEE
single-precision floating-point) and DCELL (C "double", typically
64-bit IEEE double-precision floating-point).

> I was wondering what name should be given to the module. IMO it should
> replace r.resample and r.bilinear should be removed. Is it acceptable
> for Grass 6.3?

I would suggest keeping r.resample. Its output is the result of GRASS'
built-in resampling, and is exactly what any raster module would
obtain from reading the map with the current region settings. The
output from method=nearest in the module which I sent may differ due
to rounding error.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

> Maciej Sieczka wrote:

> For IEEE single precision, FLT_EPSILON is ~1.2e-7, so a value of
> around 16-17 would have an absolute error of ~2e-6 just due to
> rounding error. "r.info slope" indicates that the range of the data is
> between zero and 52.5, so that error seems reasonable.

>> (A bit OT: what are the GRASS equivalents for GDAL's
>> Int16,UInt16,UInt32,Int32,Float32,Float64,CInt16,CInt32,CFloat32,CFloat64?
>> I'd like to put such a note into the r.out.gdal manual; it always puzzles me.)

> I'm not sure there are equivalents.

> Internally, GRASS has three types: CELL (C "int", typically 32-bit
> signed integer), FCELL (C "float", typically 32-bit IEEE
> single-precision floating-point) and DCELL (C "double", typically
> 64-bit IEEE double-precision floating-point).

OK, thanks for the clarification. In that case, what are the *most*
appropriate GDAL datatype equivalents for GRASS's CELL, FCELL and DCELL?
By "most appropriate" I mean types that guarantee the original data will
be exactly preserved in r.out.gdal output while requiring as few
superfluous bits as possible, so that the file is as small as possible
and, after being imported back with r.in.gdal, is an identical copy of
the original.

>> I was wondering what name should be given to the module. IMO it should
>> replace r.resample and r.bilinear should be removed. Is it acceptable
>> for Grass 6.3?

> I would suggest keeping r.resample. Its output is the result of GRASS'
> built-in resampling, and is exactly what any raster module would
> obtain from reading the map with the current region settings. The
> output from method=nearest in the module which I sent may differ due
> to rounding error.

If I do some checking and it shows the output doesn't differ, would that
change your mind? Backwards compatibility could even be assured by
setting the default resampling method to nearest neighbor (if that isn't
the case already).

Maciek

Maciej Sieczka wrote:

> For IEEE single precision, FLT_EPSILON is ~1.2e-7, so a value of
> around 16-17 would have an absolute error of ~2e-6 just due to
> rounding error. "r.info slope" indicates that the range of the data is
> between zero and 52.5, so that error seems reasonable.

>> (A bit OT: what are the GRASS equivalents for GDAL's
>> Int16,UInt16,UInt32,Int32,Float32,Float64,CInt16,CInt32,CFloat32,CFloat64?
>> I'd like to put such a note into the r.out.gdal manual; it always puzzles me.)

> I'm not sure there are equivalents.
>
> Internally, GRASS has three types: CELL (C "int", typically 32-bit
> signed integer), FCELL (C "float", typically 32-bit IEEE
> single-precision floating-point) and DCELL (C "double", typically
> 64-bit IEEE double-precision floating-point).

> OK, thanks for the clarification. In that case, what are the *most*
> appropriate GDAL datatype equivalents for GRASS's CELL, FCELL and DCELL?

AFAICT, the GDAL types would be Int32, Float32 and Float64
respectively.

>> I was wondering what name should be given to the module. IMO it should
>> replace r.resample and r.bilinear should be removed. Is it acceptable
>> for Grass 6.3?

> I would suggest keeping r.resample. Its output is the result of GRASS'
> built-in resampling, and is exactly what any raster module would
> obtain from reading the map with the current region settings. The
> output from method=nearest in the module which I sent may differ due
> to rounding error.

> If I do some checking and it shows the output doesn't differ, would that
> change your mind? Backwards compatibility could even be assured by
> setting the default resampling method to nearest neighbor (if that isn't
> the case already).

It would need to produce exactly the same result for all possible
regions, which isn't something which can be verified empirically.

--
Glynn Clements <glynn@gclements.plus.com>

On Friday 11 August 2006 20:45, Glynn Clements wrote:

> Dylan Beaudette wrote:
> > > 1. the output is not quite correct when the region is not aligned to
> > > the input raster, including edge effects
> >
> > Actually r.bilinear is all screwed:
> > http://intevation.de/rt/webrt?serial_num=2790
> > There are two issues. Please see my most recent posts and Urs Gruber's
> > post dated Tue, Feb 14 2006 17:16:21.
>
> Hehe - I was waiting for you to chime in with that bug ! Indeed. Looking
> through r.bilinear there are a lot of (data + 0.5) operations that I
> don't quite understand -- my limited ability or small errors: not
> completely sure.
>
> > It would be great to have r.bilinear fixed and all algorithms merged
> > into one command. Good luck!
>
> Sure. Well - with Glynn's help on the rowio library -- I am hoping on
> fixing sample.c so that it uses row caching and verify that NULL support
> is correctly done. Then, it would be a matter of calling the sample.c
> algorithms at each new grid cell.
>
> I would like to get this done, but it may take a little while. Any help
> would be great.

> I've attached a modified version of r.bilinear which does
> nearest-neighbour, bilinear and bicubic interpolation. It's only been
> lightly tested, but the case where the source and destination regions
> match (exact copy) has been tested. It could use some testing at more
> extreme scale factors.

> It would be possible to speed it up a bit, at the expense of
> simplicity, by caching the horizontal interpolates for each row.
> However, even in its present form, it's still likely to be a lot
> faster than using the functions in sample.c.

Very nice! Much faster than anything I could have prototyped. This version
appears to work well with extreme scale factors; however, I noticed one
possible problem.

When I use your new version of r.bilinear with method=nearest on a CELL
map, the output is DCELL. While this can be fixed with r.mapcalc, it
shouldn't be too hard to add this to the new module.

other than that, great work!

--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341