[GRASS-dev] Add a cost distance measure to GRASS GIS lib

Hi All,

I have been pondering this for a while and would
like to know if people on this list think the
following would be a feasible/useful addition to
GRASS (7):

How about, in addition to the MASK raster, we'd
also allow the user to specify a COST raster?
This would allow all raster modules that currently
use an euclidean distance measure (KDE, IDW, ...)
to use a cost-based measure instead, without adding
more options to the individual modules.

In addition to the COST raster, we could modify the
basic distance function in the GRASS GIS lib to
automatically return a cost-based result if the
COST raster is present. This way, there would have
to be no or only minimal modifications to existing
modules.

For anisotropic cost distances, we could allow a
DIRECTION raster.

Ben

Benjamin Ducke wrote:

I have been pondering this for a while and would
like to know if people on this list think the
following would be a feasible/useful addition to
GRASS (7):

How about, in addition to the MASK raster, we'd
also allow the user to specify a COST raster?
This would allow all raster modules that currently
use an euclidean distance measure (KDE, IDW, ...)
to use a cost-based measure instead, without adding
more options to the individual modules.

In addition to the COST raster, we could modify the
basic distance function in the GRASS GIS lib to
automatically return a cost-based result if the
COST raster is present. This way, there would have
to be no or only minimal modifications to existing
modules.

This proposal makes no sense.

The mask raster is relatively cheap to implement, while a least-cost
path is one of the most expensive calculations available in GRASS.
Also, the mask raster affects every raster module which reads at least
one input map, while very few modules calculate distances.

Being able to calculate least-cost distances between arbitrary points
would require holding the entire cost raster in memory.

If you want least-cost distances from a fixed set of points to
arbitrary points, it's significantly more efficient to generate
distance rasters in advance (which requires knowing the starting
points).

If you want multiple least-cost distances between arbitrary points,
you would probably want to cache some intermediate results in order to
allow an upper bound on the distance to be calculated cheaply. But you
probably can't afford the memory to cache every intermediate result,
and knowing which ones are worth caching is application-specific.

--
Glynn Clements <glynn@gclements.plus.com>

I believe we are thinking in different directions here.
My idea was not about _least_ cost paths, but about
replacing straight-line distance measurements with
cost-based distances -- for more realism when modelling
physical processes. Just like you'd use a geodetic
distance instead of eculidean in a lat/lon region.

What I was thinking about was to have just one CONST
raster that encodes the cost of moving through each
of its cells (in all directions in the simple
isotropic case).

All GRASS modules that use fixed neighborhoods or
flexible search radii would then measure the distance
between two points by accumulating the costs in the
cells on the straight line between the two points.

Suppose the user has created a COST raster with "1"
representing "no additional cost" and "2" representing
"doubled cost for crossing steep terrain". Then if
he/she specified 100 m as e.g. the search radius for
a KDE, the maximum search distance will in some
extreme cases already be reached after approximately
50 m, because those 50 cost as much as 100 m on flat
terrain. That would make the kernel's shape flexible
and adjusted to the real terrain, instead of a fixed
circle.

The same idea translates 1:1 to IDW, r.neighbors, etc.

Ben

On Thu, Mar 1, 2012, at 17:14, Glynn Clements wrote:

Benjamin Ducke wrote:

> I have been pondering this for a while and would
> like to know if people on this list think the
> following would be a feasible/useful addition to
> GRASS (7):
>
> How about, in addition to the MASK raster, we'd
> also allow the user to specify a COST raster?
> This would allow all raster modules that currently
> use an euclidean distance measure (KDE, IDW, ...)
> to use a cost-based measure instead, without adding
> more options to the individual modules.
>
> In addition to the COST raster, we could modify the
> basic distance function in the GRASS GIS lib to
> automatically return a cost-based result if the
> COST raster is present. This way, there would have
> to be no or only minimal modifications to existing
> modules.

This proposal makes no sense.

The mask raster is relatively cheap to implement, while a least-cost
path is one of the most expensive calculations available in GRASS.
Also, the mask raster affects every raster module which reads at least
one input map, while very few modules calculate distances.

Being able to calculate least-cost distances between arbitrary points
would require holding the entire cost raster in memory.

If you want least-cost distances from a fixed set of points to
arbitrary points, it's significantly more efficient to generate
distance rasters in advance (which requires knowing the starting
points).

If you want multiple least-cost distances between arbitrary points,
you would probably want to cache some intermediate results in order to
allow an upper bound on the distance to be calculated cheaply. But you
probably can't afford the memory to cache every intermediate result,
and knowing which ones are worth caching is application-specific.

--
Glynn Clements <glynn@gclements.plus.com>

On 01/03/12 20:02, Benjamin Ducke wrote:

I believe we are thinking in different directions here.
My idea was not about _least_ cost paths, but about
replacing straight-line distance measurements with
cost-based distances -- for more realism when modelling
physical processes.

Yes, but how to define the distance if not by finding the least-cost path. As soon as you do not use straight-line distance anymore, you have to decide of which path you want to calculate the distance, and in order to make this decision, you generally use the least-cost path.

What I was thinking about was to have just one CONST
raster that encodes the cost of moving through each
of its cells (in all directions in the simple
isotropic case).

All GRASS modules that use fixed neighborhoods or
flexible search radii would then measure the distance
between two points by accumulating the costs in the
cells on the straight line between the two points.

So what you actually mean is somehow weight the distance by costs that can be found along the straigh-line path.

Suppose the user has created a COST raster with "1"
representing "no additional cost" and "2" representing
"doubled cost for crossing steep terrain". Then if
he/she specified 100 m as e.g. the search radius for
a KDE, the maximum search distance will in some
extreme cases already be reached after approximately
50 m, because those 50 cost as much as 100 m on flat
terrain. That would make the kernel's shape flexible
and adjusted to the real terrain, instead of a fixed
circle.

The same idea translates 1:1 to IDW, r.neighbors, etc.

Well, in IDW if you decide to use the 12 "nearest" points, how would you decide this ? Maybe point A is "further" in terms of costs along a straight line than B, but if you use the least-cost path A might still be "closer"...

Moritz

Benjamin Ducke wrote:

I believe we are thinking in different directions here.
My idea was not about _least_ cost paths, but about
replacing straight-line distance measurements with
cost-based distances -- for more realism when modelling
physical processes. Just like you'd use a geodetic
distance instead of eculidean in a lat/lon region.

What I was thinking about was to have just one CONST
raster that encodes the cost of moving through each
of its cells (in all directions in the simple
isotropic case).

All GRASS modules that use fixed neighborhoods or
flexible search radii would then measure the distance
between two points by accumulating the costs in the
cells on the straight line between the two points.

Suppose the user has created a COST raster with "1"
representing "no additional cost" and "2" representing
"doubled cost for crossing steep terrain". Then if
he/she specified 100 m as e.g. the search radius for
a KDE, the maximum search distance will in some
extreme cases already be reached after approximately
50 m, because those 50 cost as much as 100 m on flat
terrain. That would make the kernel's shape flexible
and adjusted to the real terrain, instead of a fixed
circle.

Even cost along a straight line (or maybe a great circle?) is
computationally expensive, and in the general case would require
holding the entire cost map in memory. Less-general cases would have
to be implemented within the module, as the libraries wouldn't know
how the module intends to access the cost map.

Also, many algorithms which have a concept of distance require that it
follows the conditions for a metric, i.e. for some metric d and all
points x, y, z:

  d(x,x) = 0
  d(x,y) >= 0
  d(x,y) = d(y,x)
  d(x,z) <= d(x,y) + d(y,z)

A distance calculated by integrating a cost map along a fixed path
could violate the fourth condition (transitivity).

Other algorithms (e.g. r.grow.distance) require that distance
increases monotonically with both delta-x and delta-y, and the
proposed solution would violate that.

--
Glynn Clements <glynn@gclements.plus.com>

Even cost along a straight line (or maybe a great circle?) is
computationally expensive, and in the general case would require
holding the entire cost map in memory. Less-general cases would have
to be implemented within the module, as the libraries wouldn't know
how the module intends to access the cost map.

Also, many algorithms which have a concept of distance require that it
follows the conditions for a metric, i.e. for some metric d and all
points x, y, z:

  d(x,x) = 0
  d(x,y) >= 0
  d(x,y) = d(y,x)
  d(x,z) <= d(x,y) + d(y,z)

A distance calculated by integrating a cost map along a fixed path
could violate the fourth condition (transitivity).

Other algorithms (e.g. r.grow.distance) require that distance
increases monotonically with both delta-x and delta-y, and the
proposed solution would violate that.

Right, a general and simple solution for cost-based distance
is not as straight forward as I thought it might be.

I will think about how this could be done for individual
modules instead -- and which modules would actually
benefit significantly from cost distances.

Thanks for all the feedback.

Ben

--
Glynn Clements <glynn@gclements.plus.com>