[GRASS-user] Averaging multiple vector lines

I have a GRASS vector that originated as multiple GPS tracks from walking a particular trail segment on several different days. Is there a good way to average these lines to get a single line? I want to minimize GPS accuracy errors by averaging across multiple days and also minimize precision errors (random jumping around on a single day) while still maintaining the shape of the trail with all of its twists and turns.

I have been able to generate a composite vector by using a combination of v.to.rast, r.grow, r.thin, r.to.vect, v.clean, and v.generalize method=douglas. This method works pretty well when the lines remain close together, but it is very dependent on picking a value for the r.grow radius that fills in all of the gaps between the multiple tracks. If one track is quite different than the others in even a single region of the vector, this requires a relatively large radius value. Moreover, the final vector is located about midway between the two extremes rather than being weighted toward where the majority of tracks fall.

It seems like there would be a way to calculate some sort of sliding average of the coordinates that fall within a certain size window, perhaps after using v.to.points with a small dmax (5 ft?) to generate a fairly dense set of points. Ideally, the calculation window could be wider perpendicular to the direction of the line than it is along the direction of the line. From day to day tracks are often within 10 to 20 ft of each other, but it is not uncommon for two tracks to be 30 ft away from each other at some points.

Any ideas?

-Thanks, -Dwight

On Wed, Jun 3, 2009 at 6:23 PM, Dwight Needels <needels@translucida.com> wrote:

[...]

I have often wanted to do something like this with GPS tracks, however
I have never thought to try your vector-raster-vector approach -- very
creative!

I think that a vector-based approach could be implemented along the
lines you mention:

1. v.to.points on each GPS track
2. v.patch to collect all points into single vector
3. new module to generate an average 'centerline' along the cloud of points.

This last step could be done fairly nicely in Cartesian space with
a smoothing algorithm such as lowess or Friedman's super smoother
(supsmu in R).

Here is an example in R, graphic attached.

# simulated y-coordinates of a densified GPS track (100 points);
# the index 1:100 serves as the x-coordinate
x <- rnorm(n=100, mean=1, sd=0.1)

# check
plot(x, type='b')

# generate 10 densified GPS tracks, based on our original track
m <- x + matrix(rnorm(n=1000, mean=0.25, sd=0.05), ncol=10)

# check
matplot(m, type='l', lty=1, col=1, ylab='y-coordinate', xlab='x-coordinate')

# convert from wide to long format, as dataframe
d <- data.frame(x=rep(1:100, 10), y=as.vector(m))

# compute lowess smooth, and plot as red line
lines(lowess(d$x, d$y, f=0.01), col='red', lwd=2)

So, it may be possible to augment the v.generalize command to work
with a collection of nodes (i.e. accept multiple vector inputs). Or,
an implementation of the lowess algorithm would be another approach.

Cheers,
Dylan

(attachments)

lowess-gps-smooth.pdf (82.6 KB)

On Thu, Jun 4, 2009 at 9:41 AM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:

[...]

Ack... I just realized that this won't work if the trail crosses over
itself, or where two x-coordinate values occur at a single
y-coordinate or vice versa. "Un-raveling" the trails along some linear
path would be required before applying the lowess smoother. v.generalize or
Michael's suggestion may be the best approach.
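
For what it's worth, here is a rough sketch of what the "un-raveling" could look like in R. It rests on assumptions that are not part of anything above: one track (called ref here, a made-up name) is taken as the linear path, every point is assigned the along-track distance of its nearest ref vertex, and x and y are then smoothed separately against that distance. The spiral data, the noise level, and f=0.1 are all invented for illustration. Nearest-vertex matching would still be ambiguous where a trail truly crosses itself.

```r
# simulated data (assumption): a spiral reference track, so that neither
# coordinate is a function of the other
set.seed(1)
theta <- seq(0, 1.5 * pi, length.out = 100)
ref <- cbind(x = theta * cos(theta), y = theta * sin(theta))

# five noisy copies of the reference track, stacked into one point cloud
pts <- do.call(rbind, lapply(1:5, function(i) ref + rnorm(length(ref), sd = 0.05)))

# cumulative distance along the reference track at each of its vertices
s_ref <- c(0, cumsum(sqrt(rowSums(diff(ref)^2))))

# "un-ravel": give each point the along-track distance of its nearest reference vertex
nearest <- apply(pts, 1, function(p) which.min((ref[, 1] - p[1])^2 + (ref[, 2] - p[2])^2))
s <- s_ref[nearest]

# smooth x and y separately against distance along the track
sx <- lowess(s, pts[, 1], f = 0.1)
sy <- lowess(s, pts[, 2], f = 0.1)
centerline <- cbind(s = sx$x, x = sx$y, y = sy$y)

# check: point cloud with the smoothed centerline as a red line
plot(pts, pch = '.', asp = 1)
lines(centerline[, c('x', 'y')], col = 'red', lwd = 2)
```

The smoothed line has one row per input point, ordered by distance along the track, so duplicated s values would probably need to be dropped before writing it back out as a single vector line.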

Cheers,
Dylan