I want to determine whether GRASS or R is best suited to
interpolating/extrapolating annual mean precipitation data from 58 reporting
stations (unevenly distributed within the county) across the county. Some
flavor of kriging would be applied to these data to illustrate a general
long-term pattern of rainfall.
Some years ago there was both a static display of a chemical constituent
in a river reach, a "heat map", (and an automation of temporal changes, if I
correctly recall) and I'm not finding this in the web site galleries.
While elevation could be included as an explanatory variable using
regression kriging my purpose is to illustrate county-wide mean annual rainfall
distribution over a 13 year period, not to interpolate values for specific,
unsampled locations.
Please provide some thoughts on the work flow to do this within GRASS. I'm
digging into the gstat docs to get a sense of how to do this within R (and I
have the rgrass7 package working well; it imported the GRASS county boundary
map which I converted to a SpatialPolygonDataFrame.)
The guidelines that I follow include: Regards, Micha
···
On 09/24/2018 06:32 PM, Rich Shepard wrote:
I want to determine whether GRASS or R is best suited to
interpolating/extrapolating annual mean precipitation data from 58 reporting
stations (unevenly distributed within the county) across the county. Some
flavor of kriging would be applied to these data to illustrate a general
long-term pattern of rainfall.
Some years ago there was both a static display of a chemical constituent
in a river reach, a “heat map”, (and an automation of temporal changes, if I
correctly recall) and I’m not finding this in the web site galleries.
While elevation could be included as an explanatory variable using
regression kriging my purpose is to illustrate county-wide mean annual rainfall
distribution over a 13 year period, not to interpolate values for specific,
unsampled locations.
Please provide some thoughts on the work flow to do this within GRASS. I’m
digging into the gstat docs to get a sense of how to do this within R (and I
have the rgrass7 package working well; it imported the GRASS county boundary
map which I converted to a SpatialPolygonDataFrame.)
Rainfall interpolation (of any kind) should be done only for long time periods = at least a full season. since you are looking at 13 years of data then this requirement is fulfilled.
The rules of thumb for “how many points” for kriging interpolation usually says > 30. So again you are fine.
The GRASS modules offer only ordinary kriging AFAIK, which might be appropriate in this case. But if you want to use Kriging with External Drift with the elevation as the secondary “trend” variable, I think you’ll need to go with with R. Having said that, I think that “the jury is still out” on whether elevation improves the interpolation or not.
On Mon, Sep 24, 2018 at 5:32 PM Rich Shepard <rshepard@appl-ecosys.com> wrote:
I want to determine whether GRASS or R is best suited to
interpolating/extrapolating annual mean precipitation data from 58 reporting
stations (unevenly distributed within the county) across the county. Some
flavor of kriging would be applied to these data to illustrate a general
long-term pattern of rainfall.
* Rainfall interpolation (of any kind) should be done only for long time
periods = at least a full season. since you are looking at 13 years of
data then this requirement is fulfilled.
* The rules of thumb for "how many points" for kriging interpolation
usually says > 30. So again you are fine.
* The GRASS modules offer only ordinary kriging AFAIK, which might be
appropriate in this case. But if you want to use Kriging with External
Drift with the elevation as the secondary "trend" variable, I think
you'll need to go with with R. Having said that, I think that "the
jury is still out" on whether elevation improves the interpolation or
not.
Micha,
Thanks for the insights. I think that OK would be adequate for now because
the climate is changing and this is a long-term project that will be
monitored and and adjusted as necessary,
The data (exported from an R data.frame) has six columns:
row number, coordinates (two columns in parentheses of easting and
northing), site_name, elevation, and mean precipitation. I'll change the
coordinates pair to separate easting and northing and could leave the row
48 numbers as categories unless it's advisable to allow GRASS to assign
those numbers.
Elevations range from ~45m to ~1200m, from the river at the bottom of the
valley floor to the watershed on top of the mountain range.
I'm working my way through the v.kriging manual page to understand all
options and the process for applying the module to my data. My question is
whether to do 2D or 3D kriging on this data set. Without any experience
applying this module I've no basis for making a decision because I've no
knowledge of which would be more appropriate. Your advice is needed.
An alternative to kriging is the v.surf.tps add-on [1]. It implements TPS
interpolation, with or without covariates. It is very similar to the
ANUSPLIN procedure used to create the WorldClim layers [2].
Pierre,
Thanks for the pointer and URLs. I'll certainly look carefully at them.
I think there is sufficient evidence to recommend using elevation as a secondary variable for spatial interpolation of precipitation using, for instance, Universal Kriging or spatial drift, or a method such as PRISM in the USA (http://www.prism.oregonstate.edu/) – PRISM estimates are used widely in the US by the National Weather Service and others. See:
On Tue, Sep 25, 2018 at 2:06 PM Thomas Adams <tea3rd@gmail.com> wrote:
Hello all,
I think there is sufficient evidence to recommend using elevation as a secondary variable for spatial interpolation of precipitation
as an alternative to kriging, TPS interpolation also allows to use covariables, e.g. with v.surf.tps or r.resamp.tps (both are addons)
Markus M
using, for instance, Universal Kriging or spatial drift, or a method such as PRISM in the USA (http://www.prism.oregonstate.edu/) – PRISM estimates are used widely in the US by the National Weather Service and others. See:
On Mon, Sep 24, 2018 at 2:19 PM Micha Silver <tsvibar@gmail.com> wrote:
Hi Rich
On 09/24/2018 06:32 PM, Rich Shepard wrote:
I want to determine whether GRASS or R is best suited to
interpolating/extrapolating annual mean precipitation data from 58 reporting
stations (unevenly distributed within the county) across the county. Some
flavor of kriging would be applied to these data to illustrate a general
long-term pattern of rainfall.
Some years ago there was both a static display of a chemical constituent
in a river reach, a “heat map”, (and an automation of temporal changes, if I
correctly recall) and I’m not finding this in the web site galleries.
While elevation could be included as an explanatory variable using
regression kriging my purpose is to illustrate county-wide mean annual rainfall
distribution over a 13 year period, not to interpolate values for specific,
unsampled locations.
Please provide some thoughts on the work flow to do this within GRASS. I’m
digging into the gstat docs to get a sense of how to do this within R (and I
have the rgrass7 package working well; it imported the GRASS county boundary
map which I converted to a SpatialPolygonDataFrame.)
The guidelines that I follow include:
Rainfall interpolation (of any kind) should be done only for long time periods = at least a full season. since you are looking at 13 years of data then this requirement is fulfilled.
The rules of thumb for “how many points” for kriging interpolation usually says > 30. So again you are fine.
The GRASS modules offer only ordinary kriging AFAIK, which might be appropriate in this case. But if you want to use Kriging with External Drift with the elevation as the secondary “trend” variable, I think you’ll need to go with with R. Having said that, I think that “the jury is still out” on whether elevation improves the interpolation or not.
Please provide some thoughts on the work flow to do this within GRASS. I'm
digging into the gstat docs to get a sense of how to do this within R (and
I have the rgrass7 package working well; it imported the GRASS county
boundary map which I converted to a SpatialPolygonDataFrame.)
I have run v.surf.tps on the point map of 58 reporting stations and am
having difficulty understanding the output. I suspect that because my data
does not consist of evenly spaced sparsed points, I've accepted the default
thinning of 1.5, and I don't (yet) have a DEM covering the entire county the
output is not as usable as it could be. (I've changed the region's
resolution to 100m.)
Regardless, I'm asking for input from those who have experience with the
geostatistical modules on how to determine the most appropriate one for a
given data set. I see that v.krige[1], v.kriging, and v.surf.tps are available
and my lack of experiences with all of these hinders my making an informed
decision.
Pointers to relevant documentation, as well as other advice, will be much
appreciated.
TIA,
Rich
[1] The rpy2 dependency for v.krige is not yet available for R-3.5.1.
Few (if any) of the spatial interpolation algorithms require “evenly spaced” data. Ultimately, you want spatial estimates that produce the least bias and where the errors are gaussian distributed, which can only be determined using some process of bootstrapping where stations are removed from the spatial interpolation process and are then used to evaluate error at those locations; and then, repeated, with a new set of stations. This should be done with several spatial interpolation algorithms that are generally known to produce the best estimates.
If you are trying to produce estimates over an entire county, you really need stations well outside the county boundary to avoid having problems near the county periphry (interior of the county boundary). GRASS and R play really well together through spgrass7, so that open the use of many R based geostatistical packages.
Few (if any) of the spatial interpolation algorithms require "evenly
spaced" data. Ultimately, you want spatial estimates that produce the
least bias and where the errors are gaussian distributed, which can only
be determined using some process of bootstrapping where stations are
removed from the spatial interpolation process and are then used to
evaluate error at those locations; and then, repeated, with a new set of
stations. This should be done with several spatial interpolation
algorithms that are generally known to produce the best estimates.
Tom,
Hadn't thought of bootstrapping since I'm only now digging into geostats.
If you are trying to produce estimates over an entire county, you really
need stations well outside the county boundary to avoid having problems
near the county periphry (interior of the county boundary).
Good point. Many of the reporting stations are close to the border since
that's where the larger cities are located. Changing the analytical boundary
to put more stations outside it makes good sense. I can use a subwatershed
(the small bounded area in the attached map), or another defined area. Your
thoughts on how to define that area will help.
GRASS and R play really well together through spgrass7, so that open the
use of many R based geostatistical packages.
I've used rgrass7 and was not aware of spgrass7. I'll install that
package.
Please provide some thoughts on the work flow to do this within GRASS. I’m
digging into the gstat docs to get a sense of how to do this within R (and
I have the rgrass7 package working well; it imported the GRASS county
boundary map which I converted to a SpatialPolygonDataFrame.)
I have run v.surf.tps on the point map of 58 reporting stations and am
having difficulty understanding the output.
With only 58 points, you will not get a lot of spatial detail, no matter what interpolation method you use. For example, using 58 points to create a surface with e.g. 1 million points is a waste of time: there is simply not enough spatial detail in the input. You can check, as you have done, by changing the resolution of the current region for the output and comparing the results. The only solution for more spatial detail is to get more input points.
Markus M
I suspect that because my data
does not consist of evenly spaced sparsed points, I’ve accepted the default
thinning of 1.5, and I don’t (yet) have a DEM covering the entire county the
output is not as usable as it could be. (I’ve changed the region’s
resolution to 100m.)
Regardless, I’m asking for input from those who have experience with the
geostatistical modules on how to determine the most appropriate one for a
given data set. I see that v.krige[1], v.kriging, and v.surf.tps are available
and my lack of experiences with all of these hinders my making an informed
decision.
Pointers to relevant documentation, as well as other advice, will be much
appreciated.
TIA,
Rich
[1] The rpy2 dependency for v.krige is not yet available for R-3.5.1.
With only 58 points, you will not get a lot of spatial detail, no matter
what interpolation method you use. For example, using 58 points to create
a surface with e.g. 1 million points is a waste of time: there is simply
not enough spatial detail in the input. You can check, as you have done,
by changing the resolution of the current region for the output and
comparing the results. The only solution for more spatial detail is to get
more input points.
Markus,
Thanks for the insight. Yes, there are too few data points for the size of
the area. So, I'll not use this map, but present a table of annual mean
values instead.
With only 58 points, you will not get a lot of spatial detail, no matter
what interpolation method you use. For example, using 58 points to create
a surface with e.g. 1 million points is a waste of time: there is simply
not enough spatial detail in the input. You can check, as you have done,
by changing the resolution of the current region for the output and
comparing the results. The only solution for more spatial detail is to get
more input points.
Markus,
Thanks for the insight. Yes, there are too few data points for the size of
the area. So, I’ll not use this map, but present a table of annual mean
values instead.
I want to determine whether GRASS or R is best suited to
interpolating/extrapolating annual mean precipitation data from 58 reporting
stations (unevenly distributed within the county) across the county. Some
flavor of kriging would be applied to these data to illustrate a general
long-term pattern of rainfall.
Some years ago there was both a static display of a chemical constituent
in a river reach, a “heat map”, (and an automation of temporal changes, if I
correctly recall) and I’m not finding this in the web site galleries.
While elevation could be included as an explanatory variable using
regression kriging my purpose is to illustrate county-wide mean annual rainfall
distribution over a 13 year period, not to interpolate values for specific,
unsampled locations.
Please provide some thoughts on the work flow to do this within GRASS. I’m
digging into the gstat docs to get a sense of how to do this within R (and I
have the rgrass7 package working well; it imported the GRASS county boundary
map which I converted to a SpatialPolygonDataFrame.)