Hello everybody
I've made an improvement to s.surf.idw to greatly improve its efficiency
when dealing with a large number of sites for interpolation, which I'm
thinking of commiting to CVS. The improvement is that when initially
reading the sites file it indexes all the sites according to what cell they
fall into, so that when it is later searching for the 12 nearest sites it
doesn't have to search through them all, just those indexed to the current
and nearby cells. Currently for each output cell it searches through all
the sites in the input file and calculates the distance to each one in
order to find the 12 nearest sites. With a few hundred thousand sites this
can cause it to take hours to run; with the improvements it takes only a
few minutes.
In doing this I've been thinking about the inconsistency between
s.surf.idw and s.surf.rst with regard to the sites used in the
interpolation. s.surf.rst only uses sites which fall inside the current
region, while s.surf.idw currently uses all the sites in the input sites
file no matter where they are. If there are a lot of sites outside the
region this can use a huge amount of memory and as part of my improvements
it only reads into memory sites that are within the region bounds (i.e.
consistent with s.surf.rst) but I wondered is this an acceptable change in
behaviour?
If sites outside the interpolation region are required to be
included it can be done by setting a bigger region and using a mask for
the interpolation region. I think this is how it would be done with
s.surf.rst. Incidentally, s.surf.rst has a special command-line option to
specify a maskmap while s.surf.idw simply honours the conventional GRASS
MASK raster layer. Is there a reason for this other inconsistency?
With a bit more work it might be possible to add a command-line switch
that enables the old behaviour (I would need to think about it), but I
don't really think this is necessary as it can be worked around (I will
explain this in the man page) and especially as the new behaviour would
make it consistent with s.surf.rst.
If there aren't any objections I will commit the improvements to CVS in a
day or two.
Paul
On Thu, Mar 27, 2003 at 11:06:34AM +0000, Paul Kelly wrote:
Hello everybody
I've made an improvement to s.surf.idw to greatly improve its efficiency
when dealing with a large number of sites for interpolation, which I'm
thinking of commiting to CVS. The improvement is that when initially
reading the sites file it indexes all the sites according to what cell they
fall into, so that when it is later searching for the 12 nearest sites it
doesn't have to search through them all, just those indexed to the current
and nearby cells. Currently for each output cell it searches through all
the sites in the input file and calculates the distance to each one in
order to find the 12 nearest sites. With a few hundred thousand sites this
can cause it to take hours to run; with the improvements it takes only a
few minutes.
I have thought about adding some general indexing for sites files to
libgis, as there a several programs where it could be useful. However,
it seemed with the move to erase the distinction between "sites" and
vector points in 5.1, that it would be a bit late to add such a thing
now. It seems a worthy addition to s.surf.idw though...
As to other inconsistency ... ? Generally raster producing operations
should respect the region and mask out of the box, possibly allowing
an override of the region (I don't see a good reason to override the
mask, generally, since the user can just turn it off). The notable
exception to this rule is import routines where the region of the
process should be made to match the input(s).
--
echo ">gra.fcw@2ztr< eryyvZ .T pveR" | rot13 | reverse
If sites outside the interpolation region are required to be
included it can be done by setting a bigger region and using a mask for
the interpolation region. I think this is how it would be done with
s.surf.rst. Incidentally, s.surf.rst has a special command-line option to
specify a maskmap while s.surf.idw simply honours the conventional GRASS
MASK raster layer. Is there a reason for this other inconsistency?
maskmap is there for the users (and mine) convenience for the cases
where I have a MASK set but my point data cover only a portion of the
area
within that mask, so I can mask the result without changing the MASK.
s.surf.rst should honor the MASK but I am not sure that it really does.
There are many other inconsistencies that need fixing (e.g. curvatures
from rst and r.slope.aspect have oposite signs).
Helena
With a bit more work it might be possible to add a command-line switch
that enables the old behaviour (I would need to think about it), but I
don't really think this is necessary as it can be worked around (I will
explain this in the man page) and especially as the new behaviour would
make it consistent with s.surf.rst.
If there aren't any objections I will commit the improvements to CVS in a
day or two.
Paul
_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5
On Thu, 27 Mar 2003, Helena Mitasova wrote:
maskmap is there for the users (and mine) convenience for the cases
where I have a MASK set but my point data cover only a portion of the
area
within that mask, so I can mask the result without changing the MASK.
s.surf.rst should honor the MASK but I am not sure that it really does.
So if a cell is masked by either the GRASS MASK or the raster file
specified with the 'maskmap=' option it should be masked in the output
raster, right?
I have committed a change to CVS so that this is how the 3 RST programs
(s.surf.rst, v.surf.rst and r.resamp.rst) behave. It involved a change to
the IL_create_bitmask() function so that it now also checks for the
current MASK as well as the maskmap option, and a fix to the way this
function is called from the user programs. I changed it so it returns a
pointer to the bitmask; this got round a bit of awkwardness with passing a
pointer to the bitmask to it when it really should be this function
deciding if it is going to create a bitmask or just return null.
It works fine on Linux but an earlier problem is still there on Irix:
instead of the masked cells having a value of null, when I query them with
d.what.rast I get something like
350932.20419847(E) 334209.47519084(N)
aster in PERMANENT, quant (2147483647)
aster in PERMANENT, actual (nan0x7ffffe00)
instead of null. I think it might be something to do with a mixture of
FCELL and DCELL references in the program, but it looks a bit complicated
and I didn't get anywhere with it.
A slightly related query:
I've managed to obtain a copy of 'Open Source GIS: A GRASS GIS Approach'
and it is a really useful and interesting book. I noticed a mention to the
s.surf.rstcv cross-validation module, and wonder if/when it is going to
appear in the GRASS source?
Paul