[GRASS5] Re: [STATSGRASS] kriging, R and GRASS

(cc grass5)

On Mon, Jan 17, 2005 at 02:37:44PM +0100, Roger Bivand wrote:

> On Mon, 17 Jan 2005, Roger Bivand wrote:
>
> ... and also wanted to show that rastput() would scale well to much finer
> grids - which it does - and found a bug in rastput() and other functions
> that has led to the layer being written to the GRASS database with the
> wrong window if system("g.region ... ") is used. This is the same bug that
> affected gmeta() until Bodo Ahrens found it 18 months ago. The problem was
> that the GRASS library function tries to be too clever (I think disk
> access was very slow when it was written), and stores a local version of
> the data in memory in a cache of sorts. It does not check, though, whether
> the data on disk have changed - it just returns the cached values! Using a
> lower-level function, and conditionally altering G__init_window () seems
> to have fixed it. The new version should reach CRAN soon.

Roger,

is there any change which should go into GRASS-CVS?

Markus

_______________________________________________
statsgrass mailing list
statsgrass@grass.itc.it
http://grass.itc.it/mailman/listinfo/statsgrass

On Mon, 17 Jan 2005, Markus Neteler wrote:


> Roger,
>
> is there any change which should go into GRASS-CVS?

G__init_window () in src/libes/gis/window_map.c does the same as in my
local copy, that is, it calls G_get_window () rather than G__get_window ().
What usually causes trouble for long-running (non-exiting) use of libgis.a
is the number of places where things are saved in a global variable (like
G__.) and do not get refreshed when the data on disk change. You can see
it happening in G_get_window () and its use of static int first
(get_window.c), which uses the cached dbwindow without checking the write
time of the disk version. My guess would be that I don't lose much time by
always calling G__get_window (), but this time I hadn't realised that
G__init_window (), called when opening and closing cells, imposes the
stale header from G_get_window ().
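
To make the pattern concrete, here is a minimal, self-contained sketch of a
"read once, then serve from memory" getter going stale. It is not the libgis
code: the file name, the one-line "rows:" format and every function name
below are assumptions made up for illustration. The cached getter plays the
role of G_get_window (), the direct read plays the role of G__get_window ().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the on-disk region (WIND) file. */
static const char *stale_demo_path = "demo_wind.txt";

/* Always goes to disk: the role played by the lower-level function. */
static int read_rows_from_disk(int *rows)
{
    FILE *fp;
    int ok;

    assert(rows != NULL);
    fp = fopen(stale_demo_path, "r");
    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "rows: %d", rows) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}

/* Reads once, then returns the cached copy without looking at the disk. */
static int get_rows_cached(int *rows)
{
    static int first = 1;
    static int cached_rows;

    assert(rows != NULL);
    if (first) {
        if (read_rows_from_disk(&cached_rows) != 0)
            return -1;
        first = 0;
    }
    *rows = cached_rows;
    return 0;
}

/* Stands in for an outside program rewriting the file. */
static int write_rows_to_disk(int rows)
{
    FILE *fp = fopen(stale_demo_path, "w");

    if (fp == NULL)
        return -1;
    fprintf(fp, "rows: %d\n", rows);
    return fclose(fp);
}

/* Returns 0 if the cached getter goes stale while the direct read does not. */
int demo_stale_cache(void)
{
    int cached = 0, fresh = 0;

    if (write_rows_to_disk(100) != 0 || get_rows_cached(&cached) != 0)
        return -1;
    if (write_rows_to_disk(400) != 0)   /* region changed on disk */
        return -1;
    if (get_rows_cached(&cached) != 0 || read_rows_from_disk(&fresh) != 0)
        return -1;
    remove(stale_demo_path);
    return (cached == 100 && fresh == 400) ? 0 : 1;
}
```

In a program that exits after one task the cached value never has time to go
stale, which is presumably why the pattern only bites a long-running process
such as an R session.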

Is checking the write time faster than just reading the current window? If
so, G_get_window () could be altered to record its last refresh time, and
if the window had been written since then, re-read. If not, G_get_window ()
should just read anyway. How would this play when the GRASS database is
across a network?
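
As a rough illustration of the write-time idea, the sketch below records the
file's modification time at each read and re-reads only when stat() reports
a different one. It assumes a POSIX system; the struct, the function names
and the one-line file format are hypothetical and not part of libgis. It
does not answer the speed question: it still costs one stat() per call, and
on a network file system the attributes may themselves be cached, so a fresh
write could go unnoticed for a while. Timestamps with one-second resolution
can also miss two writes within the same second.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/* Hypothetical cache record: the value plus the mtime it was read at. */
struct region_cache {
    int valid;      /* 0 until the first successful read */
    time_t mtime;   /* st_mtime of the file when last read */
    int rows;       /* cached value */
};

static int load_rows(const char *path, int *rows)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "rows: %d", rows) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}

/* Re-read only when the file's mtime differs from the recorded one. */
int get_rows_checked(const char *path, struct region_cache *c, int *rows)
{
    struct stat sb;

    assert(path != NULL && c != NULL && rows != NULL);
    if (stat(path, &sb) != 0)
        return -1;
    if (!c->valid || sb.st_mtime != c->mtime) {
        if (load_rows(path, &c->rows) != 0)
            return -1;
        c->mtime = sb.st_mtime;
        c->valid = 1;
    }
    *rows = c->rows;
    return 0;
}

/* Writes the file and pins its mtime, so the demo does not depend on
   the clock resolution of the file system. */
static int store_rows(const char *path, int rows, time_t mtime)
{
    struct utimbuf times;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fprintf(fp, "rows: %d\n", rows);
    if (fclose(fp) != 0)
        return -1;
    times.actime = mtime;
    times.modtime = mtime;
    return utime(path, &times);
}

/* Returns 0 if the getter picks up a rewrite that changed the mtime. */
int demo_mtime_check(void)
{
    const char *path = "demo_wind_mtime.txt";
    struct region_cache c = {0, 0, 0};
    int before = 0, after = 0;

    if (store_rows(path, 100, (time_t)1000000000) != 0 ||
        get_rows_checked(path, &c, &before) != 0)
        return -1;
    if (store_rows(path, 400, (time_t)1000000060) != 0 ||
        get_rows_checked(path, &c, &after) != 0)
        return -1;
    remove(path);
    return (before == 100 && after == 400) ? 0 : 1;
}
```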

The changes I've made are in the modified 5.0.* libes/gis sources within
the R-GRASS interface package. I can run some diffs; most of the changes
are things that don't matter at all if one user is running a single GRASS
program at a time on the database, but do matter when data that were read
and cached are then used stale (now window, mapset, other environment
variables, and some other things too if I remember right). Changing
g.region through system() from the R prompt changes the disk file, but
the loaded G__. stayed blissfully unaware of the change with
G_get_window (); only G__get_window () forced a read.

Roger

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand@nhh.no

Roger Bivand wrote:

> G__init_window () in src/libes/gis/window_map.c does the same as in my
> local copy, that is, it calls G_get_window () rather than G__get_window ().
> What usually causes trouble for long-running (non-exiting) use of libgis.a
> is the number of places where things are saved in a global variable (like
> G__.) and do not get refreshed when the data on disk change. You can see
> it happening in G_get_window () and its use of static int first
> (get_window.c), which uses the cached dbwindow without checking the write
> time of the disk version. My guess would be that I don't lose much time by
> always calling G__get_window (), but this time I hadn't realised that
> G__init_window (), called when opening and closing cells, imposes the
> stale header from G_get_window ().
>
> Is checking the write time faster than just reading the current window? If
> so, G_get_window () could be altered to record its last refresh time, and
> if the window had been written since then, re-read. If not, G_get_window ()
> should just read anyway. How would this play when the GRASS database is
> across a network?

G_get_window() shouldn't be re-reading the WIND file. For the vast
majority of programs, it's essential that repeated calls to
G_get_window() always return the same data.

Changing it for the benefit of a long-running program will break
normal programs.

If you need to force a re-read, add a new function which clears the
G__.window_set flag, so that a subsequent call to G_get_window() will
re-read the WIND file.

That will ensure that existing code doesn't break if the WIND file
changes during execution.

--
Glynn Clements <glynn@gclements.plus.com>

On Wed, 19 Jan 2005, Glynn Clements wrote:

> G_get_window() shouldn't be re-reading the WIND file. For the vast
> majority of programs, it's essential that repeated calls to
> G_get_window() always return the same data.
>
> Changing it for the benefit of a long-running program will break
> normal programs.
>
> If you need to force a re-read, add a new function which clears the
> G__.window_set flag, so that a subsequent call to G_get_window() will
> re-read the WIND file.
>
> That will ensure that existing code doesn't break if the WIND file
> changes during execution.

Thanks. I'm reviewing my code against the 6.0.0beta lib/gis to see how
some small functions like this that the R interface needs could be merged
back. Something like this could be put in unconditionally too.

Roger
