[GRASS5] CygWIN: MinGW compilation of R/GRASS interface

Mike, friends,

I'm bringing this back to grass5 to report on progress and to ask for
hints. R package building automation prefers to have the complete set of
files needed to build the package shared library in the source package -
reducing dependencies on library installation. So now I have a subset of
src/libes/gis/*.c and their *.h, together with some zlib and xdr headers
taken from the R src distribution in the interface package. It runs
correctly on Linux (RH 7.2), and handles files in the same way as the
native GRASS programs.

Building the package under MinGW also succeeds, with a few #ifdef's to
choose for example rename() instead of link(), and a fix to get round the
lack of getuid(). The remaining problems looked solvable but now I'm not
sure.

The scenario is that MinGW R, with the MinGW-built GRASS package loaded,
is running inside a GRASS session under Cygwin. Cygwin and MinGW see the
file system differently, but this can be corrected by prepending the
appropriate string to the values of $GISRC and $GISDBASE in env.c (not
touching .grassrc5, which allows e.g. system("r.mapcalc ...") to carry on
running). There is a case for some user-level access to set the ENV
structure, or at least to retrieve the values present there. The second
issue was that MinGW (and thus R and the built GRASS package) writes text
files with CR/NL pairs, not just NL, which makes reading everything, say
f_format files or site_lists files, fail. This is also fixable by
setting the mode of opened files to O_BINARY, or adding a "b" to the
fopen() mode. Given this, the interface works for metadata and sites data.
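
A minimal sketch of the path fix described above, assuming the Cygwin root
as seen from Windows is something like "c:/cygwin". The helper name
prefix_cygwin_path and its buffer handling are illustrative only and are
not taken from env.c:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prepend the Windows-side Cygwin root (for example
 * "c:/cygwin") to an absolute Cygwin path such as "/home/user/grassdata".
 * Returns 0 on success, -1 if the result does not fit in dst. */
int prefix_cygwin_path(char *dst, size_t dstlen,
                       const char *root, const char *path)
{
    assert(dst != NULL && root != NULL && path != NULL);

    /* snprintf returns the length it wanted to write, so truncation
     * shows up as a return value >= dstlen. */
    int n = snprintf(dst, dstlen, "%s%s", root, path);
    return (n >= 0 && (size_t)n < dstlen) ? 0 : -1;
}
```

Called with root "c:/cygwin" and path "/home/user/grassdata", the buffer
would hold "c:/cygwin/home/user/grassdata", which is the form a MinGW
program can open. The values in .grassrc5 itself stay untouched.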

Today's big disappointment is however that cell and fcell files are being
written wrongly. They are slightly larger than the "correct" ones. The
scenario is that R also uses xdr and zlib functions - on Linux the same
ones that GRASS finds during configure. Consequently, they (or some
version of them) are already "in" the R.dll that the GRASS.dll is
dynamically loaded into, so all the dependencies and references are
satisfied. Under Linux they appear to generate the same file content as
Cygwin GRASS - locations are mutually exchangeable. My intuition is that
they do work cross-platform because R save objects (first xdr, then zlib)
do transfer from Win/MinGW to Linux, but the GRASS xdr/zlib don't - that
is, Cygwin GRASS programs can't read what the MinGW R/GRASS interface has
written. The MinGW R/GRASS interface can't read them either, but
curiously, it reads the zlib'ed, xdr'ed FCELL layers created by GRASS
itself on whatever platform!

Because compression and XDR happen at the same places in the code for
writing FCELLs, I'd like to ask 1) if compression of cell and/or fcell
files can be turned off? 2) Beyond the #define DEBUG statements in
libes/gis files, does anyone have any experience of debugging XDR - if I
could turn off compression, I would have a chance of seeing if the problem
is incompatibility in zlib across platforms, in xdr across platforms, or
something else getting into the files (more ^M??); I would write a small
program just to read the fcell file to check its contents against the
values being written. My suspicions of a residual ^M or similar problem
are reinforced by CELL rasters also failing, and they don't use xdr or
zlib (do they?).
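
A small checking program of the kind mentioned above could start from a
decoder like the following sketch. It assumes only what the XDR standard
specifies, namely that a single-precision float is stored as 4 bytes of
big-endian IEEE 754. The function name xdr_bytes_to_float is made up for
illustration, and the layout of an fcell file around the values (row
pointers, compression flags) is not covered here:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode one XDR single-precision float: 4 bytes, big-endian IEEE 754.
 * Assumes the host float is a 32-bit IEEE 754 value. */
float xdr_bytes_to_float(const unsigned char b[4])
{
    /* Assemble the 32-bit pattern most significant byte first. */
    uint32_t bits = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                    ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    float f;

    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);   /* reinterpret the bits without aliasing issues */
    return f;
}
```

Feeding it the bytes of an uncompressed row and comparing against the
values passed to the writing routine would show whether the damage is in
the encoded values themselves or in extra bytes between them.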

The aim is to make an interface package that passes the standard R tests,
and can be distributed and updated automatically - although keeping the
subset of libes/gis files in sync will probably have to be done manually
(the changes are non-invasive with #ifdef's).

Grateful for any suggestions about where to look!

Roger

On Mon, 10 Feb 2003, Mike Thomas wrote:

Hi Roger, Glynn, Markus.

> > Mike Thomas <miketh@brisbane.paradigmgeo.com> has recently started
> > work on getting GRASS to compile with MinGW; however, I don't know how
> > far advanced this work is.

I compiled substantial portions of Grass using MinGW32 hosted under MSYS for
the configuration and build tools, but in doing so skirted over a number of
problems in libgis, most notably the socket communications stuff and the
database locking. My initial aim is to get libgis built with whatever
parts are needed for external language/library bindings.

The reason for that limited initial aim is that Grass is tightly bound to
shell scripts which would all need to be supplied under Windows presumably
by Cygwin or by a complete rewrite of the scripting glue as batch files or
Tcl or whatever - a rather off-putting task.

I am very tight for time at the moment due to the impending release of a
Geolog beta at work and of Maxima and GCL at home, so I am not devoting the
time that Grass deserves at present, I'm sorry to say.

> Since the interface only needs a subset of libgis.a and really doesn't
> need libdatetime.a, it is possible that the files in src/libes/gis and
> src/libes/datetime could be put within #ifndef RGRASS_INTERFACE ... #endif,
> using an #else /* RGRASS_INTERFACE */ to choose just the functions and
> variables needed - to permit a subset of the files and headers to be
> distributed with the package. I think this should be updated manually -
> there is no good reason why the main source should be altered. I would
> welcome advice on whether the CygWin GRASS / R / R interface user
> community is sufficiently large for it to be sensible to use time on this!

If you could tell me, Roger, which parts of libgis you need, I'll make sure
they are prioritised as I think that the Grass/R interface is a very worthy
cause. I suspect that you might already be OK unless the above two problem
areas are important.

Cheers

Mike Thomas.

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand@nhh.no

On Thu, Feb 13, 2003 at 10:58:10PM +0100, Roger Bivand wrote:

> Today's big disappointment is however that cell and fcell files are being
> written wrongly. They are slightly larger than the "correct" ones. The
> scenario is that R also uses xdr and zlib functions - on Linux the same
> ones that GRASS finds during configure. Consequently, they (or some
> version of them) are already "in" the R.dll that the GRASS.dll is
> dynamically loaded into, so all the dependencies and references are
> satisfied. Under Linux they appear to generate the same file content as
> Cygwin GRASS - locations are mutually exchangeable. My intuition is that
> they do work cross-platform because R save objects (first xdr, then zlib)
> do transfer from Win/MinGW to Linux, but the GRASS xdr/zlib don't - that
> is, Cygwin GRASS programs can't read what the MinGW R/GRASS interface has
> written. The MinGW R/GRASS interface can't read them either, but
> curiously, it reads the zlib'ed, xdr'ed FCELL layers created by GRASS
> itself on whatever platform!
>
> Because compression and XDR happen at the same places in the code for
> writing FCELLs, I'd like to ask 1) if compression of cell and/or fcell
> files can be turned off? 2) Beyond the #define DEBUG statements in
> libes/gis files, does anyone have any experience of debugging XDR - if I
> could turn off compression, I would have a chance of seeing if the problem
> is incompatibility in zlib across platforms, in xdr across platforms, or
> something else getting into the files (more ^M??); I would write a small
> program just to read the fcell file to check its contents against the
> values being written. My suspicions of a residual ^M or similar problem
> are reinforced by CELL rasters also failing, and they don't use xdr or
> zlib (do they?).

New floating point rasters are created/opened with creat(). There aren't
any flag arguments, so it'd have to be changed to use open() iff
carriage returns are the problem and the O_BINARY is needed to change
that. If you redirect all G_open_fp_cell_new() to
G_open_fp_cell_new_uncompressed() in opencell.c, then all new FP
rasters should be created as "uncompressed". The default is pretty much
hardwired to use compression.
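
A sketch of the creat() to open() change, for illustration only. It
relies on creat(path, mode) being equivalent to open() with
O_WRONLY | O_CREAT | O_TRUNC, and defines O_BINARY as 0 where the platform
has no such flag. The names creat_binary and write_all_binary are
hypothetical and do not appear in opencell.c:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_BINARY
#define O_BINARY 0   /* Unix has no text/binary distinction */
#endif

/* Hypothetical stand-in for creat(): same semantics, plus O_BINARY so
 * that no LF -> CRLF translation happens where the runtime supports it. */
int creat_binary(const char *path, mode_t mode)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, mode);
}

/* Write n bytes through a binary-mode descriptor; returns 0 on success. */
int write_all_binary(const char *path, const void *buf, size_t n)
{
    int fd = creat_binary(path, 0666);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, buf, n);
    int rc = close(fd);
    return (written == (ssize_t)n && rc == 0) ? 0 : -1;
}
```

With a descriptor opened this way, a buffer containing a newline byte
should reach the disk with exactly the length that was passed to write().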

I suspect neither XDR nor ZLIB are at fault...

Luck.
--
echo ">gra.fcw@2ztr< eryyvZ .T pveR" | rot13 | reverse

On Thu, 13 Feb 2003, Eric G. Miller wrote:

> New floating point rasters are created/opened with creat(). There aren't
> any flag arguments, so it'd have to be changed to use open() iff
> carriage returns are the problem and the O_BINARY is needed to change
> that. If you redirect all G_open_fp_cell_new() to
> G_open_fp_cell_new_uncompressed() in opencell.c, then all new FP
> rasters should be created as "uncompressed". The default is pretty much
> hardwired to use compression.
>
> I suspect neither XDR nor ZLIB are at fault...

So did I, but there were too many "things" going on there. Eric's
intuition was quite right - setting the fd and null_fd in opencell.c to
mode O_BINARY fixed the issue. This will apply to the whole MinGW
porting problem: files written natively under Windows need to be written
and read in binary mode to make things portable.
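
Why the files came out "slightly larger" can be illustrated with a small
simulation of text-mode output, in which every LF byte (0x0A) is written
as CR LF (0x0D 0x0A). This is a sketch of the effect, not the MinGW
runtime's actual code, and the name expand_lf is invented for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Simulate a text-mode write: each LF (0x0A) in src is emitted as
 * CR LF (0x0D 0x0A). Returns the number of bytes produced, or 0 if dst
 * is too small (an empty input also yields 0). */
size_t expand_lf(const unsigned char *src, size_t n,
                 unsigned char *dst, size_t dstlen)
{
    assert(src != NULL && dst != NULL);

    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        size_t need = (src[i] == 0x0A) ? 2 : 1;
        if (out + need > dstlen)
            return 0;
        if (src[i] == 0x0A)
            dst[out++] = 0x0D;   /* the extra byte that corrupts binary data */
        dst[out++] = src[i];
    }
    return out;
}
```

For instance, the 4-byte big-endian integer 10 (00 00 00 0A) becomes the
5 bytes 00 00 00 0D 0A, so any raster row that happens to contain a 0x0A
byte grows and every following offset shifts.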

Could any brave person willing to test the interface (preferably on a
fresh Cygwin, fresh GRASS for Cygwin, and fresh R 1.6.2 with some extra
packages - akima at least) contact me, and I'll let you have a link to
where I'll post the MinGW R/GRASS interface binary package? At the moment
the Cygwin prefix is hardwired to "c:/cygwin" - this will be made
user-settable soon.

Roger

> Luck.

Yes, indeed! Luck plus grass5 means results!

Roger Bivand wrote:

> Grateful for any suggestions about where to look!

My initial suspicion is that MinGW's versions of the Unix I/O
functions (open, creat, read, write) perform LF<->CRLF conversion.

On Unix, these functions don't do any conversion.

On Cygwin, they may or may not do conversion, depending upon more
factors than I care to list here. However, if conversion gets
performed, you get a corrupted database.
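
One crude way to check a suspect file for such conversion is to count
CR LF pairs in it. The sketch below assumes nothing about the GRASS file
formats; the function name count_crlf is made up, and since binary data
can legitimately contain the byte pair 0D 0A, a non-zero count is only a
hint, best compared against a known-good copy of the same file:

```c
#include <assert.h>
#include <stdio.h>

/* Count CR LF (0x0D 0x0A) pairs in a file read in binary mode.
 * Returns the number of pairs, or -1 if the file cannot be opened.
 * If size_out is non-NULL it receives the file size in bytes. */
long count_crlf(const char *path, long *size_out)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");   /* "b" matters on Windows, harmless elsewhere */
    if (fp == NULL)
        return -1;

    long pairs = 0, size = 0;
    int c, prev = EOF;
    while ((c = getc(fp)) != EOF) {
        if (prev == '\r' && c == '\n')
            pairs++;
        prev = c;
        size++;
    }
    fclose(fp);

    if (size_out != NULL)
        *size_out = size;
    return pairs;
}
```

If the file written under MinGW is larger than the reference copy by
roughly the number of extra pairs found, newline conversion is the likely
culprit.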

There is some stuff at the bottom of MinGW's fcntl.h (_fmode,
_setmode()) that looks as if it might be relevant. However, I don't
have any documentation for MinGW, only the headers.

--
Glynn Clements <glynn.clements@virgin.net>

Hi all.

> There is some stuff at the bottom of MinGW's fcntl.h (_fmode,
> _setmode()) that looks as if it might be relevant.

These are indeed the animals you need. For example:

#ifdef _WIN32
_fmode = _O_BINARY;
#endif

Cheers

Mike Thomas

Hi,

I've submitted GRASS_0.2-2 to CRAN, thanks for your help. One problem that
I hit is that strlen() in G_store() usually crashes (MinGW and gcc RH
2.96) when passed a NULL string - a test might be an idea anyway in
libes/gis/store.c. I've put a full list of diffs, the source package, and
a Windows binary build on http://spatial.nhh.no/R/GRASS/index.html

The only additional functions are in libes/gis/env.c, to get and set the
init flag in open_env(). Beyond that, I've used the logic of Frank
Warmerdam's gisinit2, and make_loc from libgrass to create a temporary
GISDBASE to run examples in.

Roger

Roger Bivand wrote:

> I've submitted GRASS_0.2-2 to CRAN, thanks for your help. One problem that
> I hit is that strlen() in G_store() usually crashes (MinGW and gcc RH
> 2.96) when passed a NULL string - a test might be an idea anyway in
> libes/gis/store.c.

If you know of any specific cases where NULL might be passed to
G_store(), please report them, as they may indicate that a "not-NULL"
check should be added to the caller.

--
Glynn Clements <glynn.clements@virgin.net>

On Tue, 18 Feb 2003, Glynn Clements wrote:

> Roger Bivand wrote:
>
> > I've submitted GRASS_0.2-2 to CRAN, thanks for your help. One problem that
> > I hit is that strlen() in G_store() usually crashes (MinGW and gcc RH
> > 2.96) when passed a NULL string - a test might be an idea anyway in
> > libes/gis/store.c.
>
> If you know of any specific cases where NULL might be passed to
> G_store(), please report them, as they may indicate that a "not-NULL"
> check should be added to the caller.

In my R_G_init.c, in function R_G_get_gisrc_file()

      gisrc = G_store(G__get_gisrc_file());

where G__get_gisrc_file() is in my modified env.c:

char *G__get_gisrc_file (void)
{
    FILE *fd;

    if (!gisrc) {
        fd = open_env("r");
        if (fd == NULL) {
            G_warning("Failure opening GISRC file");
            G__set_gisrc_file(NULL);
            return gisrc;
        }
        fclose(fd);
    }

    return gisrc;
}

and the original is:

char *G__get_gisrc_file (void)
{
    return gisrc;
}

with:

static char *gisrc = NULL;

in both cases. Maybe it doesn't happen anywhere else, but I didn't expect
strlen() to be vulnerable - I thought it would give some sensible return
value (-1 is logical) when handed a NULL string.

Roger

Roger Bivand wrote:

> > I've submitted GRASS_0.2-2 to CRAN, thanks for your help. One problem that
> > I hit is that strlen() in G_store() usually crashes (MinGW and gcc RH
> > 2.96) when passed a NULL string - a test might be an idea anyway in
> > libes/gis/store.c.
>
> If you know of any specific cases where NULL might be passed to
> G_store(), please report them, as they may indicate that a "not-NULL"
> check should be added to the caller.
>
> In my R_G_init.c, in function R_G_get_gisrc_file()
>
>       gisrc = G_store(G__get_gisrc_file());

For now, I suggest:

  gisrc = G__get_gisrc_file();
  if (gisrc) gisrc = G_store(gisrc);

If this was likely to be common, it wouldn't hurt to add:

  if (!s) return s;

to the beginning of G_store().

> in both cases. Maybe it doesn't happen anywhere else, but I didn't expect
> strlen() to be vulnerable - I thought it would give some sensible return
> value (-1 is logical) when handed a NULL string.

strcpy() fails similarly, as do most of the string.h functions. BTW,
strdup() (which does the same thing as G_store()) also segfaults when
passed a NULL pointer.
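
For illustration, a NULL-tolerant copy routine along the lines suggested
above might look like the following sketch. The name store_or_null is
hypothetical and the code uses plain malloc(); it is not the G_store()
source and makes no claim about which allocator G_store() uses:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical NULL-tolerant variant of a G_store()-style helper:
 * returns a malloc'ed copy of s, or NULL if s is NULL or the
 * allocation fails. The caller frees the result. */
char *store_or_null(const char *s)
{
    if (s == NULL)
        return NULL;   /* strlen(NULL) is undefined behaviour, so test first */

    size_t len = strlen(s) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

The trade-off is the one raised in the thread: a check inside the helper
hides the NULL from the caller, whereas a check at the call site keeps
the unexpected NULL visible where it arises.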

--
Glynn Clements <glynn.clements@virgin.net>

Dear Roger,

Here is a first test report:
The installation of the new GRASS/R interface according to
http://grass.itc.it/statsgrass/grass_r_insthints.html

[for those interested, it's simply

install.packages("GRASS")

]

works 100% ok on Linux.

Congratulations,

Markus
