[GRASS-dev] winGRASS: debugging vector db issues

Hello Martin,

Markus tells me that you might participate in trying to debug the vector
db problems we are having in winGRASS.

Paul and I have both failed to find the reason up to now. It seems to me
that the (or at least the first...) problem is with the table description,
and within that most probably in the column description (
db__recv_*_definition in xdrtable.c and xdrcolumn.c in lib/db/dbmi_base/).
I don't know if Paul has come to the same conclusion.

For me the next steps are:

- checking whether the same happens with the other drivers (we have only
tested dbf so far)
- narrowing down the problem even further

I hope we can squash this one soon.

Moritz

Hi Moritz,

I promised Markus I would look at this issue, but it will take time. BTW, I
have never installed GRASS under MS Windows. In any case, I can try to solve
the problem. Not sure about the result ;-)

Martin

--
Martin Landa <landa.martin@gmail.com> * http://gama.fsv.cvut.cz/~landa *

On Wed, February 14, 2007 02:49, Moritz Lennert wrote:

> Hello Martin,
>
> Markus tells me that you might participate in trying to debug the vector
> db problems we are having in winGRASS.
>
> Paul and I have both failed to find the reason up to now. It seems to me
> that the (or at least the first...) problem is with the table description,
> and within that most probably in the column description (
> db__recv_*_definition in xdrtable.c and xdrcolumn.c in lib/db/dbmi_base/).
> I don't know if Paul has come to the same conclusion.

I think I have been able to get closer to the exact point of the problem.

Using db.select as the test case, the error seems to happen around a call
to xdr_int() in lib/db/dbmi_base/xdrstring.c.

Below you can see the debug output. All lines marked with 'Moritz:' are
my debugging statements. As you can see, the problem seems to occur on the
next attempt to read a string after the first column name has been read.
This happens either on the second column name, if there is more than one
column, or on the attempt to read the table name in xdrtable.c after
having read the (only) column name.

The problem is in line 105 of lib/db/dbmi_base/xdrstring.c:

if(!xdr_int (&xdrs, &len) || len <= 0) /* len will include the null byte */

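In this case len is 0, so the condition is true; judging by the debug
output below, this is apparently what triggers the "Protocol error".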

IIUC, xdr_int() is an internal function of the XDR library, so maybe there
are some specificities with the Windows version of this library which we
need to take into account ?

Moritz

****Debug output******
GRASS 6.3.cvs (BELGIQUE):C:\grass\bin >db.select communes
D2/3: opendir c:/grass/grass-6.3.cvs\driver\db\

D2/3: opendir c:/grass/grass-6.3.cvs\driver\db\

D2/3: Moritz: before xdr_int
D2/3: Moritz: len = 38
D2/3: Moritz: before xdr_int in xdrstring.c
D2/3: Moritz: len = 1
D2/3: DBF: db__driver_open_database() name = '$GISDBASE/$LOCATION_NAME/$MAPSET/dbf/'
D3/3: tokens[0] = $GISDBASE
D3/3: -> C:/GRASSDATA
D3/3: tokens[1] = $LOCATION_NAME
D3/3: -> BELGIQUE
D3/3: tokens[2] = $MAPSET
D3/3: -> PERMANENT
D3/3: tokens[3] = dbf
D2/3: db.name = C:/GRASSDATA/BELGIQUE/PERMANENT/dbf/
D2/3: add_table(): table = communes name = communes.dbf
D2/3: add_table(): table = ssbel01 name = ssbel01.dbf
D2/3: Moritz: beginning of sel()
D2/3: Moritz: before xdr_int in xdrstring.c
D2/3: Moritz: len = 23
D3/3: SQL statement parsed successfully: select * from communes
D2/3: find_table(): table = communes
D2/3: ? communes
D2/3: load_table_head(): tab = 0, C:/GRASSDATA/BELGIQUE/PERMANENT/dbf//communes.dbf
D2/3: ncols = 3
D2/3: DBFFieldType 1
D3/3: add_column(): tab = 0, type = 2, name = COMCOD, width = 11, decimals = 0
D2/3: DBFFieldType 1
D3/3: add_column(): tab = 0, type = 2, name = COMCOD2, width = 11, decimals = 0
D2/3: DBFFieldType 1
D3/3: add_column(): tab = 0, type = 2, name = COMCOD3, width = 11, decimals = 0
D3/3: Doing SQL command <4> on DBF table... (see include/sqlp.h)
D2/3: SELECT
D2/3: sel(): tab = 0
D2/3: load_table(): tab = 0
D2/3: ncols = 3 nrows = 622
D2/3: load_table_head(): tab = 0, C:/GRASSDATA/BELGIQUE/PERMANENT/dbf//communes.dbf
D2/3: Moritz xdrtable.c: Number of columns in table = 3
D2/3: Before DB_RECV_COLUMN_DEFINITION in xdrtable.c - col = 0
D2/3: Moritz: before DB_RECV_STRING 1 in xdrcolumns.c
D2/3: Moritz: before xdr_int in xdrstring.c
D2/3: Moritz: len = 7
D2/3: col = COMCOD
D2/3: Moritz: before DB_RECV_STRING 2 in xdrcolumns.c
D2/3: Moritz: before xdr_int in xdrstring.c
D2/3: Moritz: len = 1
D2/3: After DB_RECV_COLUMN_DEFINITION in xdrtable.c
D2/3: Before DB_RECV_COLUMN_DEFINITION in xdrtable.c - col = 1
D2/3: Moritz: before DB_RECV_STRING 1 in xdrcolumns.c
D2/3: Moritz: before xdr_int in xdrstring.c
D2/3: Moritz: in xdr_int in xdrstring.c
len = 0
dbmi: Protocol error
^^^^^^^^^^^^^^^^^^^^
D2/3: Moritz: after db_open_select_cursor()
D2/3: Moritz: after sel()
dbmi: Protocol error
D2/3: dbmi: Protocol error
save_table 0
D2/3: save_table 1
dbmi: Protocol error
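
The check quoted from xdrstring.c reads a length first and only then the string body. As a rough illustration of that length-prefixed pattern, here is a minimal sketch that assumes a 4-byte big-endian length (the layout XDR uses for integers) and works on an in-memory buffer instead of a real XDR stream. The names decode_be32 and recv_string are hypothetical; they are not GRASS or XDR functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a 4-byte big-endian integer (the layout XDR uses for int). */
int32_t decode_be32(const unsigned char *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return (int32_t)v;
}

/*
 * Hypothetical stand-in for a "receive string" routine: read the length
 * prefix, reject len <= 0 (the length is assumed to include the
 * terminating null byte), then copy len bytes into out.
 * Returns 0 on success, -1 on a protocol error.
 */
int recv_string(const unsigned char *buf, size_t buflen,
                char *out, size_t outlen)
{
    int32_t len;

    assert(buf != NULL && out != NULL);

    if (buflen < 4)
        return -1;              /* not even a complete length prefix */
    len = decode_be32(buf);
    if (len <= 0)
        return -1;              /* same kind of check as in the thread */
    if ((size_t)len > buflen - 4 || (size_t)len > outlen)
        return -1;              /* body truncated or output too small */
    memcpy(out, buf + 4, (size_t)len);
    if (out[len - 1] != '\0')
        return -1;              /* string must be null-terminated */
    return 0;
}
```

With this layout, the column name COMCOD would arrive as a length of 7 followed by the six characters and a null byte, which matches the "len = 7" line in the debug output. A length of 0 can never be valid, because even an empty string carries its null byte.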

On Tue, February 20, 2007 12:34, Moritz Lennert wrote:

> IIUC, xdr_int() is an internal function of the XDR library, so maybe there
> are some specificities with the Windows version of this library which we
> need to take into account ?

Some more info:

Both with the dbf driver and the PostgreSQL driver, I cannot create a
table with db.execute. Again, this seems to hang at a call to xdr_int.

Moritz

On Tue, February 20, 2007 12:34, Moritz Lennert wrote:

> IIUC, xdr_int() is an internal function of the XDR library, so maybe there
> are some specificities with the Windows version of this library which we
> need to take into account ?

Some more info. I found the following note on the qgis wiki where the code
of the xdr library comes from
(http://wiki.qgis.org/qgiswiki/BuildingWindowsBinaryOnLinux):

"TODO: Use DLL. Currently if DLL is used db drivers do not work because of
'\n' conversion (text mode expected). Find out how to force xdrlib to
expect binary mode when compiled as DLL."

Could this be the cause of our problems ? Paul, you patched the xdr code,
replacing bzero with memset and bcopy with memmove. Does this have
anything to do with the above ?

Moritz
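
To see why '\n' conversion can matter for a binary protocol, consider what a text-mode stream does to an encoded integer that happens to contain the byte 0x0A. The sketch below only simulates the LF to CR LF expansion in memory. The helper names are hypothetical, and nothing here proves that this is the actual cause of the len = 0 failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v as 4 big-endian bytes, the layout XDR uses for integers. */
void encode_be32(uint32_t v, unsigned char *p)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Decode 4 big-endian bytes back into an integer. */
uint32_t decode_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Simulate a text-mode write: every LF (0x0A) becomes CR LF (0x0D 0x0A).
 * out must have room for 2 * n bytes. Returns the number of bytes produced.
 */
size_t text_mode_expand(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i, j = 0;

    assert(in != NULL && out != NULL);

    for (i = 0; i < n; i++) {
        if (in[i] == 0x0A)
            out[j++] = 0x0D;    /* inserted carriage return */
        out[j++] = in[i];
    }
    return j;
}
```

Encoding the value 10 gives the bytes 00 00 00 0A. After expansion the stream is 00 00 00 0D 0A, so a reader that takes the first four bytes sees 13 instead of 10, and every later field is shifted by one byte.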

On Tue, February 20, 2007 13:45, Moritz Lennert wrote:

> Both with the dbf driver and the PostgreSQL driver, I cannot create a
> table with db.execute. Again, this seems to hang at a call to xdr_int.

I have to take that back. I can create tables if I use

echo create table test (cat int, name varchar(10)) | db.execute

However, using db.execute input= doesn't work (it just hangs, without any
error message).

Moritz

On Sun, 25 Feb 2007, Moritz Lennert wrote:

> Some more info. I found the following note on the qgis wiki where the code
> of the xdr library comes from
> (http://wiki.qgis.org/qgiswiki/BuildingWindowsBinaryOnLinux):
>
> "TODO: Use DLL. Currently if DLL is used db drivers do not work because of
> '\n' conversion (text mode expected). Find out how to force xdrlib to
> expect binary mode when compiled as DLL."
>
> Could this be the cause of our problems ? Paul, you patched the xdr code,
> replacing bzero with memset and bcopy with memmove. Does this have
> anything to do with the above ?

Moritz, WELL DONE! Using a static libxdr.a with nothing changed from the link Glynn posted (sorry can't remember where I downloaded it from) makes it work. Obviously we still need to work out why, but for now I've attached a working libxdr.a to this mail. If you delete libxdr.dll, replace it with libxdr.a, make clean, re-configure and re-compile I hope it will work for you too.
And then we're well on our way to having an Alpha-version non-Msys Windows binary distribution!

Paul

(attachments)

libxdr.a (14.1 KB)

Moritz Lennert wrote:

> IIUC, xdr_int() is an internal function of the XDR library, so maybe there
> are some specificities with the Windows version of this library which we
> need to take into account ?

> Some more info. I found the following note on the qgis wiki where the code
> of the xdr library comes from
> (http://wiki.qgis.org/qgiswiki/BuildingWindowsBinaryOnLinux):
>
> "TODO: Use DLL. Currently if DLL is used db drivers do not work because of
> '\n' conversion (text mode expected). Find out how to force xdrlib to
> expect binary mode when compiled as DLL."
>
> Could this be the cause of our problems ?

EOL issues could be a problem.

AFAICT, each executable has to be linked with $(FMODE_OBJ)
(lib/gis/OBJ.<arch>/fmode.o) to force MSVCRT (open() etc) to operate
in binary mode (the default is to translate LF<->CRLF). Apparently,
this has to go into the executable; putting it into a DLL won't work.

An alternative is to redefine O_RDONLY/O_WRONLY/O_RDWR to include the
O_BINARY flag, e.g.:

  #define O_RDONLY (_O_RDONLY | _O_BINARY)
  #define O_WRONLY (_O_WRONLY | _O_BINARY)
  #define O_RDWR (_O_RDWR | _O_BINARY)

That won't work with stdin/stdout/stderr, though.

--
Glynn Clements <glynn@gclements.plus.com>
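
A per-call alternative to redefining the O_* macros is to pass O_BINARY explicitly and, for the standard streams, to switch their mode at startup. The sketch below assumes the MSVCRT functions _setmode and _fileno on Windows and is a no-op for the standard streams elsewhere. The names open_binary and set_std_streams_binary are hypothetical and not part of GRASS; whether this approach fits the GRASS build is not something the thread settles.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

#ifdef _WIN32
#include <io.h>         /* open, _setmode */
#else
#include <unistd.h>     /* read, write, close */
#endif

/* POSIX systems have no text/binary distinction, so O_BINARY is 0 there. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Open a file with binary mode forced on, whatever the CRT default is. */
int open_binary(const char *path, int flags, int mode)
{
    assert(path != NULL);
    return open(path, flags | O_BINARY, mode);
}

/* Put stdin and stdout into binary mode. Returns 0 on success, -1 on failure. */
int set_std_streams_binary(void)
{
#ifdef _WIN32
    if (_setmode(_fileno(stdin), _O_BINARY) == -1)
        return -1;
    if (_setmode(_fileno(stdout), _O_BINARY) == -1)
        return -1;
#endif
    return 0;
}
```

A caller would use open_binary() wherever it currently calls open(), and call set_std_streams_binary() once near the start of main(), before any data passes through the standard streams.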

Paul Kelly wrote:

> Some more info. I found the following note on the qgis wiki where the code
> of the xdr library comes from
> (http://wiki.qgis.org/qgiswiki/BuildingWindowsBinaryOnLinux):
>
> "TODO: Use DLL. Currently if DLL is used db drivers do not work because of
> '\n' conversion (text mode expected). Find out how to force xdrlib to
> expect binary mode when compiled as DLL."
>
> Could this be the cause of our problems ? Paul, you patched the xdr code,
> replacing bzero with memset and bcopy with memmove. Does this have
> anything to do with the above ?

> Moritz, WELL DONE! Using a static libxdr.a with nothing changed from the
> link Glynn posted (sorry can't remember where I downloaded it from)

Ah. I solved the _fmode issues locally by patching my MinGW headers,
specifically the O_* definitions in <fcntl.h>:

  #define O_RDONLY (_O_RDONLY | _O_BINARY)
  #define O_WRONLY (_O_WRONLY | _O_BINARY)
  #define O_RDWR (_O_RDWR | _O_BINARY)

If libxdr.a is the one which I built, it will probably have binary I/O
hard-coded into it. We still need a more general solution, though.
AFAICT, every executable should get linked against $(FMODE_OBJ)
automatically on Windows; the rules in Module.make use it, and
Makefiles which have their own linking rules normally list it
explicitly.

--
Glynn Clements <glynn@gclements.plus.com>

On Mon, 26 Feb 2007, Glynn Clements wrote:

> Paul Kelly wrote:
>
> > Moritz, WELL DONE! Using a static libxdr.a with nothing changed from the
> > link Glynn posted (sorry can't remember where I downloaded it from)

It was from the link posted here:
http://grass.itc.it/pipermail/grass-dev/2006-October/026857.html
to xdr-4.0-mingw2.tar.gz

> Ah. I solved the _fmode issues locally by patching my MinGW headers,
> specifically the O_* definitions in <fcntl.h>:
>
>   #define O_RDONLY (_O_RDONLY | _O_BINARY)
>   #define O_WRONLY (_O_WRONLY | _O_BINARY)
>   #define O_RDWR (_O_RDWR | _O_BINARY)
>
> If libxdr.a is the one which I built, it will probably have binary I/O
> hard-coded into it. We still need a more general solution, though.
> AFAICT, every executable should get linked against $(FMODE_OBJ)
> automatically on Windows; the rules in Module.make use it, and
> Makefiles which have their own linking rules normally list it
> explicitly.

No, the libxdr.a used was one I built myself from the source at the above link. I notice it contains (in xdr_stdio.c):

#if defined(__CYGWIN32__) || defined(__MINGW32__)
#include <stdlib.h>
#include <fcntl.h>
unsigned int _CRT_fmode = _O_BINARY;
#endif

which probably explains why it works when compiled statically. But then all parts of GRASS should be linking against $(FMODE_OBJ) - perhaps parts of the database drivers aren't though? Will hopefully have time to look into it later.
Also attached is the diff of the changes I made to the above XDR package to get it to compile as a DLL.

Paul

(attachments)

xdr-diff.txt (3.52 KB)

On 26/02/07 12:51, Paul Kelly wrote:

> which probably explains why it works when compiled statically. But then
> all parts of GRASS should be linking against $(FMODE_OBJ) - perhaps
> parts of the database drivers aren't though? Will hopefully have time to
> look into it later.

AFAICT, all the db/drivers do, but not the db libs:

mlennert@geog-pc40:~/SRC/GRASS/CVS/grass6/lib/db$ grep -RI FMODE_OBJ *
mlennert@geog-pc40:~/SRC/GRASS/CVS/grass6/lib/db$

Should they ?

Moritz

Moritz Lennert wrote:

> which probably explains why it works when compiled statically. But then
> all parts of GRASS should be linking against $(FMODE_OBJ) - perhaps
> parts of the database drivers aren't though? Will hopefully have time to
> look into it later.

> AFAICT, all the db/drivers do, but not the db libs:
>
> mlennert@geog-pc40:~/SRC/GRASS/CVS/grass6/lib/db$ grep -RI FMODE_OBJ *
> mlennert@geog-pc40:~/SRC/GRASS/CVS/grass6/lib/db$
>
> Should they ?

There's no point adding it to a library; it has to go into the
executable.

--
Glynn Clements <glynn@gclements.plus.com>
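
Since the default mode is a property of each executable, one quick way to test a given build is to write a single '\n' through a stream opened without an explicit mode flag and count the bytes that reach the disk. The sketch below does that with standard C only. The function name and the scratch file path are hypothetical, and it is a diagnostic aid under those assumptions, not part of GRASS.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write one '\n' to path using the given fopen mode ("w" uses the CRT
 * default, "wb" forces binary), then reopen the file in binary mode and
 * count the bytes on disk. The scratch file is removed afterwards.
 * Returns the byte count, or -1 on error.
 * 1 means no translation happened; 2 means LF was expanded to CR LF.
 */
long bytes_written_for_newline(const char *path, const char *mode)
{
    FILE *fp;
    long count = 0;

    assert(path != NULL && mode != NULL);

    fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    if (fputc('\n', fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");     /* binary read: count the raw bytes */
    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);
    remove(path);
    return count;
}
```

In an executable where the binary-mode default took effect, calling it with "w" should return 1; a result of 2 would suggest that the executable is still running in text mode.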