[GRASS-dev] Some progress on Win32 attribute deadlock problem

Dear GRASS friends,

I know it is getting boring, but this problem with attribute data on
Win32 is totally wrecking the show for me, so I took a few more hours of
tedious Win32 (brrrr!) debugging to see what I can find.

Again, I used this data set:

ftp://ftp.rz.uni-kiel.de/pub/ufg/qgis-test.zip

and v.out.ogr to export all points (ca. 3000) in vector map 'dgm',
including attributes.
Deadlock occurs randomly on attribute data access, hanging v.out.ogr (or
any (!) other module that accesses attribute data -- I confirmed this at
least for DBF and SQLite tables).

Since debugging under Win32 is a nightmare, I decided to put in lots of
debugging messages by hand to try and locate the problem.

In unaltered form, v.out.ogr managed to export all points without
hanging in only one(!) of 50 tries on a machine with just 176 MB of RAM.

In ca. 50% of the cases, hangs occurred when Vect_read_line was called
(but several other functions were involved, too).

So I removed Vect_read_line from the main program loop and only called
it once before entering the loop to have a valid struct in memory to
work with.
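
Roughly, the altered loop looked like this (a from-memory sketch, not
the actual v.out.ogr source; the function name and error handling are
made up):

  #include <grass/Vect.h>

  void export_loop(struct Map_info *Map, int n_lines)
  {
      struct line_pnts *Points = Vect_new_line_struct();
      struct line_cats *Cats = Vect_new_cats_struct();
      int line, cat;

      /* read a single line up front so the loop body has a valid
         geometry/category struct in memory to work with */
      Vect_read_line(Map, Points, Cats, 1);

      for (line = 1; line <= n_lines; line++) {
          /* Vect_read_line(Map, Points, Cats, line);  -- removed for this test */
          Vect_cat_get(Cats, 1, &cat);
          /* ... attribute handling and OGR output go here ... */
      }
  }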

Now, v.out.ogr finished in ca. 50% of the cases. So Vect_read_line must
be involved somehow, but it is not the original source of the problem
(read on).

Next, I reduced vector API calls to only Vect_read_line and
Vect_cat_get. Got rid of all attribute handling code. Now everything
worked fine. Out of 30 attempts, 30 worked. So these two funcs by
themselves don't cause any problems.

Next, I put some attribute handling code back in bit by bit to see when
the trouble would start again. It did not take long. Basically, managing
SQL strings with db_init_string and db_set_string works fine, but as
soon as the attribute database is queried (using db_open_select_cursor),
the lockups come back in full force!
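
In code, the failing pattern boils down to something like this (a
minimal sketch -- the helper name, table name and SQL are made up;
driver startup via db_start_driver_open_database and most error
checking are omitted):

  #include <stdio.h>
  #include <grass/dbmi.h>

  int query_attributes(dbDriver *driver, const char *table, int cat)
  {
      dbString sql;
      dbCursor cursor;
      char buf[1024];

      db_init_string(&sql);                      /* works fine */
      sprintf(buf, "SELECT * FROM %s WHERE cat = %d", table, cat);
      db_set_string(&sql, buf);                  /* works fine */

      /* this call is where the random lockups start */
      if (db_open_select_cursor(driver, &sql, &cursor, DB_SEQUENTIAL) != DB_OK)
          return -1;

      db_close_cursor(&cursor);
      return 0;
  }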

SUMMARY

* The problem does not originate in a specific DB driver (such as
DBF.EXE), but is buried deep in the basic functions of the GRASS DBMI lib.

* The basic Vect_read_line and Vect_cat_get functions work fine on Win32
by themselves. Trouble starts when there is interaction with DBMI calls.

* The function db_open_select_cursor is at least one of the possible
culprits, but chances are that more functions are involved. In any case,
lockups occur more frequently when Vect_read_line is called often,
suggesting that there is some unwholesome interaction or competition
for resources between the Vect and DBMI libs.

* For some reason, this problem does not show up on Linux.
There is strong evidence that it is a system resource allocation problem
(memory, file handles???), because lockups occur less frequently on
systems with more available RAM. The problem exists on both Windows 2000
and Windows XP systems (Vista untested).

For today, I am giving up.
At least it now seems clearer where to look for the culprit.
If any of you have any bright ideas, I'd be grateful to hear about them.

Best,

Benjamin

--
Benjamin Ducke, M.A.
Archäoinformatik
(Archaeoinformation Science)
Institut für Ur- und Frühgeschichte
(Inst. of Prehistoric and Historic Archaeology)
Christian-Albrechts-Universität zu Kiel
Johanna-Mestorf-Straße 2-6
D 24098 Kiel
Germany

Tel.: ++49 (0)431 880-3378 / -3379
Fax : ++49 (0)431 880-7300
www.uni-kiel.de/ufg

Benjamin Ducke wrote:

> If any of you have any bright ideas, I'd be grateful to hear about them.

My first guesses would be:

1. The "#ifdef __MINGW32__" portion of lib/db/dbmi_client/start.c
2. The Windows port of the RPC/XDR library.

If the behaviour is non-deterministic (i.e. you don't get exactly the
same behaviour from consecutive runs), I would suspect that it's
timing related. Checking that incomplete read/write operations are
correctly handled might be a good place to start.
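
To make that concrete: a write that is robust against short counts
needs a retry loop along these lines (an illustrative helper, not code
from the GRASS tree):

  #include <sys/types.h>
  #include <unistd.h>
  #include <errno.h>

  /* write all of buf, retrying on short writes and EINTR */
  ssize_t write_all(int fd, const void *buf, size_t count)
  {
      const char *p = buf;
      size_t left = count;

      while (left > 0) {
          ssize_t n = write(fd, p, left);
          if (n < 0) {
              if (errno == EINTR)
                  continue;          /* interrupted, just retry */
              return -1;             /* real error */
          }
          p += n;                    /* short write: advance past what got out */
          left -= n;
      }
      return (ssize_t)count;
  }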

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

> If the behaviour is non-deterministic (i.e. you don't get exactly the
> same behaviour from consecutive runs), I would suspect that it's
> timing related. Checking that incomplete read/write operations are
> correctly handled might be a good place to start.

Interesting. I've had some other folks suggest a timing/priority problem,
too. I think we are getting closer here. I was going to look into the XDR
lib on a hunch, too.
How would you advise checking for incomplete read/write ops: keep a count
of files opened and compare it with the number of files closed on every
pass through the main export loop in v.out.ogr? I haven't done much
system-level programming in a long time, so I am somewhat out of practice.

Cheers,

Benjamin

benjamin.ducke@ufg.uni-kiel.de wrote:

> How would you advise checking for incomplete read/write ops: keep a count
> of files opened and compare it with the number of files closed on every
> pass through the main export loop in v.out.ogr?

No, that won't help.

I'm referring to the situation where read/write return a short count,
e.g. "write(fd, buf, count) < count". But, I've remembered that XDR
uses stdio rather than POSIX I/O, and I don't think that fread/fwrite
can return a short count (except for EOF).

According to the MSVCRT documentation, the O_NOINHERIT flag should be
used when using _dup2(), e.g.:

  _pipe(p1, 250000, _O_BINARY|_O_NOINHERIT)
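
That is, the coprocess setup has roughly this shape (a deliberately
simplified sketch of the idea, not the actual dbmi_client/start.c code;
saving and restoring the parent's own stdin/stdout is omitted):

  #include <fcntl.h>
  #include <io.h>
  #include <process.h>

  int start_driver(const char *driver_path)
  {
      int p1[2], p2[2];   /* client -> driver, driver -> client */

      /* _O_NOINHERIT: only the _dup2()'d copies below should be
         inherited by the child, not the original pipe ends */
      if (_pipe(p1, 250000, _O_BINARY | _O_NOINHERIT) < 0 ||
          _pipe(p2, 250000, _O_BINARY | _O_NOINHERIT) < 0)
          return -1;

      /* child reads requests on stdin, writes responses on stdout */
      _dup2(p1[0], 0);
      _dup2(p2[1], 1);

      /* spawn the driver; returns the child process handle */
      return (int)_spawnl(_P_NOWAIT, driver_path, driver_path, NULL);
  }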

Another suggestion: try changing the size passed to the _pipe()
function in dbmi_client/start.c. If that affects the tendency to
deadlock, it strongly suggests that the issue is related to the way
that a full pipe is handled.

Beyond that, the only thing which I can suggest is to instrument the
XDR code with debug code to log all I/O operations (including the data
which is read/written).
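
For example, a logging wrapper around fread() in xdr_stdio.c could look
like this (the wrapper name and log format are only illustration):

  #include <stdio.h>

  static FILE *logf;   /* fopen()'d once at startup */

  /* drop-in replacement for fread() that logs args, result and data */
  static size_t logged_fread(void *buf, size_t size, size_t n, FILE *stream)
  {
      size_t got = fread(buf, size, n, stream);
      size_t i;

      fprintf(logf, "fread(size=%lu, n=%lu) -> %lu items:",
              (unsigned long)size, (unsigned long)n, (unsigned long)got);
      for (i = 0; i < got * size; i++)
          fprintf(logf, " %02x", ((unsigned char *)buf)[i]);
      fputc('\n', logf);
      fflush(logf);    /* flush immediately so the log survives a deadlock */
      return got;
  }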

--
Glynn Clements <glynn@gclements.plus.com>

OK, I tried some of the suggestions Glynn made below:

> According to the MSVCRT documentation, the O_NOINHERIT flag should be
> used when using _dup2(), e.g.:
>
>   _pipe(p1, 250000, _O_BINARY|_O_NOINHERIT)
>
> Another suggestion: try changing the size passed to the _pipe()
> function in dbmi_client/start.c. If that affects the tendency to
> deadlock, it strongly suggests that the issue is related to the way
> that a full pipe is handled.

After hundreds of test runs with different Windows versions, these
are my conclusions:

The problem has to do with the pipe mechanism in Windows.
I tried changing the pipe size as suggested, using an extremely small
(25) and an extremely high (250000000) value. On Windows 2000, with
the very small value, no module run makes it past 33 percent. So there
is a clear correlation. As soon as I set it to some "sane" value
(at least 25000), I get the same situation: ca. 4-6 out of 50 runs
complete. Increasing the value beyond that makes no difference;
the variations are always within measuring precision.

This is no surprise, since the comment in dbmi_client/start.c states
that the pipe buffer value is not directly related to the pipe size.
Apparently, Windows chooses some fixed value as soon as the size
is greater than some threshold. The same thing happens when I set
the size to "0".

However, the fact that I can effectively block the piping with
very small values leaves me believing that this is, as Glynn
suggests, the source of the trouble:
a full pipe gets stuck and no process ever takes anything out of
it to make room, so the next bit of data cannot be pushed into
it. Puller waits for pusher, pusher never pushes because nothing
gets pulled = deadlock. (I think...)

Another thing makes me believe that Windows itself is the culprit
here: I tested the same stuff on a Windows XP SP2 system, a clean
install from scratch. On this system, almost all runs (97%)
finished cleanly!

Obviously MS made some improvements to process communication in that
release ...

Setting the _O_NOINHERIT flag makes no difference.

So, how are we going to go ahead?

Best,

Benjamin

--
Benjamin Ducke, M.A.
Archäoinformatik
(Archaeoinformation Science)
Institut für Ur- und Frühgeschichte
(Inst. of Prehistoric and Historic Archaeology)
Christian-Albrechts-Universität zu Kiel
Johanna-Mestorf-Straße 2-6
D 24098 Kiel
Germany

Tel.: ++49 (0)431 880-3378 / -3379
Fax : ++49 (0)431 880-7300
www.uni-kiel.de/ufg

Benjamin Ducke wrote:

> However, the fact that I can effectively block the piping with
> very small values leaves me believing that this is, as Glynn
> suggests, the source of the trouble:
> a full pipe gets stuck and no process ever takes anything out of
> it to make room, so the next bit of data cannot be pushed into
> it. Puller waits for pusher, pusher never pushes because nothing
> gets pulled = deadlock. (I think...)

The reader won't be waiting for the writer if it already has a full
pipe of data available.

The usual reason for co-processes (two processes connected via a
read/write pair of pipes) to deadlock is that both are trying to write
to full pipes. Neither process can continue until their write pipe is
drained, which won't happen as each process is blocked.

[The situation where both are blocked trying to read from empty pipes
is theoretically possible but uncommon in practice, as it indicates a
fundamental design error, whereas the both-blocked-on-write case is
usually due to a relatively simple oversight.]
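
A toy POSIX example of the both-blocked-on-write case, purely for
illustration: each side writes more than the pipe can buffer before
reading anything, so both block forever:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define BIG (1 << 20)   /* 1 MB: larger than any default pipe buffer */

  int main(void)
  {
      int up[2], down[2];            /* child -> parent, parent -> child */
      static char buf[BIG];

      if (pipe(up) < 0 || pipe(down) < 0) {
          perror("pipe");
          exit(1);
      }

      if (fork() == 0) {             /* child */
          write(up[1], buf, BIG);    /* blocks once 'up' fills */
          read(down[0], buf, BIG);   /* never reached */
          _exit(0);
      }

      /* parent */
      write(down[1], buf, BIG);      /* blocks once 'down' fills: deadlock */
      read(up[0], buf, BIG);         /* never reached */
      return 0;
  }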

Briefly, the drivers use a synchronous RPC mechanism:

  Phase                         Client      Driver

  1. Client sends request      write(1)    read(1)
  2. Driver reads request      read(2)     read(1)
  3. Driver processes request  read(2)     busy(2)
  4. Driver sends response     read(2)     write(3)
  5. Client reads response     read(2)     read(1)

Client:

  do_request(...)
  {
  1: send_request();
  2: read_response();
  }

Driver:

  main_loop(...)
  {
    while (!eof(...))
    {
    1: read_request();
    2: process_request();
    3: send_response();
    }
  }

For this mechanism to work, each process (client and driver) must read
exactly what it is sent before proceeding, and must send exactly what
the other side expects to receive. The driver must not send the response
before it has read the entire request, even if it intends to discard
it. Similarly, the client must read any response before sending the
next request; it can't send a bunch of requests and then read all of
the responses later.

Having said all of that, there can't be any fundamental design
problems if it works fine on Unix. One possibility is incorrect
handling of an error condition which only occurs on Windows.

It's also possible that the Windows stdio implementation doesn't like
_pipe(). If it doesn't handle short reads/writes correctly, that won't
affect files, and it probably won't affect pipes which never fill up,
but it may fall down on a full pipe.

> Another thing makes me believe that Windows itself is the culprit
> here: I tested the same stuff on a Windows XP SP2 system, a clean
> install from scratch. On this system, almost all runs (97%)
> finished cleanly!
>
> Obviously MS made some improvements to process communication in that
> release ...

Not necessarily. If the issue is related to timing, it could just be
that everything runs more smoothly due to the clean install (rather
than SP2).

> Setting the _O_NOINHERIT flag makes no difference.
>
> So, how are we going to go ahead?

Figure out how to debug the processes. If you can't get gdb to work, I
can only suggest logging every significant event at the lowest level,
i.e. log every read/write operation: the arguments, the return code,
and the complete data (i.e. the buffer contents before read and after
write). This is all done in the RPC/XDR library, in xdr_stdio.c. It
will probably help to also log the beginning/end of each procedure
call (i.e. lib/db/dbmi_base/xdrprocedure.c).

--
Glynn Clements <glynn@gclements.plus.com>

Glynn,

my last posting to the GRASS dev list overlapped with yours.
I'll try to find the time and do as you suggested.

Best,

Ben
