[GRASS5] Driver Update

I've committed a batch of changes to the display driver architecture.
This consists mainly of merging the various versions of
driver-specific or transport-specific files.

Given the significance of the changes[1], I would appreciate it if they
could be tested on as many different platforms as possible. In
particular, feedback from anyone who uses the FIFO option would be
welcome.

[1] If a change breaks r.out.foo, it doesn't matter unless you
actually use that program. If I've broken the display drivers, it's
going to affect a lot of people.

The changes are primarily in src/display/devices, although
src/libes/raster/io.c has also been changed.

There is no longer a README.xdriver; instead, code which is specific
to a particular transport (sockets or FIFOs; IPC has been removed) is
conditionalised using "#if[n]def USE_G_SOCKS".
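
As a rough illustration of the conditional-compilation pattern described
above (not taken from the GRASS sources), a single file can hold both
transport variants and let the preprocessor pick one. Only the
USE_G_SOCKS macro comes from this message; transport_name(),
describe_transport() and the strings they produce are made up for the
sketch.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GRASS: reports which transport
 * this file was compiled for.  Only the USE_G_SOCKS macro name is
 * taken from the message above. */
const char *transport_name(void)
{
#ifdef USE_G_SOCKS
    /* socket-specific code would go here */
    return "sockets";
#else
    /* FIFO-specific code would go here */
    return "fifos";
#endif
}

/* Code shared by both transports is written once, outside the #ifdef. */
int describe_transport(char *buf, size_t len)
{
    return snprintf(buf, len, "display driver transport: %s",
                    transport_name());
}
```

Compiling with -DUSE_G_SOCKS selects the socket branch; without it the
FIFO branch is compiled, so one source file serves both builds.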

Also, the different drivers (XDRIVER, CELL, PNGdriver, HTMLMAP) no
longer have their own versions of connect.c or SWITCHER.c (in fact,
SWITCHER.c is no more, the code being distributed into several new
files in src/display/devices/lib).

*** IMPORTANT NOTE ***

Anyone doing "cvs update" should perform the following step (or
similar; "make distclean" will also do the job):

  find src/display/devices -name '*.a' -exec rm {} \;

Without the above, you may end up running the old version of main()
from a copy of SWITCHER.o left behind in driverlib.a.

--
Glynn Clements <glynn.clements@virgin.net>

----------------------------------------
If you want to unsubscribe from GRASS Development Team mailing list write to:
minordomo@geog.uni-hannover.de with
subject 'unsubscribe grass5'

Hi Glynn,

On first try with the new drivers on Cygwin, I get an access violation during
"d.mon x0" or "d.mon select=x0". The monitor starts if I do "d.mon -s start=x0",
but then crashes on "d.mon select=x0".

In my initial look at the code, I noticed that for libes/raster/io.c outbuf was
defined as 2048 (BUFFERSIZ), while in display/devices/lib/connect.c inbuf is set
to 4096. I changed inbuf to 2048 and recompiled in display/devices/lib and
display/devices/XDRIVER, but still got the same error.

I'll look a little more to see if I can see where the error is coming from (but
debugging on Cygwin is tedious at best - and only for console apps). From my
experience, access violations come from uninitialized variables or array/struct
overflows. Any idea where I should look first?

Malcolm

Malcolm Blue wrote:

> On first try with the new drivers on Cygwin, I get an access violation during
> "d.mon x0" or "d.mon select=x0". The monitor starts if I do "d.mon -s start=x0",
> but then crashes on "d.mon select=x0".

Is there anything to indicate whether it's the monitor (XDRIVER) or
d.mon/mon.select that's crashing?

> In my initial look at the code, I noticed that for libes/raster/io.c outbuf was
> defined as 2048 (BUFFERSIZ), while in display/devices/lib/connect.c inbuf is set
> to 4096. I changed inbuf to 2048 and recompiled in display/devices/lib and
> display/devices/XDRIVER, but still got the same error.
>
> I'll look a little more to see if I can see where the error is coming from (but
> debugging on Cygwin is tedious at best - and only for console apps). From my
> experience, access violations come from uninitialized variables or array/struct
> overflows. Any idea where I should look first?

The first thing is to determine which program is crashing.

If it's mon.select, try checking out the old version of
src/libes/raster/io.c (use the "devices_cleanup_20000420" tag). If
that fixes it, the problem lies there. If it doesn't, or if it's
XDRIVER which is crashing, there are more files to consider.

A couple of questions: First, did you delete the old driverlib.a
libraries? Just doing "cvs update ; make" will have problems. Second,
were there any warnings relating to src/display/devices/*?

--
Glynn Clements <glynn.clements@virgin.net>

Glynn,

First, I can tell you that it is the XDRIVER program that is crashing. Second, it
seems to be in the initial communication from the select command.

Also, I had created a whole new grass directory structure, so there is no conflict of
old and new programs or libs.

So far I've determined that the problem is in
    R_open_driver() in select.c,
        sync_driver() in io.c,
           read() call in sync_driver().

I had tested the reverse of your suggestion. I tried the old XDRIVER executable with
the new build. That had the same result as before, so I thought it was in io.c. That's
why I tracked the problem down to the first read() in the sync function. So after
that, I started looking for mismatches between the routines. I haven't found any.

After reading your response, I tried replacing io.c and rebuilding rasterlib &
mon.select. That had the same result as before.

The monitor (XDRIVER) is running until I do the select with any of these combinations.
Were there other changes that I can look at? Did you change any of the socket calls,
or just where they are called from?

I'm also going to reinstall my last bindist to make sure that this isn't something I've
caused separately from your fixes.

Hopefully we'll know soon.

Thanks,

Malcolm

Glynn,

Sorry, it wasn't your code change that was causing the problem. It was my system.

XDRIVER was crashing because of some testing I was doing to get PostgreSQL running. I
reversed the Cygwin system changes I had made, did a cold reboot, and the GRASS monitor
now runs.

I'll do some more testing now. So far, I've just verified that a few of the display
commands work on the XDRIVER (d.vect, d.rast and d.site).

Malcolm

Malcolm Blue wrote:

> Sorry, it wasn't your code change that was causing the problem. It was
> my system.
>
> XDRIVER was crashing because of some testing I was doing to get
> postgresql running. I reversed the cygwin system changes I had made,
> did a cold reboot and the grass monitor runs.
>
> I'll do some more testing now. So far, I've just verified that a few
> of the display commands work on the XDRIVER (d.vect, d.rast and
> d.site).

That's good to know. Cygwin was one of the platforms which I thought
might be problematic.

I'd like to hear from anyone who has tried it on a system with non-GNU
development tools (i.e. something other than gcc/binutils).

--
Glynn Clements <glynn.clements@virgin.net>

Hi Glynn

I just tried a fresh checkout of the release branch and there are
problems running d.mon on an SGI using SGI compilers.

The monitor window will pop up but I get the following output:

GRASS 5.0.0pre1 > d.mon start=x1
using default visual which is TrueColor
Visual is read-only or using a private colormap
ncolors: 32768
allocating memory...
Graphics driver [x1] started
Warning - no response from graphics monitor <x1>.
Check to see if the mouse is still active.
ERROR - no response from graphics monitor <x1>.
Problem selecting x1. Will try once more
Warning - no response from graphics monitor <x1>.
Check to see if the mouse is still active.
ERROR - no response from graphics monitor <x1>.

Also, even after the command quit, there were delays in mouse response
until I closed the monitor window using the window manager (by clicking the
close button on the window). I'm not sure what is causing these delays.

I'm sorry but I don't have time to investigate this.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand

People who think they know everything are very irritating to those
of us who do. ---Anonymous

Jazz and Trek Rule!!!

Justin Hickey wrote:

> I just tried a fresh checkout of the release branch and there are
> problems running d.mon on an SGI using SGI compilers.
>
> The monitor window will pop up but I get the following output:
>
> GRASS 5.0.0pre1 > d.mon start=x1
> using default visual which is TrueColor
> Visual is read-only or using a private colormap
> ncolors: 32768
> allocating memory...
> Graphics driver [x1] started
> Warning - no response from graphics monitor <x1>.

OK; this means that it connected to the socket but it didn't receive
any data.

> Check to see if the mouse is still active.
> ERROR - no response from graphics monitor <x1>.
> Problem selecting x1. Will try once more
> Warning - no response from graphics monitor <x1>.
> Check to see if the mouse is still active.
> ERROR - no response from graphics monitor <x1>.
>
> Also, even after the command quits, there are delays in mouse response
> until I closed the monitor window using the window manager (click the
> close button on the window). I'm not sure what is causing these delays.

It's possible that there's a busy-wait in the monitor; does it show
high CPU usage?

> I'm sorry but I don't have time to investigate this.

I have time, but I don't have an SGI; if you can obtain any more
information it would help.

I will see if I can find anything obvious; I'm going to look into the
CPU usage and timing issues anyway.

I've checked in one change already; there is a flag ("no_mon") which
is set in the SIGALRM handler and tested in sync_driver(). I've
changed the declaration to include "volatile", although this shouldn't
matter unless the compilation had optimisation enabled.
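
For readers unfamiliar with the issue, here is a minimal sketch of the
pattern, assuming POSIX signals. It is not the code from io.c: only the
flag name no_mon and the use of SIGALRM come from the message above;
on_alarm() and wait_for_timeout() are hypothetical names, and the real
sync_driver() tests the flag while reading rather than just sleeping.
Without "volatile", an optimising compiler may keep the flag in a
register and never notice that the handler has changed it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Set by the signal handler, tested by the main code.  "volatile"
 * forces every test to re-read the variable from memory;
 * sig_atomic_t guarantees that it is read and written atomically. */
static volatile sig_atomic_t no_mon;

/* Hypothetical handler name: records that the timeout expired. */
static void on_alarm(int sig)
{
    (void) sig;
    no_mon = 1;
}

/* Hypothetical helper: arms a timeout of `seconds` (must be > 0) and
 * sleeps until the handler has set the flag.
 * Returns 1 once the timeout has fired, -1 on error. */
int wait_for_timeout(unsigned int seconds)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;

    no_mon = 0;
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGALRM while testing the flag, so that the signal
     * cannot arrive between the test and the call to sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGALRM);

    alarm(seconds);
    while (!no_mon)
        sigsuspend(&wait_mask);     /* atomically unblock and wait */

    /* Restore the previous signal mask and handler. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    return 1;
}
```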

--
Glynn Clements <glynn.clements@virgin.net>

Hi Glynn

Glynn Clements wrote:

> It's possible that there's a busy-wait in the monitor; does it show
> high CPU usage?

The system is a dual-CPU system and neither CPU was ever maxed out at 100%
during my test. The interesting thing is that I am running GRASS remotely
on an SGI Origin from an SGI Indigo2, and the system delay is somehow
related to the Indigo2 machine. As a test, I ran a demo program on the
Indigo2 that allows you to rotate a cube, while XDRIVER was still
running on the Origin. The cube would periodically stop, but when it
started rotating again, it skipped to its new position. That is, the demo
kept running and calculating the position of the cube, but the graphics
update was being delayed. I also noticed that the CPU on the Indigo2
machine was spending a lot of time dealing with interrupts.

> I have time, but I don't have an SGI; if you can obtain any more
> information it would help.

I can give a few minutes to test something but I don't have time for
debugging. Oh yeah, the system delay didn't seem to occur until just
before the d.mon command quit. I don't know if that helps or not.

> I've checked in one change already; there is a flag ("no_mon") which
> is set in the SIGALRM handler and tested in sync_driver(). I've
> changed the declaration to include "volatile", although this
> shouldn't matter unless the compilation had optimisation enabled.

No optimization was used.

Sorry I couldn't be much help.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th

Justin Hickey wrote:

> > It's possible that there's a busy-wait in the monitor; does it show
> > high CPU usage?
>
> The system is a dual CPU system and neither CPU was ever maxed at 100%
> for my test. The interesting thing is that I am running Grass remotely
> on an SGI Origin from an SGI Indigo2 and the system delay is somehow
> related to the Indigo2 machine. As a test, I ran a demo program on the
> Indigo2 that allows you to rotate a cube, while the XDRIVER was still
> running on the Origin. The cube would periodically stop but when it
> started rotating again, it skipped to its new position. That is, the demo
> kept running calculating the position of the cube, but the graphics
> update was being delayed. I also noticed that the CPU on the Indigo2
> machine was spending a lot of time dealing with interrupts.

Hmm. Looks like XDRIVER might be flooding the X server (although this
is just a guess; I'm not that familiar with X internals).

> > I have time, but I don't have an SGI; if you can obtain any more
> > information it would help.
>
> I can give a few minutes to test something but I don't have time for
> debugging. Oh yeah, the system delay didn't seem to occur until just
> before the d.mon command quit. I don't know if that helps or not.

Odd. d.mon/mon.* themselves haven't changed at all.
src/libes/raster/io.c has changed, but only in that two distinct files
were merged into one file with "#ifdef USE_G_SOCKS"; the actual code
being run shouldn't have changed at all.

XDRIVER, OTOH, has changed, mainly to simplify merging all of the
X/non-X and FIFO/socket permutations. The XDRIVER version was quite
different to the generic version, in that the latter did a blocking
accept (or equivalent) followed by a main loop which blocked waiting
for input from the client, while the former had to continuously
process X input.

I suspect that I may have over-simplified some aspect; I'll re-examine
anything which could change the timing aspects of the I/O.

> > I've checked in one change already; there is a flag ("no_mon") which
> > is set in the SIGALRM handler and tested in sync_driver(). I've
> > changed the declaration to include "volatile", although this
> > shouldn't matter unless the compilation had optimisation enabled.
>
> No optimization was used.

OK; the lack of "volatile" won't be the issue here (although its
omission could have been very significant for an optimised build).

--
Glynn Clements <glynn.clements@virgin.net>

Glynn Clements wrote:

> I suspect that I may have over-simplified some aspect; I'll re-examine
> anything which could change the timing aspects of the I/O.

Doh. select() was being called with a zero timeout, i.e. a busy-wait.
The end result: XDRIVER and the X server end up sharing all of the
available CPU time between them. With the attached patch (about
to be committed), the system remains 98% idle.
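
The patch itself is not reproduced in this thread, so the following is
only a sketch of the general difference, assuming a POSIX system.
wait_readable() is a hypothetical helper, not a function from
connect_sock.c. With a timeout of zero, select() returns immediately
whether or not data is available, so a loop built on that call spins;
with a positive or infinite timeout, the kernel puts the process to
sleep until there is something to read.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits until fd is readable.
 *   timeout_ms <  0  block until data arrives (NULL timeout)
 *   timeout_ms == 0  poll and return at once (the busy-wait case
 *                    when called repeatedly in a loop)
 *   timeout_ms >  0  sleep for at most that many milliseconds
 * Returns select()'s result: 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    struct timeval *tvp = NULL;     /* NULL means "wait forever" */

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    return select(fd + 1, &rfds, NULL, NULL, tvp);
}
```

A driver that also has to service X events would typically pass a
modest positive timeout (or add the X connection's descriptor to the
set) rather than zero.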

--
Glynn Clements <glynn.clements@virgin.net>

(attachments)

connect_sock.c.diff (288 Bytes)

Hi Glynn

Glynn Clements wrote:
>
> Glynn Clements wrote:
>
> > I suspect that I may have over-simplified some aspect; I'll
> > re-examine anything which could change the timing aspects of the
> > I/O.
>
> Doh. select() was being called with a zero timeout, i.e. a busy-wait.
> The end result: XDRIVER and the X server end up sharing all of the
> available CPU time amongst themselves. With the attached patch (about
> to be committed), the system remains 98% idle.

After updating my sources and after a make distclean and recompile, the
delay problem is fixed. However, I am still getting the error message
indicating that there is no response from the graphics monitor. One
thing to note is that the socket software did work before. Perhaps there
is something that was missed?

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th

Hi Glynn,

hi all CELL driver friends,

On Tue, Apr 24, 2001 at 12:47:25PM +0700, Justin Hickey wrote:

> Hi Glynn
>
> Glynn Clements wrote:
> >
> > Glynn Clements wrote:
> >
> > > I suspect that I may have over-simplified some aspect; I'll
> > > re-examine anything which could change the timing aspects of the
> > > I/O.
> >
> > Doh. select() was being called with a zero timeout, i.e. a busy-wait.
> > The end result: XDRIVER and the X server end up sharing all of the
> > available CPU time amongst themselves. With the attached patch (about
> > to be committed), the system remains 98% idle.
>
> After updating my sources and after a make distclean and recompile, the
> delay problem is fixed. However, I am still getting the error message
> indicating that there is no response from the graphics monitor. One
> thing to note is that the socket software did work before. Perhaps there
> is something that was missed?

While Justin is heading for success, good news from here: XDRIVER works
properly (again) on Linux, with no more CPU time consumption after startup.
Thanks, Glynn!

BTW: It seems that the color problem of the CELL driver has disappeared due to
the driver simplification. At least I cannot reproduce the former problems.
Those who know of CELL driver color problems, please give the new driver
version a try.

Kudos to Glynn,

Markus

Hi again,

just now I could reproduce a strange monitor behaviour (Eric,
I told you about this). About once a month or so the monitor
catches contents from a netscape window (Linux, KDE2).

What I did: The GRASS monitor was open and i.colors was loading pictures.
Meanwhile I was opening this URL in netscape:
http://www.geog.uni-hannover.de/phygeo/Links/geoinstitute.html

When closing this window, the GRASS monitor got part of the
netscape window contents, looking like this:
http://www.geog.uni-hannover.de/users/neteler/tmp/mon/
-> crazy_monitor.jpg

This has happened a few times in the past (since the socket support went in),
and on a colleague's Linux box as well.

For me it's amazing that such things can happen :-) It doesn't
disappear, even if I move the GRASS monitor around. The monitor is,
of course, still working. Seems to be some backing store problem.

Regards

Markus

Markus Neteler wrote:

> just now I could reproduce a strange monitor behaviour (Eric,
> I told you about this). About once a month or so the monitor
> catches contents from a netscape window (Linux, KDE2).
>
> What I did: The GRASS monitor was open, the i.colors loading pictures.
> Meanwhile I was opening this URL in netscape:
> http://www.geog.uni-hannover.de/phygeo/Links/geoinstitute.html
>
> When closing this window, the GRASS monitor got part of the
> netscape window contents, looking like this:
> http://www.geog.uni-hannover.de/users/neteler/tmp/mon/
> -> crazy_monitor.jpg
>
> This happened a few times in the past (since the socket support is in),
> and on a colleague's Linux box as well.
>
> For me it's amazing that such things can happen :-) It doesn't
> disappear, even if I move the GRASS monitor around. The monitor is,
> of course, still working. Seems to be some backing store problem.

Is the window using backing store ("xwininfo" should say)?

I was going to ask about disabling the use of backing store
altogether. XDRIVER already has a fall-back if backing store isn't
supported, i.e. it does all of its rendering into a Pixmap.

Making it do this all of the time would simplify the code, and
possibly improve reliability (the backing store code seems to be a bit
of a kludge; e.g. Panel_save() moves the window to be on screen, which
suggests that the code will have problems if the window isn't
completely visible). The only potential downside which I can see is
memory usage within the X server.

--
Glynn Clements <glynn.clements@virgin.net>

Justin Hickey wrote:

> > > I suspect that I may have over-simplified some aspect; I'll
> > > re-examine anything which could change the timing aspects of the
> > > I/O.
> >
> > Doh. select() was being called with a zero timeout, i.e. a busy-wait.
> > The end result: XDRIVER and the X server end up sharing all of the
> > available CPU time amongst themselves. With the attached patch (about
> > to be committed), the system remains 98% idle.
>
> After updating my sources and after a make distclean and recompile, the
> delay problem is fixed. However, I am still getting the error message
> indicating that there is no response from the graphics monitor. One
> thing to note is that the socket software did work before. Perhaps there
> is something that was missed?

I've looked over the code in question extensively now, and can't see
anything which looks like it might be responsible. I don't think that
I'm going to be able to do anything without more information.

What should happen once the connection is established (this much is
happening; you would get a different error if the connect() failed) is
this:

On the client end:

d.mon calls sync_driver(), which sends the BEGIN command (bytes 127,
46) to the driver, then starts reading. It expects to receive at least
32 NUL bytes followed by byte 127, within 15 seconds (a warning is
printed after 5 seconds; after a further 10 seconds it gives up).

Now, none of this code has changed at all; I've just merged two
versions of src/libes/raster/io.c into one file with some #ifdefs.

On the driver end:

Once the driver accept()s the connection, it loops reading commands
(and occasionally servicing X events). In the case of the BEGIN
command, it should send 42 NUL bytes then byte 127.

If anything goes wrong, the connection should be closed, which would
produce a different message from d.mon.

This code has changed a bit, so it's a lot more likely that the
problem is within XDRIVER. However, I can't see anything in particular
that looks problematic, so without more information, all that I could
really do is to keep making random changes until it works for you.
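
The exchange described above can be sketched as follows. This is an
illustration under assumptions, not the code from io.c or main.c: the
byte values (127, 46), the minimum of 32 NUL bytes and the 42 NUL bytes
sent by the driver come from the description above, while all function
and macro names are invented for the example, the NUL bytes are assumed
to be consecutive, and the timeout handling is left out.

```c
#include <stddef.h>
#include <string.h>

#define CMD_ESC     127   /* first byte of BEGIN; also ends the reply */
#define CMD_BEGIN    46   /* second byte of the BEGIN command */
#define REPLY_NULS   42   /* NUL bytes sent by the driver */
#define MIN_NULS     32   /* NUL bytes the client insists on */

/* Client side: stores the two-byte BEGIN command in out.
 * Returns the number of bytes stored, or 0 if out is too small. */
size_t build_begin(unsigned char *out, size_t cap)
{
    if (cap < 2)
        return 0;
    out[0] = CMD_ESC;
    out[1] = CMD_BEGIN;
    return 2;
}

/* Driver side: stores the reply to BEGIN (42 NULs, then 127) in out.
 * Returns the number of bytes stored, or 0 if out is too small. */
size_t build_sync_reply(unsigned char *out, size_t cap)
{
    if (cap < REPLY_NULS + 1)
        return 0;
    memset(out, 0, REPLY_NULS);
    out[REPLY_NULS] = CMD_ESC;
    return REPLY_NULS + 1;
}

/* Client side: returns 1 if buf contains a run of at least MIN_NULS
 * NUL bytes immediately followed by byte 127, otherwise 0. */
int sync_reply_ok(const unsigned char *buf, size_t len)
{
    size_t run = 0;   /* length of the current run of NUL bytes */
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == 0) {
            run++;
        } else {
            if (buf[i] == CMD_ESC && run >= MIN_NULS)
                return 1;
            run = 0;   /* any other byte restarts the count */
        }
    }
    return 0;
}
```

A client that sees fewer than 32 NUL bytes before the 127, or no 127 at
all, keeps waiting, which matches the "no response" symptom rather than
a closed-connection error.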

Do you have strace or similar? If so, a trace of the XDRIVER process
would help a lot. If not, could you add some printf()s to
src/display/devices/lib/main.c and see if that tells us anything?

--
Glynn Clements <glynn.clements@virgin.net>

Glynn, Justin,

I've been using this on Cygwin, and can verify that since the select() 'zero
timeout' fix, this is working well here. You had already identified the high CPU
usage problem (which was the only problem after my earlier messages), so I got
your fix and have tested successfully with that.

XDRIVER seems to work fine. Justin, since you're having the problem with
sync_driver(), is it possible that libes/raster/io.c is still being copied from
libes/raster/socket.new? /socket.new shouldn't exist anymore, but....

Malcolm

Glynn Clements wrote:

Justin Hickey wrote:

> > > I suspect that I may have over-simplified some aspect; I'll
> > > re-examine anything which could change the timing aspects of the
> > > I/O.
> >
> > Doh. select() was being called with a zero timeout, i.e. a busy-wait.
> > The end result: XDRIVER and the X server end up sharing all of the
> > available CPU time amongst themselves. With the attached patch (about
> > to be commited), the system remains 98% idle.
>
> After updating my sources and after a make distclean and recompile, the
> delay problem is fixed. However, I am still getting the error message
> indicating that there is no response from the graphics monitor. One
> thing to note is that the socket software did work before. Perhaps there
> is something that was missed?

I've looked over the code in question extensively now, and can't see
anything which looks like it might be responsible. I don't think that
I'm going to be able to do anything without more information.

What should happen once the connection is established (this much is
happening; you would get a different error if the connect() failed) is
this:

On the client end:

d.mon calls sync_driver(), which sends the BEGIN command (bytes 127,
46) to the driver, then starts reading. It expects to receive at least
32 NUL bytes followed by byte 127, within 15 seconds (a warning is
printed after 5 seconds; after a further 10 seconds it gives up).

Now, none of this code has changed at all; I've just merged two
versions of src/libes/raster/io.c into one file with some #ifdefs.

On the driver end:

Once the driver accept()s the connection, it loops reading commands
(and occasionally servicing X events). In the case of the BEGIN
command, it should send 42 NUL bytes then byte 127.

If anything goes wrong, the connection should be closed, which would
produce a different message from d.mon.
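
For illustration, the exchange described above can be sketched roughly as
follows. This is not the code from io.c or the driver library: the function
and macro names are invented for the example, and the 15-second timeout is
left out. Only the byte value 127 and the counts (42 NULs sent, at least 32
required) come from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Values taken from the description above; the names are illustrative. */
#define ESC_BYTE       127  /* marker byte that ends the reply */
#define NULS_SENT      42   /* NUL bytes the driver sends */
#define NULS_REQUIRED  32   /* minimum NUL run the client accepts */

/* Driver side: fill buf with the reply to BEGIN.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t build_begin_reply(unsigned char *buf, size_t size)
{
    if (size < NULS_SENT + 1)
        return 0;
    memset(buf, 0, NULS_SENT);
    buf[NULS_SENT] = ESC_BYTE;
    return NULS_SENT + 1;
}

/* Client side: scan received bytes for a run of at least NULS_REQUIRED
 * NUL bytes immediately followed by ESC_BYTE. Returns 1 if found. */
int reply_is_synced(const unsigned char *buf, size_t len)
{
    size_t run = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == 0)
            run++;
        else if (buf[i] == ESC_BYTE && run >= NULS_REQUIRED)
            return 1;
        else
            run = 0;    /* any other byte restarts the count */
    }
    return 0;
}
```

Sending more NULs than the client requires presumably leaves some slack if
the client starts reading late; that reading is an assumption, not something
stated above.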

This code has changed a bit, so it's a lot more likely that the
problem is within XDRIVER. However, I can't see anything in particular
that looks problematic, so without more information, all that I could
really do is to keep making random changes until it works for you.

Do you have strace() or similar? If so, a trace of the XDRIVER process
would help a lot. If not, could you add some printf()s to
src/display/devices/lib/main.c and see if that tells us anything.

--
Glynn Clements <glynn.clements@virgin.net>

Malcolm Blue wrote:

> The XDRIVER seems to work fine. Justin, since you're having the problem with
> sync_driver(), is it possible that libes/raster/io.c is still being copied from
> libes/raster/socket.new? /socket.new shouldn't exist anymore, but....

.. it shouldn't matter if it did.

The old libes/raster/io.c should work fine with the new XDRIVER, and
vice-versa. The protocol hasn't changed, and nothing on the client
side depends upon the code in src/display/devices.

The only thing that changed regarding src/libes/raster/io.c was that
the fifo.orig/socket.new versions were merged into a single file which
used #ifdef USE_G_SOCKS to determine which code to use.
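
As a rough illustration of that arrangement (not the actual contents of
io.c), a single source file can carry both transports and let the
preprocessor pick one. Only the macro name USE_G_SOCKS comes from this
thread; transport_name() is a made-up function for the sketch.

```c
/* Illustrative only: one source file, two transports, chosen at build time.
 * USE_G_SOCKS is the macro named above; transport_name() is hypothetical. */
const char *transport_name(void)
{
#ifdef USE_G_SOCKS
    return "sockets";   /* e.g. when the build defines USE_G_SOCKS */
#else
    return "fifo";      /* used when the macro is not defined */
#endif
}
```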

The changes in src/display/devices were much more substantial.
Previously, each driver had its own version of the transport code
(connect.c and bits of SWITCHER.c); well, apart from the fact that the
sockets stuff was only implemented for XDRIVER; the other drivers
wouldn't work if you used sockets.

Merging this wasn't all that straightforward, due to the need for
XDRIVER to process X events while waiting for a connection.

--
Glynn Clements <glynn.clements@virgin.net>

Glynn,

Glynn Clements wrote:

> Malcolm Blue wrote:
>
> > The XDRIVER seems to work fine. Justin, since you're having the problem with
> > sync_driver(), is it possible that libes/raster/io.c is still being copied from
> > libes/raster/socket.new? /socket.new shouldn't exist anymore, but....
>
> .. it shouldn't matter if it did.

That's true. It shouldn't matter if the socket.new/io.c routine was copied over
the new io.c. Somehow I was thinking in reverse. It would only happen if somehow
he was using the old libes/raster/fifo.orig/io.c version with the new
src/display/devices/* code compiled with USE_G_SOCKS.

Or maybe libraster didn't get removed.

The thing that made me think of this was the errors reported when the ipc version of
XDRIVER was mixed with the fifo version of io.c.

> The old libes/raster/io.c should work fine with the new XDRIVER, and
> vice-versa. The protocol hasn't changed, and nothing on the client
> side depends upon the code in src/display/devices.

The io.c routines varied between fifo/ipc/sockets. That's how the clients communicate
with the display drivers. They all have to be in sync with code from
src/display/devices. That's why I mentioned it. All of the d.mon/pgms programs
link to rasterlib. If there was an error in recreating this lib, then the clients
won't be communicating on the same pipe/socket.

> The only thing that changed regarding src/libes/raster/io.c was that
> the fifo.orig/socket.new versions were merged into a single file which
> used #ifdef USE_G_SOCKS to determine which code to use.

Right....But, if he's not using the new io.c code (or the libraster version of
that....)?

> The changes in src/display/devices were much more substantial.
> Previously, each driver had its own version of the transport code
> (connect.c and bits of SWITCHER.c); well, apart from the fact that the
> sockets stuff was only implemented for XDRIVER; the other drivers
> wouldn't work if you used sockets.

Sockets stuff was in src/display/lib for the other drivers. It worked for me when I
tested (only tested a little). Now you have both XDRIVER and all other drivers in
sync through one set of communication routines. A HUGE improvement.

> Merging this wasn't all that straightforward, due to the need for XDRIVER
> to process X events while waiting for a connection.

No. I think you did a great job. This will make everyone's maintenance a lot
easier and, more importantly, prevent fixes to one driver that don't get applied to
other drivers.

I've looked at the code changes you made and am amazed at how cleanly you did it.
It will make a huge difference for anyone trying to add new drivers, or even
understanding the drivers.

Not only did you merge common code into common routines, you merged almost all of
the sockets and fifo code into common routines - where they can be easily
distinguished.

Markus and I have already reported success with this new version. I think your code
changes are very successful. I'm just suggesting what may have gone wrong for
Justin.

Malcolm

--
Glynn Clements <glynn.clements@virgin.net>

Malcolm Blue wrote:

> > The old libes/raster/io.c should work fine with the new XDRIVER, and
> > vice-versa. The protocol hasn't changed, and nothing on the client
> > side depends upon the code in src/display/devices.
>
> The io.c routines varied between fifo/ipc/sockets. That's how the clients communicate
> with the display drivers. They all have to be in sync with code from
> src/display/devices. That's why I mentioned it. All of the d.mon/pgms programs
> link to rasterlib. If there was an error in recreating this lib, then the clients
> won't be communicating on the same pipe/socket.

We can ignore such possibilities; the "no response from graphics
monitor" message can only be printed after a successful connect.

> > The only thing that changed regarding src/libes/raster/io.c was that
> > the fifo.orig/socket.new versions were merged into a single file which
> > used #ifdef USE_G_SOCKS to determine which code to use.
>
> Right....But, if he's not using the new io.c code (or the libraster version of
> that....)?

If he were trying to use the wrong transport, he'd get a different
error message.

> > The changes in src/display/devices were much more substantial.
> > Previously, each driver had its own version of the transport code
> > (connect.c and bits of SWITCHER.c); well, apart from the fact that the
> > sockets stuff was only implemented for XDRIVER; the other drivers
> > wouldn't work if you used sockets.
>
> Sockets stuff was in src/display/lib for the other drivers.

Oh yes, that's right; there were two SWITCHER.c's (lib and
XDRIVER/XDRIVER24), and two versions of each of them. Only XDRIVER had
multiple versions of connect.c, but the socket code didn't use it.

> > Merging this wasn't all that straightforward, due to the need for XDRIVER
> > to process X events while waiting for a connection.
>
> No. I think you did a great job.

"Did" is maybe a tad premature at this point; if I can solve Justin's
problems, we can use past tense ;-)

> This will make everyone's maintenance a lot
> easier and, more importantly, prevent fixes to one driver that don't get applied to
> other drivers.

Sure; otherwise I probably would have left it alone, with the release
approaching.

Similar reasoning also applies to the batch of changes which I've just
committed (sanitising coordinate semantics), although none of those
are likely to cause total failure.

In a lot of cases, I couldn't figure out whether something should be
"<" or "<=", or where there should be a +/- 1, and it looked like
others before me didn't know either. So I made everything use "usual"
semantics, i.e. top/left inclusive, bottom/right exclusive.
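
As a small illustration of that convention (a sketch only, not the code that
was committed; the struct and function names are invented here), a rectangle
with inclusive top/left and exclusive bottom/right edges needs no +/- 1
adjustments anywhere:

```c
/* A rectangle with half-open semantics: left/top are inside,
 * right/bottom are one past the last pixel.
 * The struct and function names are illustrative only. */
struct rect {
    int left, top, right, bottom;
};

/* Width and height are plain differences; no +1 is needed. */
int rect_width(const struct rect *r)  { return r->right - r->left; }
int rect_height(const struct rect *r) { return r->bottom - r->top; }

/* A point is inside if left <= x < right and top <= y < bottom. */
int rect_contains(const struct rect *r, int x, int y)
{
    return x >= r->left && x < r->right &&
           y >= r->top  && y < r->bottom;
}
```

With this convention, two rectangles that share an edge (one's right equal
to the other's left) never claim the same pixel, and every comparison on the
upper bound is a strict "<".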

--
Glynn Clements <glynn.clements@virgin.net>
