[GRASS-dev] Vect_*_fatal_error()

Hi all,

the vector library still uses in GRASS7 GV_FATAL_EXIT/PRINT/RETURN
defines, see Vect_set_fatal_error() for details. It was introduced in
r10426.

I know that this topic "G_fatal_error() and exit() call" has been
discussed several times in the past. The GRASS libraries has been
designed for Unix-like modules, which just starts, do processing and
terminates. This absolutely acceptable reason why G_fatal_error()
should call exit() when fatal_error appears. On the other hand I still
think that the programmer should have some possibility to avoid
calling exit() on G_fatal_error() of course on his/her own risk.
Anyway this option should be there with BIG warning at least but it
should be there...

Please review the attach patch which moves Vect_*_fatal_error() to the gislib.

Martin

[1] http://trac.osgeo.org/grass/changeset/10426/grass/trunk/lib/vector/Vlib/error.c

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa

(attachments)

fatal_error.diff (2.52 KB)

Martin wrote:

On the other hand I still think that the programmer should have some
possibility to avoid calling exit() on G_fatal_error() of course on
his/her own risk. Anyway this option should be there with BIG warning
at least but it should be there...

if it can be recovered from it's not fatal, therefore G_fatal_error() wasn't
appropriate to use in the first place.

perhaps you want a near-clone called G_nonfatal_error() which does the
same without the exit()? but is that just a semantics game between the
definition of what is a "warning" and what it means to be an "error"?

I would not be a fan of library-specific error handling anyway, make what-
ever solution available for all from libgis.

what's the practical use case you've found which could use it?

Hamish

Hi,

2011/11/10 Hamish <hamish_b@yahoo.com>:

what's the practical use case you've found which could use it?

it's very very easy. The GRASS libraries has been originally designed
for GRASS modules (read command parameters, do processing,
terminates). Unfortunately there are applications which using GRASS
libraries and they do not behaves like GRASS modules. It started with
QGIS plugin for GRASS and it continues with all wxGUI components which
are using GRASS libraries via ctypes. Just try to use vector digitizer
or wxNViz, the whole GUI just crashes without any kind of message when
G_fatal_error() is called. Well due to design, anyway very user
friendly. I am not a fan of G_set_fatal_error() solution, on the other
hand I don't see any reason why to *forbid* so strictly the
programmers which are using GRASS libraries to avoid calling system
exit on G_fatal_error(). The default behaviour will remain! We just
let other programmers (with a big warning) to use GRASS libraries for
the purpose which they were not originally designed nothing more.

Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa

Martin Landa wrote:

the vector library still uses in GRASS7 GV_FATAL_EXIT/PRINT/RETURN
defines, see Vect_set_fatal_error() for details. It was introduced in
r10426.

I know that this topic "G_fatal_error() and exit() call" has been
discussed several times in the past. The GRASS libraries has been
designed for Unix-like modules, which just starts, do processing and
terminates. This absolutely acceptable reason why G_fatal_error()
should call exit() when fatal_error appears. On the other hand I still
think that the programmer should have some possibility to avoid
calling exit() on G_fatal_error() of course on his/her own risk.
Anyway this option should be there with BIG warning at least but it
should be there...

Please review the attach patch which moves Vect_*_fatal_error() to the gislib.

The G_set_fatal_error() concept is broken. Code which calls
G_fatal_error() expects it to be fatal. Other code isn't entitled to
have a say in that.

You appear to have overlooked this:

  void G_fatal_error(const char *, ...) __attribute__ ((format(printf, 1, 2)))
      __attribute__ ((noreturn));

I'm talking about the "__attribute__ ((noreturn))" part. That is
absolutely fundamental to the behaviour of G_fatal_error() and cannot
be changed.

--
Glynn Clements <glynn@gclements.plus.com>

Martin Landa wrote:

it continues with all wxGUI components which are using GRASS

libraries via ctypes.

The number of such components should be zero.

Just try to use vector digitizer
or wxNViz, the whole GUI just crashes without any kind of message when
G_fatal_error() is called.

Yep. The solution has been made clear on multiple occasions: don't
access GRASS libraries from the main GUI process. wxNVIZ in particular
should absolutely be a separate process (OpenGL drivers are not noted
for their stability).

Even if the exit()-on-fatal-error issue magically disappeared, the
memory management issues (i.e. don't bother tracking allocations which
don't matter for a typical module) won't. Nor will the fact that
"try ... except" won't catch a segfault.

grass.lib.* exists for "module-style" scripts, not for persistent
applications.

Well due to design, anyway very user
friendly. I am not a fan of G_set_fatal_error() solution, on the other
hand I don't see any reason why to *forbid* so strictly the
programmers which are using GRASS libraries to avoid calling system
exit on G_fatal_error(). The default behaviour will remain! We just
let other programmers (with a big warning) to use GRASS libraries for
the purpose which they were not originally designed nothing more.

Allowing G_fatal_error() to return isn't going to stop wxGUI, QGIS,
etc terminating unexpectedly. It's just going to make them terminate
via a segfault rather than via exit().

If a function calls G_fatal_error(), it isn't expecting it to return.
If it does return, the usual result will be that the caller attempts
to use invalid data; a crash isn't likely to be far off.

A typical example of G_fatal_error() usage is:

  FILE *fp = fopen(...);
  if (!fp)
      G_fatal_error(...);
  fread(..., fp);

Allowing G_fatal_error() to return will simply replace the exit() with
a segfault. Why would this be an improvement?

If you want to (attempt to) recover from fatal errors, you have to
escape from G_fatal_error() via a longjmp(). Either that, or you have
to re-write every single use of G_fatal_error() (~3800 of them) to
accomodate the possibility of it returning.

Except that the latter isn't really an acceptable solution, as it
pushes the burden of error handling on to everything else. You might
consider making your job easier justifies making everyone else's
harder, but don't expect everyone else to agree.

So long as I have commit privileges, G_fatal_error() is going to
retain (and honour) the "noreturn" attribute.

Aside: In 7.0, I changed a number of library functions which had
previously returned a status code to call G_fatal_error() instead.
This meant eliminating the status-checking from the callers, as the
return type changed from "int" to "void".

Which lead me to discover that, for modules, actually bothering to
check the status code tended to be the exception rather than the rule.
IOW, even with most functions using G_fatal_error() to eliminate the
need to check status codes, the burden of dealing with the few which
didn't was still too much.

--
Glynn Clements <glynn@gclements.plus.com>

Nowadays there is wxGUI code which uses calls via ctypes. Would it by
possible to temporary change behaviour of G_fatal_error() to enable
redirect module output back to console and then print message and do
exit? (AFAIK, now an error message is showed in GUI and then GUI
fails, so user and programmer don't see the message.)

I know that this solution doesn't follow the idea of calling modules
(and not using library directly), but now it seems to be much easier
to do this change than rewrite parts of wxGUI which uses ctypes.

Wenzeslaus wrote:

Nowadays there is wxGUI code which uses calls via ctypes. Would it by
possible to temporary change behaviour of G_fatal_error() to enable
redirect module output back to console and then print message and do
exit? (AFAIK, now an error message is showed in GUI and then GUI
fails, so user and programmer don't see the message.)

Maybe. G_fatal_error() could (try to) write to /dev/tty rather than
stderr. But that won't work if there isn't a controlling tty (and may
be annoying if you have intentionally redirected stderr).

Or G__gisinit() could store fileno(stderr) at start-up; but that won't
work if the GUI closes stderr when it redirects it. More robust would
be e.g.:

  error_fp = fdopen(dup(fileno(stderr)), "w");

I know that this solution doesn't follow the idea of calling modules
(and not using library directly), but now it seems to be much easier
to do this change than rewrite parts of wxGUI which uses ctypes.

That needs to be done anyhow. The longer people ignore this fact and
hope that it somehow just goes away, the more work will ultimately be
required to fix it.

Note that the problem isn't the use of ctypes per se, but its use in
the main GUI process. Something like nviz can't avoid using ctypes,
but it needs to be a separate process (i.e. the GUI should "execute"
the nviz component, not "import" it). That way, a fatal error or a
signal only kills the nviz component, not the whole GUI (well, unless
it's a video driver bug; those frequently kill the X server and
occasionally the kernel, so whether or not it kills the GUI is moot).

--
Glynn Clements <glynn@gclements.plus.com>

Hi,
i respect the G_fatal_error() design decision and i think it is a very
good decision for a C based library which should be used to implement
processing modules. What puzzles me is, that interactive vector
digitizing is IMHO hard to implement using this design decision. I
think that was the reason that many functions of the vector library
have return values rather than calling G_fatal_error().

As Glynn stated you will face unexpected behavior when forcing
G_fatal_error() not exiting, because of the not existing memory
management/error handling in the library design after a
G_fatal_error() call.

IMHO one solution is to keep editing functions in the vector library
G_fatal_error() free, implementing a robust error handling and memory
management for such functions. Higher level functions which should be
used in processing modules should implement the G_fatal_error()
behavior. The core vector/gis library functions should still use
G_fatal_error().

The next point is to have a separate vector digitizing process for the
wxGUI digitizer. How should this work? Using RPC to have access to the
vector library in wx C++ code? Implementing special editing modules
... which make vector editing IMHO very slow?

Best regards
Soeren

2011/11/12 Glynn Clements <glynn@gclements.plus.com>:

Martin Landa wrote:

it continues with all wxGUI components which are using GRASS

libraries via ctypes.

The number of such components should be zero.

Just try to use vector digitizer
or wxNViz, the whole GUI just crashes without any kind of message when
G_fatal_error() is called.

Yep. The solution has been made clear on multiple occasions: don't
access GRASS libraries from the main GUI process. wxNVIZ in particular
should absolutely be a separate process (OpenGL drivers are not noted
for their stability).

Even if the exit()-on-fatal-error issue magically disappeared, the
memory management issues (i.e. don't bother tracking allocations which
don't matter for a typical module) won't. Nor will the fact that
"try ... except" won't catch a segfault.

grass.lib.* exists for "module-style" scripts, not for persistent
applications.

Well due to design, anyway very user
friendly. I am not a fan of G_set_fatal_error() solution, on the other
hand I don't see any reason why to *forbid* so strictly the
programmers which are using GRASS libraries to avoid calling system
exit on G_fatal_error(). The default behaviour will remain! We just
let other programmers (with a big warning) to use GRASS libraries for
the purpose which they were not originally designed nothing more.

Allowing G_fatal_error() to return isn't going to stop wxGUI, QGIS,
etc terminating unexpectedly. It's just going to make them terminate
via a segfault rather than via exit().

If a function calls G_fatal_error(), it isn't expecting it to return.
If it does return, the usual result will be that the caller attempts
to use invalid data; a crash isn't likely to be far off.

A typical example of G_fatal_error() usage is:

   FILE \*fp = fopen\(\.\.\.\);
   if \(\!fp\)
       G\_fatal\_error\(\.\.\.\);
   fread\(\.\.\., fp\);

Allowing G_fatal_error() to return will simply replace the exit() with
a segfault. Why would this be an improvement?

If you want to (attempt to) recover from fatal errors, you have to
escape from G_fatal_error() via a longjmp(). Either that, or you have
to re-write every single use of G_fatal_error() (~3800 of them) to
accomodate the possibility of it returning.

Except that the latter isn't really an acceptable solution, as it
pushes the burden of error handling on to everything else. You might
consider making your job easier justifies making everyone else's
harder, but don't expect everyone else to agree.

So long as I have commit privileges, G_fatal_error() is going to
retain (and honour) the "noreturn" attribute.

Aside: In 7.0, I changed a number of library functions which had
previously returned a status code to call G_fatal_error() instead.
This meant eliminating the status-checking from the callers, as the
return type changed from "int" to "void".

Which lead me to discover that, for modules, actually bothering to
check the status code tended to be the exception rather than the rule.
IOW, even with most functions using G_fatal_error() to eliminate the
need to check status codes, the burden of dealing with the few which
didn't was still too much.

--
Glynn Clements <glynn@gclements.plus.com>
_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

Sören Gebbert wrote:

i respect the G_fatal_error() design decision and i think it is a very
good decision for a C based library which should be used to implement
processing modules. What puzzles me is, that interactive vector
digitizing is IMHO hard to implement using this design decision.

It's only a problem if fatal errors actually occur. Sanitising user
input may help with this.

The Python community has the terms EAFP and LBYL, acronyms for "Easier
to Ask for Forgiveness than Permission" and "Look Before You Leap"
respectively. EAFP describes the try-except idiom, where you ensure
that code deals with errors, while LBYL refers to validating inputs to
ensure that errors don't occur in the first place.

GRASS' "fatal" error concept makes EAFP problematic, so some degree of
LBYL is essential for a persistent application.

As Glynn stated you will face unexpected behavior when forcing
G_fatal_error() not exiting, because of the not existing memory
management/error handling in the library design after a
G_fatal_error() call.

Actually, "unexpected behaviour" is a problem with longjmp()ing out of
a fatal error handler. If you were to just allow the error handler to
return, the behaviour would be much more straightforward: you'll get a
segfault when you dereference the null pointer which was the reason
for signalling a fatal error in the first place. Or when you try to
read pointers or array indices from uninitialised memory (where the
failure of whichever function was supposed to initialise it was the
reason for the fatal error).

If you don't want fatal errors, you'll have to re-write the libraries
not to use G_fatal_error(). Just making G_fatal_error() itself return
isn't going to achieve anything useful.

IMHO one solution is to keep editing functions in the vector library
G_fatal_error() free, implementing a robust error handling and memory
management for such functions.

The main problem with this is that it clutters up the code. Normally,
I'd suggest still calling G_fatal_error(), but performing any
necessary clean-up beforehand, based upon the assumption that the
process will be using setjmp/longjmp to recover from the error.

Except .. that's going to need some support, as you can't use setjmp()
directly from Python (the jmp_buf becomes invalid once you return from
the function which "called" setjmp()). We would need something like:

  static jmp_buf jbuf;

  static void the_handler(void *unused)
  {
      longjmp(jbuf, 1);
  }

  void *G_safe_execute(void *(*func)(void *), void *closure)
  {
      G_add_error_handler(the_handler, NULL);
      if (setjmp(jbuf) == 0) {
          G_remove_error_handler(the_handler, NULL);
          return (*func)(closure);
      }
      else {
          G_remove_error_handler(the_handler, NULL);
          return NULL;
      }
  }

The next point is to have a separate vector digitizing process for the
wxGUI digitizer. How should this work? Using RPC to have access to the
vector library in wx C++ code? Implementing special editing modules
... which make vector editing IMHO very slow?

No. Just make the digitiser a separate process which imports
grass.lib.vector etc. If it crashes, you just lose the digitiser
rather than losing the entire GUI.

Using RPC in Python is quite straightforward with the "pickle" module,
but I don't think that it would help much. A fatal error in the RPC
server will kill the RPC server; having the client survive only helps
if it can recover from that.

--
Glynn Clements <glynn@gclements.plus.com>

On 14/11/11 21:53, Glynn Clements wrote:

Sören Gebbert wrote:

The next point is to have a separate vector digitizing process for the
wxGUI digitizer. How should this work? Using RPC to have access to the
vector library in wx C++ code? Implementing special editing modules
... which make vector editing IMHO very slow?

No. Just make the digitiser a separate process which imports
grass.lib.vector etc. If it crashes, you just lose the digitiser
rather than losing the entire GUI.

Jumping in out of no where here, but I would like to understand the fundamental issue here: would using a separate process imply having to use a separate display ? The whole idea of the current integration in the wxgui as I understand it is to allow digitizing a map in the current map display. If this is not possible with a separate process, then the debate should probably be about that...

I like the way the new digitizer integrates into the map display, but I find even more important to not fundamentally modify the logic of grass as a collection of small modules. So if the former would imply the latter, then I would prefer dropping that and keeping a separate window for elements such as the digitzer and nviz.

Moritz

Hi,

2011/11/12 Glynn Clements <glynn@gclements.plus.com>:

Even if the exit()-on-fatal-error issue magically disappeared, the
memory management issues (i.e. don't bother tracking allocations which
don't matter for a typical module) won't. Nor will the fact that
"try ... except" won't catch a segfault.

grass.lib.* exists for "module-style" scripts, not for persistent
applications.

that's right, but we just shouldn't forbid to use GRASS libraries for
such applications (for which the GRASS libraries were not designed).
As we all know they were designed for the GRASS modules. I just think
that it's very very bad political decision to *forbid* the programmer
to changes behaviour of G_fatal_error(), or better to say to avoid
calling exit() if he/she really wants to go on even if there was a
fatal error. It's not over business, we just say, that the GRASS
libraries has been designed for that cases, so you will remain in
undefined stage and you should force the user to quit the whole
application or so. The reason why I raised this issue again I think
that giving no chance to change behaviour of G_fatal_error() is just
ignoring and making life very difficult to non-GRASS or currently
wxGUI programmers. I really think that we should give the programmer
possibility to override G_fatal_error() by something which is not
calling exit() if he/she really needed and let him/her know that the
GRASS libraries remains in undefined stage. It's not over business,
it's up to the programmer.

Well due to design, anyway very user
friendly. I am not a fan of G_set_fatal_error() solution, on the other
hand I don't see any reason why to *forbid* so strictly the
programmers which are using GRASS libraries to avoid calling system
exit on G_fatal_error(). The default behaviour will remain! We just
let other programmers (with a big warning) to use GRASS libraries for
the purpose which they were not originally designed nothing more.

Allowing G_fatal_error() to return isn't going to stop wxGUI, QGIS,
etc terminating unexpectedly. It's just going to make them terminate
via a segfault rather than via exit().

Not always in every case. What about Rast_open_old() which calls also
G_fatal_error(). If you try in that application to open raster map
(which fails) the application just crashes without no message. Good;-)
You have no possibility to show message what failed why it failed and
eg. force the user to quit the application if needed.

If a function calls G_fatal_error(), it isn't expecting it to return.
If it does return, the usual result will be that the caller attempts
to use invalid data; a crash isn't likely to be far off.

A typical example of G_fatal_error() usage is:

   FILE \*fp = fopen\(\.\.\.\);
   if \(\!fp\)
       G\_fatal\_error\(\.\.\.\);
   fread\(\.\.\., fp\);

Allowing G_fatal_error() to return will simply replace the exit() with
a segfault. Why would this be an improvement?

The improvement is that you can show at least some error message and
to quit application with knowing user that the fatal error appeared,
where appeared, etc. In the most cases it will be bug in the
application, at least the application will just not crash without any
message.

If you want to (attempt to) recover from fatal errors, you have to
escape from G_fatal_error() via a longjmp(). Either that, or you have
to re-write every single use of G_fatal_error() (~3800 of them) to
accomodate the possibility of it returning.

I would prefer to modify G_fatal_error() rather then use longjmp()
which is just basically a workaround. Moreover the application can
still fail in the same way. It's not a safe solution.

Except that the latter isn't really an acceptable solution, as it
pushes the burden of error handling on to everything else. You might
consider making your job easier justifies making everyone else's
harder, but don't expect everyone else to agree.

So long as I have commit privileges, G_fatal_error() is going to
retain (and honour) the "noreturn" attribute.

WHY SO? That are very very strong words to me! I thought that the
GRASS project is still community based and nobody has a veto
privilege. Probably I am wrong. Please consider also my reasoning.

Aside: In 7.0, I changed a number of library functions which had
previously returned a status code to call G_fatal_error() instead.
This meant eliminating the status-checking from the callers, as the
return type changed from "int" to "void".

Right, as for Rast_open_old(), the result is that the application for
which the GRASS library hasn't been designed eg. the wxGUI just
crashes without no message, eg. if raster map doesn't exits. Is it
really fatal? Even I don't consider as fatal error if you open raster
and it fails. For GRASS modules it just terminate the process, for
long running application it means complete crash. This is bad. It just
avoids need to call before G_find_raster() in normal GRASS modules and
make it harder for long running application.

Which lead me to discover that, for modules, actually bothering to
check the status code tended to be the exception rather than the rule.
IOW, even with most functions using G_fatal_error() to eliminate the
need to check status codes, the burden of dealing with the few which
didn't was still too much.

[...]

Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa

Hi,

2011/11/12 Wenzeslaus <wenzeslaus@gmail.com>:

I know that this solution doesn't follow the idea of calling modules
(and not using library directly), but now it seems to be much easier
to do this change than rewrite parts of wxGUI which uses ctypes.

I am afraid that rewriting such wxGUI components with the number of
active developers (GRASS or better to wxGUI only) it's quite
unrealistic task. Moreover it would go against the integration
efforts. It would basically mean to separate eg. vector digitizer to
the standalone application similarly how worked old-fashioned
`v.digit` module. It means to make step(s) back (in my POV) and to
split the whole wxGUI to more standalone applications.

Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa

Hi Martin,

2011/11/17 Martin Landa <landa.martin@gmail.com>:

Hi,

2011/11/12 Glynn Clements <glynn@gclements.plus.com>:

Even if the exit()-on-fatal-error issue magically disappeared, the
memory management issues (i.e. don't bother tracking allocations which
don't matter for a typical module) won't. Nor will the fact that
"try ... except" won't catch a segfault.

grass.lib.* exists for "module-style" scripts, not for persistent
applications.

that's right, but we just shouldn't forbid to use GRASS libraries for
such applications (for which the GRASS libraries were not designed).

This is not forbidden ... but it is a very bad idea to do so.

As we all know they were designed for the GRASS modules. I just think
that it's very very bad political decision to *forbid* the programmer
to changes behaviour of G_fatal_error(), or better to say to avoid
calling exit() if he/she really wants to go on even if there was a
fatal error. It's not over business, we just say, that the GRASS
libraries has been designed for that cases, so you will remain in
undefined stage and you should force the user to quit the whole
application or so. The reason why I raised this issue again I think
that giving no chance to change behaviour of G_fatal_error() is just
ignoring and making life very difficult to non-GRASS or currently
wxGUI programmers. I really think that we should give the programmer
possibility to override G_fatal_error() by something which is not
calling exit() if he/she really needed and let him/her know that the
GRASS libraries remains in undefined stage. It's not over business,
it's up to the programmer.

You already can override G_fatal_error() to avoid exit() by using
G_set/add_error_routine() using longjmp and setjmp. The
vtk-grass-bridge uses this feature to catch G_fatal_error() exit()
calls. Due to the undefined state of the library after calling
G_fatal_error(), this is only useful to prepare meaningful error
messages and pass them to the persistent application to inform the
user to exit the persistent application or to restart it.

IMHO the main wxGUI should not make use of library functions which
call G_fata_error() directly, it should use grass modules or start
Python processes which call grass library functions. These processes
can make use of Glynns Python longjmp approach to pass meaningful
error messages to the wxGUI before they get restarted by the wxGUI.

Well due to design, anyway very user
friendly. I am not a fan of G_set_fatal_error() solution, on the other
hand I don't see any reason why to *forbid* so strictly the
programmers which are using GRASS libraries to avoid calling system
exit on G_fatal_error(). The default behaviour will remain! We just
let other programmers (with a big warning) to use GRASS libraries for
the purpose which they were not originally designed nothing more.

Allowing G_fatal_error() to return isn't going to stop wxGUI, QGIS,
etc terminating unexpectedly. It's just going to make them terminate
via a segfault rather than via exit().

Not always in every case. What about Rast_open_old() which calls also

But in most cases.

G_fatal_error(). If you try in that application to open raster map
(which fails) the application just crashes without no message. Good;-)
You have no possibility to show message what failed why it failed and
eg. force the user to quit the application if needed.

You have.

If a function calls G_fatal_error(), it isn't expecting it to return.
If it does return, the usual result will be that the caller attempts
to use invalid data; a crash isn't likely to be far off.

A typical example of G_fatal_error() usage is:

   FILE \*fp = fopen\(\.\.\.\);
   if \(\!fp\)
       G\_fatal\_error\(\.\.\.\);
   fread\(\.\.\., fp\);

Allowing G_fatal_error() to return will simply replace the exit() with
a segfault. Why would this be an improvement?

The improvement is that you can show at least some error message and
to quit application with knowing user that the fatal error appeared,
where appeared, etc. In the most cases it will be bug in the
application, at least the application will just not crash without any
message.

If you want to (attempt to) recover from fatal errors, you have to
escape from G_fatal_error() via a longjmp(). Either that, or you have
to re-write every single use of G_fatal_error() (~3800 of them) to
accomodate the possibility of it returning.

I would prefer to modify G_fatal_error() rather then use longjmp()
which is just basically a workaround. Moreover the application can
still fail in the same way. It's not a safe solution.

Using longjmp() is not a work around, its IMHO the only meaningful solution.
I would like to cite Glynn again:
" If you want to (attempt to) recover from fatal errors, you have to
escape from G_fatal_error() via a longjmp(). Either that, or you have
to re-write every single use of G_fatal_error() (~3800 of them) to
accomodate the possibility of it returning."

Except that the latter isn't really an acceptable solution, as it
pushes the burden of error handling on to everything else. You might
consider making your job easier justifies making everyone else's
harder, but don't expect everyone else to agree.

So long as I have commit privileges, G_fatal_error() is going to
retain (and honour) the "noreturn" attribute.

WHY SO? That are very very strong words to me! I thought that the
GRASS project is still community based and nobody has a veto
privilege. Probably I am wrong. Please consider also my reasoning.

These are indeed strong words. But i also think its a very bad idea to
change this design decision.
I was starting the discussion some time ago to prepare GRASS libraries
for persistent applications.
I learned and have changed my mind. Because the fundamental design of
GRASS and therefor all implemented modules depend on the exit() on
fatal error design decision. It will be almost impossible to change
this if you don't want to rewrite most of the code.

A solution is to use processes and communicate between them with
serialized objects. Thats "easy" in Python and Java but quite hard in
C/C++.

Consider the main wxGUI as the master process, able to start, stop and
restart subprocesses. These subprocesses are doing specific tasks
(vector editing, raster cell editing, ...) by accessing the GRASS
library directly and using longjmp() to report to the master process
in case of a G_fatal_error() call.

Best regards
Soeren

Hi,

2011/11/17 Sören Gebbert <soerengebbert@googlemail.com>:

grass.lib.* exists for "module-style" scripts, not for persistent
applications.

that's right, but we just shouldn't forbid to use GRASS libraries for
such applications (for which the GRASS libraries were not designed).

This is not forbidden ... but it is a very bad idea to do so.

well, word "forbidden" is not right in this sense, I meant "to make it
harder as much as possible".

Consider the main wxGUI as the master process, able to start, stop and
restart subprocesses. These subprocesses are doing specific tasks
(vector editing, raster cell editing, ...) by accessing the GRASS
library directly and using longjmp() to report to the master process
in case of a G_fatal_error() call.

Probably due to very limit time which I currently have for wxGUI
development I was inclining to the quick, but probably good solution.
I will investigate more as I will have some time for that.

Thanks for clarification, Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa

Hi,

2011/11/14 Glynn Clements <glynn@gclements.plus.com>:

Except .. that's going to need some support, as you can't use setjmp()
directly from Python (the jmp_buf becomes invalid once you return from
the function which "called" setjmp()). We would need something like:

   static jmp\_buf jbuf;

   static void the\_handler\(void \*unused\)
   \{
       longjmp\(jbuf, 1\);
   \}

   void \*G\_safe\_execute\(void \*\(\*func\)\(void \*\), void \*closure\)
   \{
       G\_add\_error\_handler\(the\_handler, NULL\);
       if \(setjmp\(jbuf\) == 0\) \{
           G\_remove\_error\_handler\(the\_handler, NULL\);
           return \(\*func\)\(closure\);
       \}
       else \{
           G\_remove\_error\_handler\(the\_handler, NULL\);
           return NULL;
       \}
   \}

this could be useful addition to gislib library. +1 to add
G-safe_execute() to gislib.

Thanks, Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa

Martin Landa wrote:

> I know that this solution doesn't follow the idea of calling modules
> (and not using library directly), but now it seems to be much easier
> to do this change than rewrite parts of wxGUI which uses ctypes.

I am afraid that rewriting such wxGUI components with the number of
active developers (GRASS or better to wxGUI only) it's quite
unrealistic task.

Leaving things as it is and hoping that the reliability issues "go
away" is even more unrealistic. At least a re-write will eventually
result in a solution.

Moreover it would go against the integration
efforts. It would basically mean to separate eg. vector digitizer to
the standalone application similarly how worked old-fashioned
`v.digit` module. It means to make step(s) back (in my POV) and to
split the whole wxGUI to more standalone applications.

The alternative is something which cannot help but be unreliable.

--
Glynn Clements <glynn@gclements.plus.com>

Martin Landa wrote:

> If a function calls G_fatal_error(), it isn't expecting it to return.
> If it does return, the usual result will be that the caller attempts
> to use invalid data; a crash isn't likely to be far off.
>
> A typical example of G_fatal_error() usage is:
>
> FILE *fp = fopen(...);
> if (!fp)
> G_fatal_error(...);
> fread(..., fp);
>
> Allowing G_fatal_error() to return will simply replace the exit() with
> a segfault. Why would this be an improvement?

The improvement is that you can show at least some error message and
to quit application with knowing user that the fatal error appeared,
where appeared, etc. In the most cases it will be bug in the
application, at least the application will just not crash without any
message.

Do you actually not understand the problem here? The above example
will segfault immediately after G_fatal_error() returns.

G_fatal_error() does two things:

1. Prints an error message.
2. Prevents execution of subsequent code.

Of these, #2 is the important property; #1 is secondary.

When code calls G_fatal_error(), it isn't simply asking for a
diagnostic to be printed. Far more importantly, it both expects it to
abort execution of the current function (like a "return" statement),
and also to alleviate it of the responsibility for informing the
caller that it hasn't succeeded.

If we were using C++, the solution would be to raise an exception. If
the code wants to handle errors, it can catch the exception, otherwise
it can let it propagate up until either something does catch it or it
terminates the process.

But we aren't using C++, and that isn't likely to change. Apart from a
significant increase in compilation times, C++ is a far more complex
language than C, which is likely to be a problem when some of the
GRASS developers barely understand C.

The closest thing that C has to exceptions is longjmp() (which is
actually how exceptions were implemented on the earliest C++
compilers).

--
Glynn Clements <glynn@gclements.plus.com>

Sören Gebbert wrote:

IMHO the main wxGUI should not make use of library functions which
call G_fata_error() directly, it should use grass modules or start
Python processes which call grass library functions. These processes
can make use of Glynns Python longjmp approach to pass meaningful
error messages to the wxGUI before they get restarted by the wxGUI.

A separate process wouldn't need a longjmp(). Once it has passed the
necessary information to the parent, it can just allow the error
handler to return and the process to terminate.

The downside with RPC mechanisms is that even the most straightforward
function call is complicated by the need for error handling. Instead
of asking for the value of "2 + 2" and getting 4, you either get 4 or
"RPC server not responding, please try again".

This can be mitigated to an extent in Python by having a generic
wrapper which throws exceptions for RPC failures, so that code isn't
cluttered up by having to propagate status indications.

The limitation is that recovery may require not only restarting the
child process, but restoring its state (e.g. any maps which were open
in the old process will need to be re-opened before thay can be
referenced).

I learned and have changed my mind. Because the fundamental design of
GRASS and therefor all implemented modules depend on the exit() on
fatal error design decision. It will be almost impossible to change
this if you don't want to rewrite most of the code.

Actually, they depend upon the "noreturn" feature. It doesn't strictly
have to be exit(), but non-returning functions are in fairly short
supply. exit() is one; longjmp() is another, but it has to get a
jmp_buf from somewhere.

--
Glynn Clements <glynn@gclements.plus.com>

Moritz Lennert wrote:

> No. Just make the digitiser a separate process which imports
> grass.lib.vector etc. If it crashes, you just lose the digitiser
> rather than losing the entire GUI.
>

Jumping in out of no where here, but I would like to understand the
fundamental issue here: would using a separate process imply having to
use a separate display ?

Yes, although the two could be kept synchronised so that you wouldn't
really notice it.

I like the way the new digitizer integrates into the map display, but I
find even more important to not fundamentally modify the logic of grass
as a collection of small modules. So if the former would imply the
latter, then I would prefer dropping that and keeping a separate window
for elements such as the digitzer and nviz.

FWIW, I never really saw the point in making NVIZ use the same window
as the 2D display.

--
Glynn Clements <glynn@gclements.plus.com>

2011/11/17 Glynn Clements <glynn@gclements.plus.com>:

FWIW, I never really saw the point in making NVIZ use the same window
as the 2D display.

because it's just an elegant and intuitive approach from users POV,
and wxGUI has been originally designed for them. I would prefer to
keep the integration effort rather then separating functionalities in
separate windows, etc.

Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa