Hi Glynn,
thanks a lot for your response.
After reading some documentation and asking "silly" questions of
my poor informatics colleagues, I understand the concept of thread-local storage and
the setjmp()/longjmp() approach a bit better.
I suggest adding longjmp() to G_fatal_error().
The application should choose at runtime whether longjmp() is used
or not.
So G_fatal_error() will call either longjmp() or exit().
The setjmp() code goes into the application which calls the GRASS
library functions,
unless nested setjmp()/longjmp() calls are needed inside GRASS to clean
up data or
to close open file descriptors.
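A minimal sketch of what such a nested setjmp()/longjmp() could look like, assuming a pointer to the innermost active jump buffer is kept. All names here (current_handler, library_call(), application_call(), the open_files counter) are hypothetical and only illustrate the idea; none of them are GRASS code:

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical: points at the innermost active jump buffer. */
static jmp_buf *current_handler = NULL;
/* Hypothetical stand-in for a resource that must be released. */
static int open_files = 0;

/* Stand-in for a fatal error that jumps to the innermost handler. */
static void fatal(void)
{
    assert(current_handler != NULL); /* a handler must be installed */
    longjmp(*current_handler, 1);
}

static void inner_work(int fail)
{
    if (fail)
        fatal();
}

/* Library function which installs its own handler for clean-up. */
static void library_call(int fail)
{
    jmp_buf local;
    jmp_buf *outer = current_handler; /* not modified after setjmp() */

    open_files++;                     /* acquire the resource */
    current_handler = &local;
    if (setjmp(local) != 0) {
        open_files--;                 /* clean up ... */
        current_handler = outer;
        longjmp(*outer, 1);           /* ... then pass the error on */
    }
    inner_work(fail);
    current_handler = outer;          /* normal path: restore, release */
    open_files--;
}

/* Application side: the outermost setjmp(). Returns 0 or -1. */
int application_call(int fail)
{
    jmp_buf top;

    current_handler = &top;
    if (setjmp(top) != 0) {
        current_handler = NULL;
        return -1;
    }
    library_call(fail);
    current_handler = NULL;
    return 0;
}

int open_file_count(void)
{
    return open_files;
}
```

With this layout, application_call(1) returns -1 and the counter is back at 0 afterwards, because the inner handler released the resource before jumping on to the outer one.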
The Linux threaded errno definition scared me, so I have chosen a
different approach.
We define a thread-local macro and two extern variables in gis.h: a flag
that selects at runtime whether
G_fatal_error() calls exit() or longjmp(), and a thread-local jump buffer.
Example which works for me in my test code:
/*Thread local and setjmp() exception support*/
#include <setjmp.h>
#ifdef WIN32
#define Thread __declspec( thread )
#else
#define Thread __thread
#endif
extern Thread jmp_buf G_stack_buffer; /*to save the most important
CPU register for each thread*/
extern int G_long_run; /*Set to 1 to choose the setjmp() version of
G_fatal_error()*/
The G_long_run variable is initialized in gisinit.c, where
G_stack_buffer is defined as well:
int G_long_run;
Thread jmp_buf G_stack_buffer;
...
void G__gisinit(const char *version, const char *pgm)
{
    const char *mapset;
    if (initialized)
        return;
    G_long_run = 0;
...
The application has to set G_long_run right after
calling G_gisinit(), from a single thread (e.g. in a singleton).
Now we need to patch error.c to use longjmp() or exit():
void G_fatal_error(const char *msg, ...)
{
    va_list ap;
    va_start(ap, msg);
    vfprint_error(ERR, msg, ap);
    va_end(ap);
    if (G_long_run == 1)
        longjmp(G_stack_buffer, 1);
    else
        exit(EXIT_FAILURE);
}
The C++ application code may look like this:
extern "C" {
#include <grass/gis.h>
}
...
int G_long_run;
Thread jmp_buf G_stack_buffer;
vtkGRASSInit::vtkGRASSInit() {
    G_gisinit("vtkGRASSBridge");
    // Set the long run variable to provide long run support in grass libraries
    G_long_run = 1;
}
...
/*Open a vector map*/
...
if (!setjmp(G_stack_buffer))
{
    if (1 > Vect_open_new(&this->map, name, with_z))
    {
        fprintf(stderr, "class: %s line: %i Unable to open vector map <%s>.",
                this->GetClassName(), __LINE__, name);
        return false;
    }
} else {
    /* reached via longjmp() from G_fatal_error() */
    fprintf(stderr, "class: %s line: %i Unable to open vector map <%s>.",
            this->GetClassName(), __LINE__, name);
    return false;
}
...
That's all.
Is this approach OK, or too simple, or just naive?
If this is OK, I would like to test this approach to identify possible
nested setjmp()/longjmp()
calls in libgis, libraster and libvector.
Additionally, I will try to make most of the static variables thread-local.
Best regards
Soeren
2009/9/27 Glynn Clements <glynn@gclements.plus.com>:
Soeren Gebbert wrote:
> I'm not suggesting making all of the functions take a pointer to the
> state as a parameter, just making it thread-local.
Ok.
To my shame I have to admit that I had never heard of the thread-local
mechanism before.
After a quick look at Wikipedia I understand the principle and it sounds great!
This will speed things up a lot.
I guess we need to use the pthread version of thread-local to support
compilers other than gcc, and Windows too?
You would need to conditionalise it. The usual mechanism is like that
used for errno. In a single-threaded implementation, it's just a
variable. In a multi-threaded implementation, it's a macro which
expands to (*errno_location()), where errno_location() retrieves the
address using pthread_getspecific().
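A rough sketch of that errno-style conditionalisation, under stated assumptions: the USE_THREADS build flag, struct lib_state_t, lib_state_location() and the LIB_STATE macro are all made-up names for illustration, not existing GRASS or libc identifiers. Only the pthread calls are real API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical library state which should exist once per thread. */
struct lib_state_t {
    int error_count;
};

#ifdef USE_THREADS /* hypothetical build flag */
#include <pthread.h>

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() releases a thread's state when that thread exits */
    int rc = pthread_key_create(&state_key, free);

    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's state, allocating it on first use. */
static struct lib_state_t *lib_state_location(void)
{
    struct lib_state_t *st;

    pthread_once(&state_once, make_key);
    st = pthread_getspecific(state_key);
    if (st == NULL) {
        st = calloc(1, sizeof(*st));
        if (st == NULL)
            abort();
        pthread_setspecific(state_key, st);
    }
    return st;
}

/* Multi-threaded build: the "variable" is a macro, as with errno. */
#define LIB_STATE (*lib_state_location())

#else

/* Single-threaded build: it's just a variable. */
static struct lib_state_t lib_state_single;
#define LIB_STATE lib_state_single

#endif

/* Code using the state looks the same in both builds. */
int count_error(void)
{
    return ++LIB_STATE.error_count;
}
```

The point of the macro is that callers such as count_error() do not change between the two builds; only the definition of LIB_STATE does.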
> However, the error handling is probably a bigger issue. Pushing error
> handling onto the modules isn't an acceptable solution.
Indeed. This was the next issue I wanted to talk about.
> Simply allowing the fatal error handler to longjmp() out then resume
> using the GRASS libraries would be non-trivial, as you would have to
> repair any inconsistencies in the library state.
Is there an alternative to longjmp() and setjmp()?
Not really.
It seems to be quite complex; the man page warns about its usage.
And I have never used it before.
longjmp() is conceptually similar to raising an exception in C++,
while setjmp() is equivalent to establishing a try/catch block.
The details are quite simple if you understand how C is implemented in
terms of machine code. setjmp() essentially saves the most important
CPU registers (including the program counter, stack pointer, and frame
pointer), while longjmp() restores them. So setjmp() records the
current execution state while longjmp() restores it (similar to
save/load in a game).
Most of the complexities and warnings relate to potential interactions
with optimisation. Primarily, local variables in the function which
calls setjmp() aren't guaranteed to be restored to the correct value
by longjmp(). gcc warns you if this might occur. Using the "volatile"
qualifier can help here.
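A small sketch of the "volatile" point, with hypothetical names (fail(), count_attempts()): the counter is modified between setjmp() and longjmp() and read afterwards, which is exactly the case where a non-volatile local would have an indeterminate value.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical stand-in for a library call hitting a fatal error. */
static void fail(void)
{
    longjmp(env, 1);
}

/* Retry a failing action; returns the number of attempts made. */
int count_attempts(int max_attempts)
{
    /* Changed after setjmp() and read after longjmp(), so it is
       declared volatile to keep its value well-defined. */
    volatile int attempts = 0;

    assert(max_attempts > 0);

    if (setjmp(env) != 0) {
        /* we arrive here after every longjmp() */
        if (attempts >= max_attempts)
            return attempts;
    }
    attempts++;
    fail();
    return -1; /* not reached; fail() never returns */
}
```

Note that max_attempts does not need the qualifier, because it is never modified after the setjmp() call.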
The other caveat is that you can't "wrap" setjmp(). The saved state
ceases to be valid once you leave the function which called
setjmp(), so you can only call longjmp() from within a "descendant" of
the function which calls setjmp().
In terms of using it to recover from a fatal error, the usage would be
something along the lines of:
jmp_buf save;

int my_handler(const char *msg, int fatal) {
    print_error(msg, fatal);
    longjmp(save, 1);
    return 0; /* can't happen; longjmp() doesn't return */
}

void main_loop(void) {
    volatile int done;

    G_set_error_routine(my_handler);
    for (done = 0; !done; ) {
        if (setjmp(save) != 0)
            continue; /* fatal error happened */
        done = do_next_action();
    }
    G_unset_error_routine();
}
A common idiom is to call setjmp() in the top-level loop, at the
beginning of each "action", and have the fatal error handler call
longjmp(). If an error occurs during the execution of an action, the
longjmp() will jump back out to the main loop which can then process
the next action.
An example can be found in lib/driver/main.c (in 6.x), where setjmp()
and longjmp() are used to recover from SIGPIPE, so that the monitor
doesn't terminate if the client terminates prematurely.
> Allowing G_fatal_error() to return is enough work that it can probably
> be ruled out. Apart from changing every single call (I count 520
> references in lib/*), almost every public function would need two
> versions: one which returns an error code and one which treats errors
> as fatal (i.e. only returns upon success).
520 calls are indeed a lot. The raster and gis libraries together
have 70 calls and
the vector and db libraries have 190 calls.
Glynn, if you can point me to a concrete implementation concept, I
would like to start patching the gis, raster, vector and db libraries
in grass7.
Maybe we can use signals to set an error variable in the resume error function?
The least invasive approach is to perform clean-up before calling
G_fatal_error(), so that subsequent operations don't crash GRASS, and
rely upon the application registering an error handler which
longjmp()s out.
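As an illustration only, the pattern might look roughly like this. The fatal() function stands in for G_fatal_error() with a longjmp()ing handler installed, and read_something(), run_action() and the open_handles counter are hypothetical:

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

static jmp_buf recover;      /* set up by the application */
static int open_handles = 0; /* hypothetical library state */

/* Stand-in for G_fatal_error() with a longjmp()ing handler. */
static void fatal(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(recover, 1);
}

/* Hypothetical library function. */
static void read_something(int should_fail)
{
    open_handles++;         /* acquire a resource */
    if (should_fail) {
        open_handles--;     /* clean up BEFORE the fatal error ... */
        fatal("read failed");
    }
    open_handles--;         /* normal release */
}

/* Application side: returns 0 on success, -1 after a fatal error. */
int run_action(int should_fail)
{
    if (setjmp(recover) != 0)
        return -1;          /* ... so the state is consistent here */
    read_something(should_fail);
    assert(open_handles == 0);
    return 0;
}

int leaked_handles(void)
{
    return open_handles;
}
```

Because the library function releases what it holds before reporting the error, the application can keep calling into the library after run_action() has returned -1.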
G_fatal_error() can't be made to return, as that would break all of
the modules which use it. And library functions which don't return on
error can't be changed to return.
You *could* replace existing functions with a wrapper around a version
which returns on error. The original function would be modified to
return a status indication upon error, and the wrapper would just call
the modified version and call G_fatal_error() in the event of an
error. Functions which want to handle the error themselves would call
the lower-level function.
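A sketch of such a wrapper pair, using plain fopen() as the example. The names try_open_file(), open_file() and fatal() are hypothetical; fatal() merely stands in for G_fatal_error():

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for G_fatal_error(): print the message and terminate. */
static void fatal(const char *msg, const char *arg)
{
    fprintf(stderr, "ERROR: %s <%s>\n", msg, arg);
    exit(EXIT_FAILURE);
}

/* Lower-level version: reports failure through its return value. */
FILE *try_open_file(const char *path)
{
    assert(path != NULL);
    return fopen(path, "r"); /* NULL on failure */
}

/* Original interface, unchanged for callers: only returns on success. */
FILE *open_file(const char *path)
{
    FILE *fp = try_open_file(path);

    if (fp == NULL)
        fatal("Unable to open file", path);
    return fp;
}
```

Existing modules would keep calling open_file(), while an application which wants to handle the failure itself would call try_open_file() and check for NULL.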
The main problems here are:
1. The sheer number of such functions.
2. The functions may rely upon other functions which currently call
G_fatal_error(). So you would have to make similar changes to the
underlying functions, then modify the calling function to allow for
the fact that these functions can fail.
3. Reporting errors where the error message includes data from local
variables. One option here would be to give the underlying function a
"fatal" parameter, and add a G_error() function which takes an extra
parameter indicating whether to terminate.
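For point 3, a sketch of the "fatal" parameter idea. Here report_error() plays the role of the proposed G_error(), and check_row() is a made-up underlying function; both are assumptions for illustration, not existing GRASS functions:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical G_error()-like function: always prints the message,
   but terminates only if "fatal" is non-zero. */
void report_error(int fatal, const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    fprintf(stderr, "ERROR: ");
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    if (fatal)
        exit(EXIT_FAILURE);
}

/* Hypothetical underlying function. The message uses local data,
   so it is generated here rather than in the caller. */
int check_row(int row, int nrows, int fatal)
{
    if (row < 0 || row >= nrows) {
        report_error(fatal, "row %d is outside the range 0..%d",
                     row, nrows - 1);
        return -1; /* only reached when fatal == 0 */
    }
    return 0;
}
```

This keeps the message next to the data it describes, while the caller decides whether the failure terminates the program.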
All things considered, making it safe to longjmp() out of the fatal
error handler would be a lot less work.
> The main issue for concurrent reading is that the raster library
> caches the current row, so that upscaling doesn't read and decode each
> row multiple times. That's problematic if you want multiple threads
> reading the same map.
Reading single raster maps in different threads is just great. Everything else
is like icing on the cake.
BTW, you can read the same map from multiple threads provided that you
open it once for each thread.
r.mapcalc only opens each map once, but it uses a mutex to prevent
concurrent access.
--
Glynn Clements <glynn@gclements.plus.com>