[GRASS-dev] Using GRASS in long running and multithreaded applications Was: Re: The tomcat shut down ...

Was: Re: The tomcat shut down when encounter some error grass commond

2009/9/24 Glynn Clements <glynn@gclements.plus.com>:

Soeren Gebbert wrote:

> It isn't just that it's a lot of work to change it, but that it
> potentially causes problems for code which uses the libraries (e.g.
> simply making the variables non-"static" pollutes the global
> namespace).

Sorry, i did not quite understand what "pollute the global namespace"
exactly means,
well i am not a native speaker and not a C expert, so can you please
provide an example? :slight_smile:

If two libraries both export a symbol named e.g. "name", you'll get a
conflict if you use both libraries from the same program. Depending
upon the platform, you may get an error, or you may end up with the
variables being merged into a single variable.

This isn't a problem if the variable isn't exported (i.e. declared
with the "static" qualifier). But you have can't use "static" if the
variable needs to be accessible in more than one file.

Thanks for the explanation, now i understand.

> That could be gotten around by moving all of the static data for each
> library into a single "state" variable. That in itself isn't an issue,
> but you have to remain vigilant for developers adding new static
> variables ad-hoc.

An other approach might be the implementation of reset functions in
every source file which uses
static variables. The reset function can be called for a fine grain
reset approach or all together
in a single reset function which collects all distributed reset functions.

i.e:
New reset functions like:

G_reset_geodesic_equation in lib/gis/geodesic.c
G_reset_window in lib/gis/get_window.c
G_reset_cell_area_calculations in lib/gis/area.c

And so on ...

That will get around the namespace issue.

However, it doesn't really help with multithreading. For that, you

Indeed.

would want to store all of the static data in a library in a state
structure, and give each thread a pointer to a separate state
structure (e.g. pthread_setspecific).

IMHO this approach will result in a re-design and re-write of most of
the grass libraries
and an update of more than 300 modules ... .
Looks like a 2 1/2 years full time job for one developer ... any
sponsors available? :slight_smile:

I think implementing the reset functions is the first step to use
grass in long running applications.
Within those application it must be assured that all the grass library
functions are called from
within the same thread, using a producer-consumer pattern.
Reading and writing of raster and vector maps is performed in serial
by the producer thread
using a queue for all requests from consumer threads. The producer
thread will call the reset functions after each
processed request. A consumer thread will always request full raster
and vector maps. :confused:
So the processing of the data read into the memory will be independent
from the producer thread.
Read and write calls are queued and the consumer threads need to wait
for the producer thread.

Well, that sounds quite complex.

Glynn, if i remember correctly you described a similar approach for a
multi-threaded raster library
some time ago. Are there any plans to implement such an design in the
grass7 raster lib?
I have no experience with such an approach, but i would like to
realize it in a Java application ... someday. :slight_smile:

It would also be great to open raster maps without setting the global
window variable
but providing a pointer of a region to the opening function?
Glynn do you think this is possible without an overhead of work on
other functions?

The main problem here is that the mapping between the region grid and
the raster grid is recalculated whenever you change the region, so
that all maps are read and written at a single resolution. That isn't
a particularly common operation, but it is used (e.g. r.resamp.*) and
needs to be supported in some way.

I see.

This is one area which could do with an overhaul, but it's a
significant architectural change rather than modifying a few
functions.

Thanks for your estimation. I will have a look at the raster io code
to evaluate the effort and which function must be changed or
new implemented without breaking the compatibility.

Best regards
Soeren

--
Glynn Clements <glynn@gclements.plus.com>

Soeren Gebbert wrote:

>> An other approach might be the implementation of reset functions in
>> every source file which uses
>> static variables. The reset function can be called for a fine grain
>> reset approach or all together
>> in a single reset function which collects all distributed reset functions.
>>
>> i.e:
>> New reset functions like:
>>
>> G_reset_geodesic_equation in lib/gis/geodesic.c
>> G_reset_window in lib/gis/get_window.c
>> G_reset_cell_area_calculations in lib/gis/area.c
>>
>> And so on ...
>
> That will get around the namespace issue.
>
> However, it doesn't really help with multithreading. For that, you

Indeed.

> would want to store all of the static data in a library in a state
> structure, and give each thread a pointer to a separate state
> structure (e.g. pthread_setspecific).

IMHO this approach will result in a re-design and re-write of most of
the grass libraries
and an update of more than 300 modules ... .
Looks like a 2 1/2 years full time job for one developer ... any
sponsors available? :slight_smile:

It will require refactoring the libraries, but it doesn't require any
interface changes.

I'm not suggesting making all of the functions take a pointer to the
state as a parameter, just making it thread-local.

I think implementing the reset functions is the first step to use
grass in long running applications.
Within those application it must be assured that all the grass library
functions are called from
within the same thread, using a producer-consumer pattern.

It shouldn't be necessary to be quite that restrictive. 7.0 already
includes changes which make it practical to use multiple threads. You
can't e.g. read or write a single map from multiple threads, but
reading and writing different maps in different threads should work,
as should functions which only query the state (first-use
initialisations are protected by a mutex, so so you don't need to
worry about concurrent read operations both trying to initialise the
state).

However, the error handling is probably a bigger issue. Pushing error
handling onto the modules isn't an acceptable solution.

Simply allowing the fatal error handler to longjmp() out then resume
using the GRASS libraries would be non-trivial, as you would have to
repair any inconsistencies in the library state.

Allowing G_fatal_error() to return is enough work that it can probably
be ruled out. Apart from changing every single call (I count 520
references in lib/*), almost every public function would need two
versions: one which returns an error code and one which treats errors
as fatal (i.e. only returns upon success).

Glynn, if i remember correctly you described a similar approach for a
multi-threaded raster library
some time ago. Are there any plans to implement such an design in the
grass7 raster lib?
I have no experience with such an approach, but i would like to
realize it in a Java application ... someday. :slight_smile:

Built-in multi-threading is currently "blue sky", i.e. I know what the
main issues are but I don't have any plan to solve them.

The main issue for concurrent reading is that the raster library
caches the current row, so that upscaling doesn't read and decode each
row multiple times. That's problematic if you want multiple threads
reading the same map.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn wrote:

The main issue for concurrent reading is that the raster
library caches the current row, so that upscaling doesn't
read and decode each row multiple times. That's problematic
if you want multiple threads reading the same map.

to me multi-threaded multi-map read does not seem so exciting.
typically both maps are in the same GISDBASE, so on the part of
the same disk drive, which can only physically read one thing
at a time, and asking the drive to jump around non-sequentially
to read two maps at once would just slow the overall read down.
I've got nothing against making the libs thread-safe, but
personally I'm much more intersted in preping the processor-bound
modules for multi-threading as I see more gains there vs a
slight speedup in I/O-bound libs.

but then I'm not trying to keep the libs running long term :slight_smile:
and don't know how the vector cleaning tasks fit into this.

Hamish

Hamish wrote:

> The main issue for concurrent reading is that the raster
> library caches the current row, so that upscaling doesn't
> read and decode each row multiple times. That's problematic
> if you want multiple threads reading the same map.

to me multi-threaded multi-map read does not seem so exciting.
typically both maps are in the same GISDBASE, so on the part of
the same disk drive, which can only physically read one thing
at a time, and asking the drive to jump around non-sequentially
to read two maps at once would just slow the overall read down.

That assumes that the data isn't buffered in RAM. Often, the file will
have been created by the previous command, an will stil be in RAM, so
read() is just copying RAM.

Also, the decoding and rescaling overhead may mean that the process
isn't actually I/O bound.

And if you have more than one input map, or you're working on a
multi-user system, the I/O will typically be non-sequential anyhow.

--
Glynn Clements <glynn@gclements.plus.com>

Hello Glynn,

2009/9/26 Glynn Clements <glynn@gclements.plus.com>:

Soeren Gebbert wrote:

>> An other approach might be the implementation of reset functions in
>> every source file which uses
>> static variables. The reset function can be called for a fine grain
>> reset approach or all together
>> in a single reset function which collects all distributed reset functions.
>>
>> i.e:
>> New reset functions like:
>>
>> G_reset_geodesic_equation in lib/gis/geodesic.c
>> G_reset_window in lib/gis/get_window.c
>> G_reset_cell_area_calculations in lib/gis/area.c
>>
>> And so on ...
>
> That will get around the namespace issue.
>
> However, it doesn't really help with multithreading. For that, you

Indeed.

> would want to store all of the static data in a library in a state
> structure, and give each thread a pointer to a separate state
> structure (e.g. pthread_setspecific).

IMHO this approach will result in a re-design and re-write of most of
the grass libraries
and an update of more than 300 modules ... .
Looks like a 2 1/2 years full time job for one developer ... any
sponsors available? :slight_smile:

It will require refactoring the libraries, but it doesn't require any
interface changes.

I'm not suggesting making all of the functions take a pointer to the
state as a parameter, just making it thread-local.

Ok.
To my shame i have to admit that i never heard of the thread-local
mechanism before.
After a quick look at wikipedia i understand the principal and it sounds great!
This will speed up things a lot.
I guess we need to use the pthread version of thread-local to support
other compiler than gcc and windows too?

I think implementing the reset functions is the first step to use
grass in long running applications.
Within those application it must be assured that all the grass library
functions are called from
within the same thread, using a producer-consumer pattern.

It shouldn't be necessary to be quite that restrictive. 7.0 already
includes changes which make it practical to use multiple threads. You
can't e.g. read or write a single map from multiple threads, but
reading and writing different maps in different threads should work,
as should functions which only query the state (first-use
initialisations are protected by a mutex, so so you don't need to
worry about concurrent read operations both trying to initialise the
state).

Great news, i will test this with some java code soon. I have therefor
ported the
vtkGRASSBridge to grass7.

However, the error handling is probably a bigger issue. Pushing error
handling onto the modules isn't an acceptable solution.

Indeed. This was the next issue i would like to talk about.

Simply allowing the fatal error handler to longjmp() out then resume
using the GRASS libraries would be non-trivial, as you would have to
repair any inconsistencies in the library state.

Is there an alternative to longjmp() and setjmp()? It seems to be quite complex,
the man page warns about the usage. And i never used it before.

Allowing G_fatal_error() to return is enough work that it can probably
be ruled out. Apart from changing every single call (I count 520
references in lib/*), almost every public function would need two
versions: one which returns an error code and one which treats errors
as fatal (i.e. only returns upon success).

520 calls are indeed a lot. The raster and gis libraries all together
have 70 calls and
the vector and db libraries have 190 calls.
Glynn, if you can point me to a concrete implementation concept, i
would like to start to patch the gis, raster, vector and db libraries
in grass7.
Maybe we can use signals to set an error variable in the resume error function?

I focus on those libraries, because i have currently applications in
mind which implements raster and vector algorithms in highly multi
threaded environments (servlet or EJB container) and high performance
visualization applications (VTK/Paraview).
I believe the future of grass can be the backbone of many WPS server,
if we can get the core libs thread safe and ready for long running processes.

Glynn, if i remember correctly you described a similar approach for a
multi-threaded raster library
some time ago. Are there any plans to implement such an design in the
grass7 raster lib?
I have no experience with such an approach, but i would like to
realize it in a Java application ... someday. :slight_smile:

Built-in multi-threading is currently "blue sky", i.e. I know what the
main issues are but I don't have any plan to solve them.

The main issue for concurrent reading is that the raster library
caches the current row, so that upscaling doesn't read and decode each
row multiple times. That's problematic if you want multiple threads
reading the same map.

Reading single raster maps in different threads is just great. Everything else
is like icing on the cake.

Best regards
Soeren

--
Glynn Clements <glynn@gclements.plus.com>

Hello Hamish,

2009/9/26 Hamish <hamish_b@yahoo.com>:

Glynn wrote:

The main issue for concurrent reading is that the raster
library caches the current row, so that upscaling doesn't
read and decode each row multiple times. That's problematic
if you want multiple threads reading the same map.

to me multi-threaded multi-map read does not seem so exciting.
typically both maps are in the same GISDBASE, so on the part of
the same disk drive, which can only physically read one thing
at a time, and asking the drive to jump around non-sequentially
to read two maps at once would just slow the overall read down.
I've got nothing against making the libs thread-safe, but
personally I'm much more intersted in preping the processor-bound
modules for multi-threading as I see more gains there vs a
slight speedup in I/O-bound libs.

I think that some algorithms are only parallelizable if the I/O runs
in parallel too.
Especially in case that the algorithms are based on row processing.
Otherwise you need to read huge parts of the raster maps in RAM to benefit from
multithreading.
The gpde and gmath libraries are already partly multi threaded with OpenMP.
Especially the blas level 2 and 3 functions and the krylov-subspace
linear equation solver (cg, pcg and bicgstab).

But you will only have a benefit from multi threading in case the
data you like to process is large enough and the compiler provides
excellent OpenMP support (intel!).
If the data is not large enough the time the threads needs to spawn may take
longer then processing the data (in case you are not using a thread pool).

Most of the time you need to rewrite the algorithm to enable good speedup
on multiple threads, thats the experience i have.

My hope is that grass will benefit from the parallel VTK code using
the vtkGRASSBridge.
The VTK library ships many multithreaded and MPI parallelized image
and vector processing algorithms. AFAIK they support the processing of
tiles on a cluster.

Best regards
Soeren

but then I'm not trying to keep the libs running long term :slight_smile:
and don't know how the vector cleaning tasks fit into this.

Hamish

Soeren Gebbert wrote:

> I'm not suggesting making all of the functions take a pointer to the
> state as a parameter, just making it thread-local.

Ok.
To my shame i have to admit that i never heard of the thread-local
mechanism before.
After a quick look at wikipedia i understand the principal and it sounds great!
This will speed up things a lot.
I guess we need to use the pthread version of thread-local to support
other compiler than gcc and windows too?

You would need to conditionalise it. The usual mechanism is like that
used for errno. In a single-threaded implementation, it's just a
variable. In a multi-threaded implementation, it's a macro which
expands to (*errno_location()), where errno_location() retrieves the
address using pthread_getspecific().

> However, the error handling is probably a bigger issue. Pushing error
> handling onto the modules isn't an acceptable solution.

Indeed. This was the next issue i would like to talk about.

> Simply allowing the fatal error handler to longjmp() out then resume
> using the GRASS libraries would be non-trivial, as you would have to
> repair any inconsistencies in the library state.

Is there an alternative to longjmp() and setjmp()?

Not really.

It seems to be quite complex, the man page warns about the usage.
And i never used it before.

longjmp() is conceptually similar to raising an exception in C++,
while setjmp() is equivalent to establishing a try/catch block.

The details are quite simple if you understand how C is implemented in
terms of machine code. setjmp() essentially saves the most important
CPU registers (including the program counter, stack pointer, and frame
pointer), while longjmp() restores them. So setjmp() records the
current execution state while longjmp() restores it (similar to
save/load in a game).

Most of the complexities and warnings relate to potential interactions
with optimisation. Primarily, local variables in the function which
calls setjmp() aren't guaranteed to be restored to the correct value
by longjmp(). gcc warns you if this might occur. Using the "volatile"
qualifier can help here.

The other caveat is that you can't "wrap" setjmp(). The saved state
ceases to become valid once you leave the function which called
setjmp(), so you can only call longjmp() from within a "descendent" of
the function which calls setjmp().

In terms of using it to recover from a fatal error, the usage would be
something along the lines of:

  jmp_buf save;

  int my_handler(const char *msg, int fatal) {
      print_error(msg, fatal);
      longjmp(save, 1);
      return 0; /* can't happen; longjmp() doesn't return */
  }

  void main_loop(void) {
      volatile int done;
      G_set_error_routine(my_handler);
      for (done = 0; !done; ) {
          if (setjmp(save) != 0)
              continue; /* fatal error happened */
          done = do_next_action();
      }
      G_unset_error_routine();
  }

A common idiom is to call setjmp() in the top-level loop, at the
beginning of each "action", and have the fatal error handler call
longjmp(). If an error occurs during the execution of an action, the
longjmp() will jump back out to the main loop which can then process
the next action.

An example can be found in lib/driver/main.c (in 6.x), where setjmp()
and longjmp() are used to to recover from SIGPIPE, so that the monitor
doesn't terminate if the client terminates prematurely.

> Allowing G_fatal_error() to return is enough work that it can probably
> be ruled out. Apart from changing every single call (I count 520
> references in lib/*), almost every public function would need two
> versions: one which returns an error code and one which treats errors
> as fatal (i.e. only returns upon success).

520 calls are indeed a lot. The raster and gis libraries all together
have 70 calls and
the vector and db libraries have 190 calls.
Glynn, if you can point me to a concrete implementation concept, i
would like to start to patch the gis, raster, vector and db libraries
in grass7.
Maybe we can use signals to set an error variable in the resume error function?

The least invasive approach is to perform clean-up before calling
G_fatal_error(), so that subsequent operations don't crash GRASS, and
rely upon the application registering an error handler which
longjmp()s out.

G_fatal_error() can't be made to return, as that would break all of
the modules which use it. And library functions which don't return on
error can't be changed to return.

You *could* replace existing functions with a wrapper around a version
which returns on error. The original function would be modified to
return a status indication upon error, and the wrapper would just call
the modified version and call G_fatal_error() in the event of an
error. Functions which want to handle the error themselves would call
the lower-level function.

The main problems here are:

1. The sheer number of such functions.

2. The functions may rely upon other functions which currently call
G_fatal_error(). So you would have to make similar changes to the
underlying functions, then modify the calling function to allow for
the fact that these functions can fail.

3. Reporting errors where the error message includes data from local
variables. One option here would be to give the underlying function a
"fatal" parameter, and add a G_error() function which takes an extra
parameter indicating whether to terminate.

All things considered, making it safe to longjmp() out of the fatal
error handler is would be a lot less work.

> The main issue for concurrent reading is that the raster library
> caches the current row, so that upscaling doesn't read and decode each
> row multiple times. That's problematic if you want multiple threads
> reading the same map.

Reading single raster maps in different threads is just great. Everything else
is like icing on the cake.

BTW, you can read the same map from multiple threads provided that you
open it once for each thread.

r.mapcalc only opens each map once, but it uses a mutex to prevent
concurrent access.

--
Glynn Clements <glynn@gclements.plus.com>

Hi Glynn,
thanks a lot for your response.
After reading some documentation and asking "silly" questions to
my poor informatics colleagues, i understand the concept of thread local and
the setjmp()/longjmp() approach a bit better.

I would suggest to add longjmp() to G_fatal_error().
It should be set at runtime by an application if longjmp() should be
chosen or not.
So G_fatal_error() will either call longjmp() or exit().

The setjmp() code goes into the application which calls the grass
library functions,
except if nested setjmp()/longjmp() calls are needed in grass to clean
up data, or
to close open file descriptors.

The linux threaded errno definition scared me, so i have chosen a
different approach.
We define thread local support and two extern variables in gis.h to
choose at runtime if
G_fatal_error() will call exit() or longjmp() and to add thread local support.

Example which works for me in my test code:

/*Thread local and setjmp() exception support*/
#include <setjmp.h>
#ifdef WIN32
#define Thread __declspec( thread )
#else
#define Thread __thread
#endif

extern Thread jmp_buf G_stack_buffer; /*to save the most important
CPU register for each thread*/
extern int G_long_run; /*Set to 1 to choose the setjmp() version of
G_fatal_error()*/

The G_long_run variable will be initialized in gisinit.c and so the
G_stack_buff:

int G_long_run;
Thread jmp_buf G_stack_buffer;
...
void G__gisinit(const char *version, const char *pgm)
{
    const char *mapset;

    if (initialized)
        return;

     G_long_run = 0;
...

The application has to set G_long_run right after
calling G_gisinit() from a single thread (i.e: a singleton).
Now we need to patch error.c to use longjmp() or exit():

void G_fatal_error(const char *msg, ...)
{
    va_list ap;

    va_start(ap, msg);
    vfprint_error(ERR, msg, ap);
    va_end(ap);

    if(G_long_run == 1)
       longjmp(G_stack_buffer, 1);
    else
       exit(EXIT_FAILURE);
}

The C++ application code may look like this:

extern "C" {
#include <grass/gis.h>
}

...
int G_long_run;
Thread jmp_buf G_stack_buffer;

vtkGRASSInit::vtkGRASSInit() {

     G_gisinit("vtkGRASSBridge");

   // Set the long run variable to provide long run support in grass libraries
   G_long_run = 1;
}
...
/*Open a vector map*/

...
  if(!setjmp(G_stack_buffer))
  {
    if (1 > Vect_open_new(&this->map, name, with_z))
    {
      fprintf(stderr, "class: %s line: %i Unable to open vector map <%s>.",
             this->GetClassName(), __LINE__, name);
      return false;
    }
  } else {
    fprintf(stderr, "class: %s line: %i Unable to open vector map <%s>.",
           this->GetClassName(), __LINE__, name);;
    return false;
  }
...

That's all.

Is this approach ok or to simple or just naive? :slight_smile:

If this is ok, i would like to test this approach to identify possible
nested setjmp()/longjmp()
calls in libgis, libraster and libvector.

Additionally i will try to make most of the static variable thread local.

Best regards
Soeren

2009/9/27 Glynn Clements <glynn@gclements.plus.com>:

Soeren Gebbert wrote:

> I'm not suggesting making all of the functions take a pointer to the
> state as a parameter, just making it thread-local.

Ok.
To my shame i have to admit that i never heard of the thread-local
mechanism before.
After a quick look at wikipedia i understand the principal and it sounds great!
This will speed up things a lot.
I guess we need to use the pthread version of thread-local to support
other compiler than gcc and windows too?

You would need to conditionalise it. The usual mechanism is like that
used for errno. In a single-threaded implementation, it's just a
variable. In a multi-threaded implementation, it's a macro which
expands to (*errno_location()), where errno_location() retrieves the
address using pthread_getspecific().

> However, the error handling is probably a bigger issue. Pushing error
> handling onto the modules isn't an acceptable solution.

Indeed. This was the next issue i would like to talk about.

> Simply allowing the fatal error handler to longjmp() out then resume
> using the GRASS libraries would be non-trivial, as you would have to
> repair any inconsistencies in the library state.

Is there an alternative to longjmp() and setjmp()?

Not really.

It seems to be quite complex, the man page warns about the usage.
And i never used it before.

longjmp() is conceptually similar to raising an exception in C++,
while setjmp() is equivalent to establishing a try/catch block.

The details are quite simple if you understand how C is implemented in
terms of machine code. setjmp() essentially saves the most important
CPU registers (including the program counter, stack pointer, and frame
pointer), while longjmp() restores them. So setjmp() records the
current execution state while longjmp() restores it (similar to
save/load in a game).

Most of the complexities and warnings relate to potential interactions
with optimisation. Primarily, local variables in the function which
calls setjmp() aren't guaranteed to be restored to the correct value
by longjmp(). gcc warns you if this might occur. Using the "volatile"
qualifier can help here.

The other caveat is that you can't "wrap" setjmp(). The saved state
ceases to become valid once you leave the function which called
setjmp(), so you can only call longjmp() from within a "descendent" of
the function which calls setjmp().

In terms of using it to recover from a fatal error, the usage would be
something along the lines of:

   jmp\_buf save;

   int my\_handler\(const char \*msg, int fatal\) \{
       print\_error\(msg, fatal\);
       longjmp\(save, 1\);
       return 0; /\* can&#39;t happen; longjmp\(\) doesn&#39;t return \*/
   \}

   void main\_loop\(void\) \{
       volatile int done;
       G\_set\_error\_routine\(my\_handler\);
       for \(done = 0; \!done; \) \{
           if \(setjmp\(save\) \!= 0\)
               continue;   /\* fatal error happened \*/
           done = do\_next\_action\(\);
       \}
       G\_unset\_error\_routine\(\);
   \}

A common idiom is to call setjmp() in the top-level loop, at the
beginning of each "action", and have the fatal error handler call
longjmp(). If an error occurs during the execution of an action, the
longjmp() will jump back out to the main loop which can then process
the next action.

An example can be found in lib/driver/main.c (in 6.x), where setjmp()
and longjmp() are used to to recover from SIGPIPE, so that the monitor
doesn't terminate if the client terminates prematurely.

> Allowing G_fatal_error() to return is enough work that it can probably
> be ruled out. Apart from changing every single call (I count 520
> references in lib/*), almost every public function would need two
> versions: one which returns an error code and one which treats errors
> as fatal (i.e. only returns upon success).

520 calls are indeed a lot. The raster and gis libraries all together
have 70 calls and
the vector and db libraries have 190 calls.
Glynn, if you can point me to a concrete implementation concept, i
would like to start to patch the gis, raster, vector and db libraries
in grass7.
Maybe we can use signals to set an error variable in the resume error function?

The least invasive approach is to perform clean-up before calling
G_fatal_error(), so that subsequent operations don't crash GRASS, and
rely upon the application registering an error handler which
longjmp()s out.

G_fatal_error() can't be made to return, as that would break all of
the modules which use it. And library functions which don't return on
error can't be changed to return.

You *could* replace existing functions with a wrapper around a version
which returns on error. The original function would be modified to
return a status indication upon error, and the wrapper would just call
the modified version and call G_fatal_error() in the event of an
error. Functions which want to handle the error themselves would call
the lower-level function.

The main problems here are:

1. The sheer number of such functions.

2. The functions may rely upon other functions which currently call
G_fatal_error(). So you would have to make similar changes to the
underlying functions, then modify the calling function to allow for
the fact that these functions can fail.

3. Reporting errors where the error message includes data from local
variables. One option here would be to give the underlying function a
"fatal" parameter, and add a G_error() function which takes an extra
parameter indicating whether to terminate.

All things considered, making it safe to longjmp() out of the fatal
error handler is would be a lot less work.

> The main issue for concurrent reading is that the raster library
> caches the current row, so that upscaling doesn't read and decode each
> row multiple times. That's problematic if you want multiple threads
> reading the same map.

Reading single raster maps in different threads is just great. Everything else
is like icing on the cake.

BTW, you can read the same map from multiple threads provided that you
open it once for each thread.

r.mapcalc only opens each map once, but it uses a mutex to prevent
concurrent access.

--
Glynn Clements <glynn@gclements.plus.com>

Soeren Gebbert wrote:

After reading some documentation and asking "silly" questions to
my poor informatics colleagues, i understand the concept of thread local and
the setjmp()/longjmp() approach a bit better.

I would suggest to add longjmp() to G_fatal_error().
It should be set at runtime by an application if longjmp() should be
chosen or not.
So G_fatal_error() will either call longjmp() or exit().

There's no need to modify G_fatal_error(). The application can install
an error handler with G_set_error_routine(), and the error handler can
use longjmp() to avoid returning.

The linux threaded errno definition scared me, so i have chosen a
different approach.
We define thread local support and two extern variables in gis.h to
choose at runtime if
G_fatal_error() will call exit() or longjmp() and to add thread local support.

Example which works for me in my test code:

/*Thread local and setjmp() exception support*/
#include <setjmp.h>
#ifdef WIN32
#define Thread __declspec( thread )
#else
#define Thread __thread
#endif

The __thread qualifier is a gcc extension.

extern Thread jmp_buf G_stack_buffer; /*to save the most important
CPU register for each thread*/

In order to use a GRASS library from multiple threads, the entire
library state needs to be thread-specific. That would include the
current error handler.

Also: do you actually need to resume a thread in which an error
occurs? If not, you can just make the error handler terminate the
current thread. That will prevent the error handler from returning and
thus prevent exit() being called.

Is this approach ok or to simple or just naive? :slight_smile:

It's too invasive. The longjmp() should go into an application-defined
error handler, rather than the GRASS libraries.

The only changes to the GRASS libraries regarding error handling
should be to ensure that it is safe to continue using them after the
application recovers from a fatal error.

Supporting thread-local state requires some changes to the GRASS
libraries, i.e. moving most[1] static variables for a library into a
state structure, and referencing all such variables through a pointer.

[1] Some variables will contain immutable data, e.g. tables which are
read from files. These don't need to be initialised for each thread,
although concurrent initialisation has to be protected against.

If this is ok, i would like to test this approach to identify possible
nested setjmp()/longjmp()
calls in libgis, libraster and libvector.

I'd prefer to avoid putting any setjmp()s into the GRASS libraries.
I'd rather see low-level functions split into an "internal" version
which returns an error status, and a "public" version which calls
G_fatal_error(), and have intermediate functions use the internal
version if they need to catch errors.

Ultimately, the GRASS libraries exist for the benefit of GRASS
modules. Minor changes to facilitate other uses (e.g. persistent
applications) are reasonable, but I'm opposed to substantial
architectural changes, particularly if it makes the coding process
more complex.

--
Glynn Clements <glynn@gclements.plus.com>

Hello,

Example which works for me in my test code:

/*Thread local and setjmp() exception support*/
#include <setjmp.h>
#ifdef WIN32
#define Thread __declspec( thread )
#else
#define Thread __thread
#endif

The __thread qualifier is a gcc extension.

Yes, wikipedia says it works for:
Sun Studio C/C++, IBM XL C/C++, GNU C and Intel C/C++ (Linux systems)

Using the pthreads implementation will be a better solution?

extern Thread jmp_buf G_stack_buffer; /*to save the most important
CPU register for each thread*/

In order to use a GRASS library from multiple threads, the entire
library state needs to be thread-specific. That would include the
current error handler.

Also: do you actually need to resume a thread in which an error
occurs? If not, you can just make the error handler terminate the
current thread. That will prevent the error handler from returning and
thus prevent exit() being called.

I am not sure about this. If the grass libs are called from within an
EJB container, then there will be no way to terminate the current
thread. But it should work from within a servlet container or a RCP
container.

Is this approach ok or to simple or just naive? :slight_smile:

It's too invasive. The longjmp() should go into an application-defined
error handler, rather than the GRASS libraries.

Ok. The error handler looks like that for now:

int vgb_error_handler(const char *msg, int fatal)
{
    if (fatal == 0)
    {
        fprintf(stderr, "%s\n", msg);
        return 1;
    }
    else if (fatal == 1)
    {
        fprintf(stderr, "WARNING: %s\n", msg);
        longjmp(vgb_stack_buffer, 1);
    }
    else if (fatal == 2)
    {
        fprintf(stderr, "ERROR: %s\n", msg);
        longjmp(vgb_stack_buffer, 2);
    }
    return 1;
}

vtkGRASSInit::vtkGRASSInit()
{
    G_gisinit("vtkGRASSInit");
    // Set the error routine
    G_set_error_routine(vgb_error_handler);
}

The only changes to the GRASS libraries regarding error handling
should be to ensure that it is safe to continue using them after the
application recovers from a fatal error.

Supporting thread-local state requires some changes to the GRASS
libraries, i.e. moving most[1] static variables for a library into a
state structure, and referencing all such variables through a pointer.
I'd prefer to avoid putting any setjmp()s into the GRASS libraries.
I'd rather see low-level functions split into an "internal" version
which returns an error status, and a "public" version which calls
G_fatal_error(), and have intermediate functions use the internal
version if they need to catch errors.

Ok, i agree. I will implement the setjmp()/longjmp() only within my application.

Ultimately, the GRASS libraries exist for the benefit of GRASS
modules. Minor changes to facilitate other uses (e.g. persistent
applications) are reasonable, but I'm opposed to substantial
architectural changes, particularly if it makes the coding process
more complex.

Very much agreed.

Thanks
Soeren

--
Glynn Clements <glynn@gclements.plus.com>

Soeren Gebbert wrote:

>> Example which works for me in my test code:
>>
>> /*Thread local and setjmp() exception support*/
>> #include <setjmp.h>
>> #ifdef WIN32
>> #define Thread __declspec( thread )
>> #else
>> #define Thread __thread
>> #endif
>
> The __thread qualifier is a gcc extension.

Yes, wikipedia says it works for:
Sun Studio C/C++, IBM XL C/C++, GNU C and Intel C/C++ (Linux systems)

Using the pthreads implementation will be a better solution?

Any thread-local state needs to be conditionalised upon the actual
threading mechanism used, with a fall-back to a single instance (i.e.
a simple variable).

>> Is this approach ok or to simple or just naive? :slight_smile:
>
> It's too invasive. The longjmp() should go into an application-defined
> error handler, rather than the GRASS libraries.

Ok. The error handler looks like that for now:

[snip]

Yep; that's the general idea.

--
Glynn Clements <glynn@gclements.plus.com>