[GRASS-dev] grass is a monad?

Hi all.
Reading the recent post of Glynn:

Right now, I'm actively trying to think of ways to make life harder for anyone trying
to use the GRASS libraries for anything except GRASS modules.
http://trac.osgeo.org/grass/ticket/869#comment:1

I wonder if this is somehow a view shared by GRASS-PSC, and I ask myself what would
be the advantage of having GRASS as an isolated piece of software.
I would greatly appreciate devs opinions on this.
All the best.
--
Paolo Cavallini: http://www.faunalia.it/pc

On Thu, 14 Jan 2010, Paolo Cavallini wrote:

Hi all.
Reading the recent post of Glynn:

Right now, I'm actively trying to think of ways to make life harder for anyone trying
to use the GRASS libraries for anything except GRASS modules.
http://trac.osgeo.org/grass/ticket/869#comment:1

I wonder if this is somehow a view shared by GRASS-PSC, and I ask myself what would
be the advantage of having GRASS as an isolated piece of software.
I would greatly appreciate devs opinions on this.

Well, Glynn's comment is clearly (as I see it anyway!) meant to be light-hearted/sarcastic. But it has some basis. The idea is not that GRASS should not be used by other projects, but that we encourage other projects to use it by running GRASS modules, not by linking against the GRASS internal libraries directly. In the past few years a massive amount of work has gone into making the GRASS modules more Unix-like (do one thing simply and do it well, with no interactivity). While this annoyed me a bit for a while, I think it has opened up so many more opportunities for use of GRASS modules as a backend to other systems (e.g. the GRASS GUI) that it is a very good thing.

So, I feel the idea behind Glynn's comments (and one that I guess I would agree with) is that we encourage other projects to use GRASS by calling the modules directly. As GRASS has so few developers compared to the massive body of code, putting in extra development time to make GRASS work in ways that have no benefit for GRASS is just too much effort.

Best regards

Paul

Paul Kelly wrote:

So, I feel the idea behind Glynn's comments (and one that I guess I would
agree with) is that we encourage other projects to use GRASS by calling
the modules directly. As GRASS has so few developers compared to the
massive body of code, putting in extra development time to make GRASS work
in ways that have no benefit for GRASS is just too much effort.

That's about right.

Maybe I can elaborate:

1. The GRASS libraries handle errors by calling exit(). This avoids
the need for callers to explicitly handle errors and for either the
caller or callee to perform cleanup.

It's possible to avoid the call to exit() by installing an error
handler which longjmp()s out. However, it is implied that calling
G_fatal_error() relieves the caller of the responsibility to restore
GRASS data structures to a consistent state. If you longjmp() out of
the error handler, then make further calls to GRASS functions, they
may crash.

This is Not A Bug. Calling into GRASS after a fatal error *is* a bug.
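
A minimal C sketch of the hazard described in point 1. It does not use the real GRASS functions: fake_fatal_error(), fake_lib_update() and table_locked are hypothetical stand-ins, assumed only for illustration. The "library" call leaves its internal state half-updated because it expects a fatal error to end the process, so jumping out of the error path leaves that state inconsistent.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-ins for a library with exit-on-error semantics. */
static jmp_buf recover_point;
static int table_locked = 0;    /* internal "library" state */

/* Stand-in for a fatal error whose handler longjmp()s instead of exiting. */
static void fake_fatal_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(recover_point, 1);
}

/* A library call that assumes a fatal error ends the process, so it
 * never undoes its partial work on the failure path. */
static void fake_lib_update(int value)
{
    table_locked = 1;                        /* begin update */
    if (value < 0)
        fake_fatal_error("negative value");  /* does not return */
    table_locked = 0;                        /* reached only on success */
}

/* Returns 1 if the library state is still usable after the call, else 0. */
int state_consistent_after(int value)
{
    table_locked = 0;
    if (setjmp(recover_point) == 0)
        fake_lib_update(value);
    /* If control arrived here via longjmp(), table_locked is still 1. */
    return table_locked == 0;
}
```

Calling state_consistent_after(5) reports a usable state, while state_consistent_after(-1) reports that the lock was never released, which is the situation in which further library calls may misbehave.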

2. The GRASS libraries don't bother tracking (and freeing) memory
which would only represent a fixed overhead for a GRASS module, or
which would typically not be freed until shortly before termination.

If an operation uses a small amount of memory (e.g. a map name) and is
typically performed once or a few times per map, we don't care about
freeing it. A module will inevitably need some amount of memory for
each map which it opens, and if the difference between freeing and not
freeing means that the amount is 15Kb rather than 10Kb, then it
doesn't really matter.

It all gets returned to the OS when the module terminates anyhow.

If you try to write a persistent application which continually opens
and closes maps, eventually the accumulated leaks will cause it to run
out of memory.

This too is Not A Bug. Not isolating distinct operations in distinct
processes *is* a bug.
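
The process isolation referred to here can be sketched with POSIX fork() and waitpid(). This is only an illustration under assumptions: leaky_operation() is a hypothetical stand-in for code that leaks memory and exits on failure, and a real application would more likely run an actual GRASS module via exec.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical operation that, like a module, leaks a little memory and
 * terminates the process on failure. */
static void leaky_operation(int should_fail)
{
    void *scratch = malloc(10 * 1024);   /* never freed, by design */
    if (scratch == NULL || should_fail)
        exit(EXIT_FAILURE);              /* the "fatal error" */
}

/* Run one operation in a child process. Returns the child's exit status,
 * or -1 if the child could not be started or did not exit normally. */
int run_isolated(int should_fail)
{
    fflush(NULL);                        /* avoid duplicated stdio output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        leaky_operation(should_fail);
        _exit(EXIT_SUCCESS);             /* leaked memory returns to the OS */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent survives a failing child and only sees its exit status, so neither the leak nor the exit() affects the long-running process.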

Both of these strategies make it much easier to both implement the
GRASS libraries and to use them, and they work perfectly well for the
libraries' intended purpose: writing GRASS modules.

If it means that they aren't suitable for other purposes, that is
Not A Bug.

For any changes, the main concern is how those changes impact GRASS
itself. If the changes impose extra effort on anyone writing GRASS
modules, or (to a lesser extent) anyone modifying the libraries, then
those changes would be a net loss for GRASS.

--
Glynn Clements <glynn@gclements.plus.com>

Some rumbling from a bad code author.
Still I don't see why GRASS modules would not benefit from ability to
clean up after themselves in case of failure. Removing temporary maps,
closing open DB connections etc. are good reasons why GRASS modules
should do error checking by themselves, unless it IS a FATAL error
(one that means serious module code error or something terribly wrong
with data files/software/hardware when any future action might result
in even larger disaster).
IMHO decisions when and how to push the Big Red button should be left
to module authors. Library functions returning e.g. -1 is not so big
issue to check in module code and then call lethal error there
(possibly after rm'ing temporary files etc.).

Just my 0.002.
Maris.


Maris Nartiss wrote:

Some rumbling from a bad code author.
Still I don't see why GRASS modules would not benefit from ability to
clean up after themselves in case of failure. Removing temporary maps,
closing open DB connections etc. are good reasons why GRASS modules
should do error checking by themselves, unless it IS a FATAL error
(one that means serious module code error or something terribly wrong
with data files/software/hardware when any future action might result
in even larger disaster).

Most clean-up is done automatically by the OS when the process
terminates. The main exception is that files won't be deleted. For
files which are part of the GRASS database, the libraries should
handle this.

Modules can use G_set_error_routine() to install a fatal error handler
if they want to perform clean-up. This is why that function exists
(i.e. to perform clean-up prior to termination, *not* for trying to
prevent termination).

If this isn't sufficient, please suggest improvements.
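
A rough sketch of the clean-up-then-terminate pattern described above. The real hook is G_set_error_routine(); everything below (set_error_routine(), fatal_error(), create_temp(), remove_temp_on_error()) is a hypothetical stand-in written so the example compiles without GRASS, and the handler signature is an assumption made for the sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-ins; they only mimic the shape of an error hook. */
typedef int (*error_routine_t)(const char *msg, int fatal);
static error_routine_t error_routine = NULL;
static const char *temp_path = NULL;

void set_error_routine(error_routine_t fn)
{
    error_routine = fn;
}

/* Stand-in for a fatal error: run the handler, then terminate regardless. */
void fatal_error(const char *msg)
{
    if (error_routine != NULL)
        error_routine(msg, 1);
    else
        fprintf(stderr, "ERROR: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Handler used purely for clean-up: it returns, and termination follows. */
int remove_temp_on_error(const char *msg, int fatal)
{
    fprintf(stderr, "%s: %s\n", fatal ? "ERROR" : "WARNING", msg);
    if (fatal && temp_path != NULL)
        unlink(temp_path);
    return 0;
}

/* Create a temporary file and register it for removal on fatal error.
 * Returns 0 on success, -1 if the file could not be created. */
int create_temp(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fclose(fp);
    temp_path = path;
    set_error_routine(remove_temp_on_error);
    return 0;
}
```

A module following this shape would call create_temp() early on; any later fatal_error() then removes the file before the process exits, and the handler never tries to keep the process alive.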

IMHO decisions when and how to push the Big Red button should be left
to module authors. Library functions returning e.g. -1 is not so big
issue to check in module code and then call lethal error there
(possibly after rm'ing temporary files etc.).

That puts a lot of work onto module and library authors.

When I recently changed Rast_open_* and several other core functions
to generate fatal errors rather than reporting an error indication, I
changed everything which called them to not check for an error
indication (which cannot occur now). A significant number of modules
didn't actually check for any error indication, so if the call failed,
they would go on to try to read/write data from an invalid map or use
the contents of a "struct Cell_head" which wasn't actually
initialised.

Returning a status also means that each library function must check
the status and report it back to the caller, all the way up to the top
level. The caller then only knows that an error occurred; it doesn't
know where it occurred or exactly what occurred, so it cannot report
the details to the caller.

Historically, C has used this mechanism due to the absence of any
alternative. The main downside is that error handling can often
account for the majority of the code, particularly if you end up
having to write:

  a = foo();
  if (a < 0)
    error(...);
  b = bar();
  if (b < 0)
    error(...);
  x = a + b;

rather than just:

  x = foo() + bar();

Modern languages invariably have an exception mechanism which allows
an exception to be raised in a low-level function and caught at the
higher levels without burdening all of the intermediate levels with the
details, while allowing the top level to find out the precise nature
of the error.

If it wasn't for the fact that C++ code takes roughly five times as
long to compile as C, I'd advocate converting GRASS to C++ solely so
that we can use exceptions and RAII (I don't see any need for OO, and
if there was, it's easy enough to implement in C).

It's possible to implement exception mechanisms in C using setjmp()
and longjmp(). However, the syntax is inevitably ugly, you have to
manually restore the exception handling on early returns, and you
can't use destructors for clean-up.
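
To make the "ugly syntax" point concrete, here is one possible way to emulate exceptions with setjmp() and longjmp(). It is a sketch, not a proposal for GRASS; current_handler, raise_error(), checked() and try_sum() are names invented for the example. Note how the previous handler has to be restored by hand on every exit path.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Innermost active handler; NULL means no handler is installed. */
static jmp_buf *current_handler = NULL;
static int last_error = 0;

/* "Throw": record the code and jump to the innermost handler.
 * With no handler installed, terminate the process. */
static void raise_error(int code)
{
    last_error = code;
    if (current_handler != NULL)
        longjmp(*current_handler, 1);
    abort();
}

/* Low-level function: fails on negative input, returns no status. */
static int checked(int v)
{
    if (v < 0)
        raise_error(42);
    return v;
}

/* "Try/catch": returns 0 and stores a + b, or returns the error code. */
int try_sum(int a, int b, int *out)
{
    jmp_buf here;
    jmp_buf *previous = current_handler;   /* saved for manual restore */

    if (setjmp(here) != 0) {
        current_handler = previous;        /* restore on the error path */
        return last_error;
    }
    current_handler = &here;
    *out = checked(a) + checked(b);        /* no per-call status checks */
    current_handler = previous;            /* restore on the normal path */
    return 0;
}
```

The expression in the middle stays as short as x = foo() + bar(), but forgetting either restore leaves a dangling handler, and there is nothing like a destructor to release resources acquired between the setjmp() and the failure.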

--
Glynn Clements <glynn@gclements.plus.com>

On Thu, Jan 14, 2010 at 10:35 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

This too is Not A Bug. Not isolating distinct operations in distinct
processes *is* a bug.

I can imagine almost everything done via GRASS modules. Yes, it is
annoying to run a GRASS module (probably I have to write a new one to
be sure, that the output/options do not change in next minor GRASS
release) and to parse output instead of just calling a function. You
did your work well, you made our life harder.

Everything except vector editing. I don't see any reasonable
possibility to do interactive vector editing via a GRASS module. For
vector editing, I need immediate response which is visualized on
display (e.g. if a vertex was moved and area topology was broken it
must be displayed). It is impossible to open/close a vector for every
single operation, it would take too long for larger vectors.

Radim

Radim Blazek wrote:

> This too is Not A Bug. Not isolating distinct operations in distinct
> processes *is* a bug.

I can imagine almost everything done via GRASS modules. Yes, it is
annoying to run a GRASS module (probably I have to write a new one to
be sure, that the output/options do not change in next minor GRASS
release)

Even if you use an existing module, the module interface is less
likely to change than a library function. Certainly, the changes to
the modules between 6.x and 7.0 are quite minor compared to the
changes to many library functions (e.g. half of G_* being renamed to
Rast_*, R_* disappearing, ...).

and to parse output instead of just calling a function. You
did your work well, you made our life harder.

Everything except vector editing. I don't see any reasonable
possibility to do interactive vector editing via a GRASS module. For
vector editing, I need immediate response which is visualized on
display (e.g. if a vertex was moved and area topology was broken it
must be displayed). It is impossible to open/close a vector for every
single operation, it would take too long for larger vectors.

For this specific case, is it possible to isolate a subset of the
vector library such that it can be used reasonably by both GRASS and
QGIS (and the wxGUI's vdigit module, which is just as bad in this
regard)?

Also, what kind of fatal errors can occur while in the middle of
vector editing that could reasonably be recovered from?

I'm not particularly averse to libraries using status returns, so long
as this doesn't:

1. propagate up to the API used by modules,
2. "infest" a large portion of the GRASS libraries, or
3. mean that fatal errors end up being signalled at the highest levels
after most of the context has been lost.

--
Glynn Clements <glynn@gclements.plus.com>

On Mon, Jan 18, 2010 at 3:33 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Everything except vector editing. I don't see any reasonable
possibility to do interactive vector editing via a GRASS module. For
vector editing, I need immediate response which is visualized on
display (e.g. if a vertex was moved and area topology was broken it
must be displayed). It is impossible to open/close a vector for every
single operation, it would take too long for larger vectors.

For this specific case, is it possible to isolate a subset of the
vector library such that it can be used reasonably by both GRASS and
QGIS (and the wxGUI's vdigit module, which is just as bad in this
regard)?

The subset would be almost the whole vector library (Vlib, diglib, rtree).

Also, what kind of fatal errors can occur while in the middle of
vector editing that could reasonably be recovered from?

Probably I don't understand the question. For any error I would prefer
to give a decent message (in a window box) and stop editing (even with
a corrupted vector), but not crash the application.

I'm not particularly averse to libraries using status returns, so long
as this doesn't:

1. propagate up to the API used by modules,

So you would suggest a separate set of functions for modules and QGIS/vdigit?

2. "infest" a large portion of the GRASS libraries, or
3. mean that fatal errors end up being signalled at the highest levels
after most of the context has been lost.

What is wrong with calling G_fatal_error at each level, from the first
function where the error happened up to the top, and having it optionally
(i.e. modules want exit, QGIS/vdigit want a return value) either only
print the message (call the error routine) and return an error code, or
exit (which then happens at the lowest level)?

Radim

Radim Blazek wrote:

> Also, what kind of fatal errors can occur while in the middle of
> vector editing that could reasonably be recovered from?

Probably I don't understand the question. For any error I would prefer
to give a decent message (in a window box) and stop editing (even with
a corrupted vector), but not crash the application.

The GRASS way to achieve this is to make the editor a separate
process.

> I'm not particularly averse to libraries using status returns, so long
> as this doesn't:
>
> 1. propagate up to the API used by modules,

So you would suggest a separate set of functions for modules and QGIS/vdigit?

Actually, I would suggest QGIS/vdigit isolating actions within a child
process, rather than complicating the rest of GRASS for the sake of
two programs, one of which isn't part of GRASS and the other was
designed incorrectly from the outset.

But if there are reasons for specific functions to have status
returns, then I would suggest also having a wrapper which signals a
fatal error.
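
As a sketch of what such a pairing might look like (try_open_map() and open_map() are hypothetical names, not existing GRASS functions): the status-returning primitive is what a persistent application would call, and the wrapper keeps the fatal-error behaviour that modules rely on.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status-returning primitive: 0 on success, -1 on failure.
 * Here "failure" is simply a missing or empty map name. */
int try_open_map(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return -1;
    return 0;
}

/* Wrapper with module-style semantics: it never returns on failure, so
 * module code calling it has no status to check. */
void open_map(const char *name)
{
    if (try_open_map(name) < 0) {
        fprintf(stderr, "ERROR: unable to open map <%s>\n",
                name != NULL ? name : "(null)");
        exit(EXIT_FAILURE);
    }
}
```

Module code would keep calling open_map() unchanged, while an editor could call try_open_map() and show its own message box on a -1 return.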

Doing this wholesale would make too much of a mess, though.

> 2. "infest" a large portion of the GRASS libraries, or
> 3. mean that fatal errors end up being signalled at the highest levels
> after most of the context has been lost.

What is wrong with calling G_fatal_error at each level, from the first
function where the error happened up to the top, and having it optionally
(i.e. modules want exit, QGIS/vdigit want a return value) either only
print the message (call the error routine) and return an error code, or
exit (which then happens at the lowest level)?

Exceptions would be nice, but implementing them in C is a mess and
re-writing GRASS in C++ isn't an option.

Removing the __attribute__((noreturn)) from G_fatal_error() requires a
substantial re-write, and isn't an option IMHO.

--
Glynn Clements <glynn@gclements.plus.com>

On Mon, Feb 8, 2010 at 11:03 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Actually, I would suggest QGIS/vdigit isolating actions within a child
process, rather than complicating the rest of GRASS for the sake of
two programs, one of which isn't part of GRASS and the other was
designed incorrectly from the outset.

That would mean having a communication infrastructure
(protocol, library) for sending data between either the digit tool and
the main application or between the main application and some sort of
digit server. And that is a lot of work, more than supporting returned
error codes in the vector lib where necessary, IMO. So, less work in
GRASS but much more work for QGIS, vdigit, GDAL and, via GDAL, Mapserver,
R or other applications. Consider also that those applications are
already written and working, so changing them completely is a huge
amount of work. I understand your point of view, but I don't agree with it.

All this started with -fexceptions, which can solve the problem (at the
cost of some memory leaks).
I think that the easiest solution for QGIS and GDAL is to create a new
project "Exceptional GRASS" as a copy of the GRASS core libs with Lib.make
patched to compile with -fexceptions.
There was once a libgrass already; IIRC, the support for the fatal error
handler was introduced there for R.

Radim