[GRASS-dev] [GRASS GIS] #2127: Python implementation of g.message

#2127: Python implementation of g.message
-------------------------+--------------------------------------------------
Reporter: huhabla | Owner: grass-dev@…
     Type: enhancement | Status: new
Priority: normal | Milestone: 7.0.0
Component: Python | Version: svn-trunk
Keywords: | Platform: Unspecified
      Cpu: All |
-------------------------+--------------------------------------------------
The Python grass script library uses g.message to provide warning, error,
debug, verbose, info and percent messages. When these messages are emitted
many times (> 100), the overhead of calling g.message for each one adds up
and can slow down the actual processing massively.

I suggest implementing the behavior of g.message directly in Python to
reduce the overhead, replacing the functions that make use of g.message:

  * grass.core.message()
  * grass.core.debug()
  * grass.core.verbose()
  * grass.core.info()
  * grass.core.percent()
  * grass.core.error()
  * grass.core.warning()
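
To get a rough feel for the per-call cost, the following sketch compares
spawning one child process per message with writing the message
in-process. It is only an illustration: `sys.executable -c pass` stands in
for g.message (an assumption, so that the sketch runs without GRASS;
Python's startup is heavier than that of a small C program, so the
absolute numbers will be overstated), and `time_spawned()` and
`time_in_process()` are hypothetical helper names.

```python
import io
import subprocess
import sys
import time


def time_spawned(n):
    """Time n messages, each handled by a freshly spawned child process."""
    start = time.perf_counter()
    for _ in range(n):
        # Stand-in for running g.message once per message (assumption).
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


def time_in_process(n):
    """Time n messages written directly from the current process."""
    sink = io.StringIO()  # in-memory sink so the sketch prints nothing
    start = time.perf_counter()
    for i in range(n):
        sink.write("step %d\n" % i)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 100
    print("spawned:    %.3f s" % time_spawned(n))
    print("in-process: %.6f s" % time_in_process(n))
```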

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [ticket:2127 huhabla]:

> In case these messages are called many times (> 100), the overhead of
> calling g.message rises and can slow the actual processing massively down.

Why would you call them so many times?

I can just about understand it for debug(), in which case it might be
better to use native Python equivalents (e.g. the `logging` module). If
you're calling anything else >100 times (even verbose()), the script is
probably too chatty.

> I would suggest to implement the behavior of g.message directly in
> Python to reduce the overhead, replacing the functions that make use of
> g.message:

Then we would need to keep the two in sync.

G_message() etc. aren't exactly trivial; they support multiple output
formats, word-wrapping, configuration of verbosity and output format via
environment variables and command-line switches, and reporting of messages
via stderr, log file and/or email.

If performance is a genuine issue, I would rather see g.message enhanced
so that it can be used as a server, with the script spawning a single
persistent g.message process which can accept multiple messages (of
varying priorities) read from stdin.
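
A minimal sketch of what such a setup could look like from the script's
side, under loud assumptions: g.message has no such server mode, so a
small Python child process stands in for it, and the tab-separated
"level<TAB>text" line protocol, `SERVER_CODE` and `MessageClient` are all
hypothetical. The point is only that one process is spawned and then fed
many messages through its stdin.

```python
import subprocess
import sys

# Stand-in for a hypothetical "g.message server" mode (assumption): read
# "level<TAB>text" lines from stdin and print each message to stderr.
SERVER_CODE = r'''
import sys
PREFIX = {"warning": "WARNING: ", "error": "ERROR: "}
for line in sys.stdin:
    level, _, text = line.rstrip("\n").partition("\t")
    sys.stderr.write(PREFIX.get(level, "") + text + "\n")
'''


class MessageClient:
    """Spawn one persistent child and send it many messages."""

    def __init__(self, stderr=None):
        # stderr=None lets the child write to the script's own stderr.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", SERVER_CODE],
            stdin=subprocess.PIPE, stderr=stderr, text=True)

    def send(self, level, text):
        # One line per message; embedded newlines are not handled here.
        self.proc.stdin.write(f"{level}\t{text}\n")

    def close(self):
        # communicate() flushes and closes stdin, then waits for the child.
        _, err = self.proc.communicate()
        return err


if __name__ == "__main__":
    client = MessageClient()
    for i in range(1000):
        client.send("info", f"processing map {i}")
    client.send("warning", "done, but check the log")
    client.close()
```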

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127#comment:1>
GRASS GIS <http://grass.osgeo.org>


Comment(by wenzeslaus):

Replying to [comment:1 glynn]:
> Replying to [ticket:2127 huhabla]:
>
> > In case these messages are called many times (> 100), the overhead of
> > calling g.message rises and can slow the actual processing massively down.
>
> Why would you call them so many times?
>
> I can just about understand it for debug(), in which case it might be
> better to use native Python equivalents (e.g. the `logging` module). If
> you're calling anything else >100 times (even verbose()), the script is
> probably too chatty.
>
> > I would suggest to implement the behavior of g.message directly in
> > Python to reduce the overhead, replacing the functions that make use of
> > g.message:
>
> Then we would need to keep the two in sync.
>
> G_message() etc. aren't exactly trivial; they support multiple output
> formats, word-wrapping, configuration of verbosity and output format via
> environment variables and command-line switches, and reporting of messages
> via stderr, log file and/or email.
>

This is certainly an issue. Python's `logging` module could help in
implementing the functionality, but it would still be necessary to
implement the GRASS interface (the environment variables, etc.).
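
To illustrate the kind of glue this would need, here is a small sketch
that derives a `logging` level from the `GRASS_VERBOSE` environment
variable. The mapping is my own simplification, not what `G_message()`
does: as far as I understand the variable, 0 means errors and warnings
only, 2 is the default and 3 adds verbose messages; levels 1 and 2 are
collapsed here. `VERBOSE`, `level_from_env()` and `make_logger()` are
hypothetical names, and output formats, log files and so on are not
covered at all.

```python
import logging
import os

# Hypothetical custom level between DEBUG (10) and INFO (20).
VERBOSE = 15
logging.addLevelName(VERBOSE, "VERBOSE")


def level_from_env(environ=os.environ):
    """Map GRASS_VERBOSE to a logging level (simplified assumption)."""
    try:
        verbosity = int(environ.get("GRASS_VERBOSE", "2"))
    except ValueError:
        verbosity = 2  # fall back to the default on garbage input
    if verbosity < 0:
        return logging.CRITICAL + 1  # suppress everything
    if verbosity == 0:
        return logging.WARNING       # errors and warnings only
    if verbosity <= 2:
        return logging.INFO          # levels 1 and 2 collapsed
    return VERBOSE                   # level 3: also verbose messages


def make_logger(name="grass_sketch", environ=os.environ):
    """Return a logger that writes to stderr at the derived level."""
    logger = logging.getLogger(name)
    logger.setLevel(level_from_env(environ))
    if not logger.handlers:
        handler = logging.StreamHandler()  # stderr by default
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_logger()
    log.log(VERBOSE, "only shown with GRASS_VERBOSE=3")
    log.info("shown at the default verbosity")
    log.warning("shown unless GRASS_VERBOSE is negative")
```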

It seems to me that the better option is to use `G_message()` etc. through
`ctypes`. There is always some complexity in using `ctypes`, but we need
it to work anyway for many things (some scripts, pygrass, parts of the
GUI), so making it necessary for all scripts is not such an issue from my
point of view.

> If performance is a genuine issue, I would rather see g.message enhanced
> so that it can be used as a server, with the script spawning a single
> persistent g.message process which can accept multiple messages (of
> varying priorities) read from stdin.

Although this approach might be beneficial in more places in GRASS
(mainly GUI-related things; `g.message` is actually a GUI thing too) and I
would like to have a nice way to do it when necessary, I don't think that
it is better than `ctypes`.

I think it is just as fragile as `ctypes`. The difference, as I see it,
is the number of processes created. For a single call of one Python
module there is little difference: one `g.message` server process versus
zero when using `ctypes`. But when a Python module is called many times
(> 100), we get a lot of `g.message` server processes versus zero with
`ctypes`.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127#comment:2>
GRASS GIS <http://grass.osgeo.org>


Comment(by huhabla):

Replying to [comment:1 glynn]:
> Replying to [ticket:2127 huhabla]:
>
> > In case these messages are called many times (> 100), the overhead of
> > calling g.message rises and can slow the actual processing massively down.
>
> Why would you call them so many times?

I would like to add debug messages, verbose messages and the percentage
output to many processing steps in the temporal framework, so that the
user can follow the processing of time-stamped maps. I usually handle
hundreds to many thousands of maps. With the current message approach,
g.message is called at every single step to evaluate the debug level,
the verbosity level and so on.

>
> I can just about understand it for debug(), in which case it might be
> better to use native Python equivalents (e.g. the `logging` module). If
> you're calling anything else >100 times (even verbose()), the script is
> probably too chatty.
>
> > I would suggest to implement the behavior of g.message directly in
> > Python to reduce the overhead, replacing the functions that make use of
> > g.message:
>
> Then we would need to keep the two in sync.
>
> G_message() etc. aren't exactly trivial; they support multiple output
> formats, word-wrapping, configuration of verbosity and output format via
> environment variables and command-line switches, and reporting of messages
> via stderr, log file and/or email.
>
> If performance is a genuine issue, I would rather see g.message enhanced
> so that it can be used as a server, with the script spawning a single
> persistent g.message process which can accept multiple messages (of
> varying priorities) read from stdin.

That is a great idea indeed. I suggest implementing this message server
in Python, calling the G_message() functions via ctypes. This could be
part of PyGRASS, which would implement the server process and the client
access functions that send text messages to the server process, which in
turn writes to stderr.

Example interface:
{{{
import grass.pygrass.messages as messages

# Create the messenger object that starts the server process.
# As Glynn said, the server accepts multiple messages with different
# priorities.
msgr = messages.Messenger()

# In addition to the stdout/stderr output, the server can write the
# messages into a logfile
msgr.set_logfile("logfile.txt")

# Send an info message to the server, the server will call the G_info()
# function using ctypes
msgr.info("This is an info message")

# Send a verbose message, the server will call the G_verbose() function
# using ctypes
msgr.verbose("This is a verbose message")

# Send a warning message, the server will call the G_warning() function
# using ctypes
msgr.warning("This is the last warning")

# Send an error message, the server will call the G_error() function using
# ctypes
msgr.error("This is an error message")

# Send a percentage message, the server will call the G_percent() function
# using ctypes
msgr.percent(1, 1, 1)  # 100%
}}}

The PyGRASS implementation could respawn the server process in case a
G_fatal_error() occurs while using the ctypes interface. The server
process would be shut down when the Python object gets deleted. The
communication between server and client functions should be as fast as
possible, hence the client simply sends out a message and does not wait
for a server response.
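
The respawn and fire-and-forget behavior can be sketched with the
`multiprocessing` module. This is not the proposed implementation, just an
illustration under stated assumptions: the server prints with plain
`sys.stderr.write()` instead of calling the GRASS C library via ctypes,
`sys.exit(1)` stands in for `G_fatal_error()` terminating the process, the
"fork" start method is used to keep the sketch short (so POSIX only), and
`SketchMessenger` and its methods are hypothetical names.

```python
import multiprocessing
import sys

# Assumption: the "fork" start method (not available on Windows).
_ctx = multiprocessing.get_context("fork")


def _serve(conn):
    """Server loop: print messages until STOP or a fatal message arrives."""
    while True:
        kind, text = conn.recv()
        if kind == "STOP":
            return
        if kind == "FATAL":
            sys.stderr.write("ERROR: " + text + "\n")
            sys.exit(1)  # stand-in for G_fatal_error() killing the server
        prefix = "WARNING: " if kind == "WARNING" else ""
        sys.stderr.write(prefix + text + "\n")


class SketchMessenger:
    def __init__(self):
        self.restarts = 0
        self._start()

    def _start(self):
        self._client, server_end = _ctx.Pipe()
        self._proc = _ctx.Process(
            target=_serve, args=(server_end,), daemon=True)
        self._proc.start()
        server_end.close()  # the parent only needs the client end

    def _send(self, kind, text=""):
        # Respawn the server if it died, then send without waiting
        # for any reply.
        if not self._proc.is_alive():
            self._client.close()
            self._start()
            self.restarts += 1
            self._client.send(
                ("WARNING", "Needed to restart the messenger server"))
        self._client.send((kind, text))

    def message(self, text):
        self._send("MESSAGE", text)

    def warning(self, text):
        self._send("WARNING", text)

    def fatal(self, text):
        self._send("FATAL", text)
        self._proc.join(5)  # wait so that the next call sees a dead server

    def stop(self):
        if self._proc.is_alive():
            self._client.send(("STOP", ""))
            self._proc.join(5)
        self._client.close()


if __name__ == "__main__":
    msgr = SketchMessenger()
    msgr.message("message")
    msgr.fatal("this is a fatal error")
    msgr.warning("Ohh")
    msgr.stop()
```

Without the `join()` in `fatal()`, the next message could be sent before
the server has actually exited, so a real implementation would need some
way to handle that race.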

As Vaclav pointed out: such an interface would be very useful for Python
libraries, modules and the GUI.

This kind of client-server approach, where the server calls GRASS
functions via ctypes, can also be useful in other applications. For
example, the GUI can send vector editing messages (ctypes objects?) to a
GRASS vector edit server process that can be respawned in case of a fatal
error. Hence the GUI will not crash when an error occurs.

Sounds like a kind of remote procedure call interface to the functions of
the GRASS C-libraries.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127#comment:3>
GRASS GIS <http://grass.osgeo.org>


Comment(by huhabla):

I have implemented a prototype of the GRASS messenger interface. The file
is attached to the ticket. It uses the Python multiprocessing module, and
the IPC is handled via pipes. Here is a usage example:

{{{
# Messenger is the class defined in the attached prototype file
msgr = Messenger()
msgr.message("message")
msgr.verbose("verbose message")
msgr.important("important message")
msgr.test_fatal_error()
msgr.percent(1, 1, 1)
msgr.debug(0, "debug 0")
msgr.warning("Ohh")
msgr.debug(1, "debug 1")
msgr.error("Ohh no")
msgr.stop()
# This should result in:
"""
message
important message
ERROR: this is a fatal error
WARNING: Needed to restart the messenger server
 100%
D0/0: debug 0
WARNING: Ohh
ERROR: Ohh no
"""
# Test of the percentage creation
msgr = Messenger()
num = 100000
for i in range(num):
    msgr.percent(i, num, 10)
msgr.stop()
}}}
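
For the percentage test, the third argument is the step in percent. The
sketch below is a simplified model of such throttling, not the actual
`G_percent()` logic: it reports a value only when the progress enters a
new multiple of the step. `percent_reports()` is a hypothetical helper
name. It shows why 100000 calls produce only a handful of printed values,
while each call still has to be delivered to the server.

```python
def percent_reports(total, step):
    """Return the percentages a step-throttled counter would report
    for i in range(total) (simplified model, see lead-in)."""
    reported = []
    last = -1
    for i in range(total):
        pct = 100 * i // total
        bucket = pct - pct % step  # round down to a multiple of step
        if bucket != last:
            reported.append(bucket)
            last = bucket
    return reported


if __name__ == "__main__":
    print(percent_reports(100000, 10))
```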

What do you think? Any suggestions for improvement or enhancement
requests? :)

I would like to put the attached file {{{__init__.py}}} into a new pygrass
directory "lib/python/pygrass/messenger", if there are no objections.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127#comment:4>
GRASS GIS <http://grass.osgeo.org>

On Monday 11 Nov 2013 00:46:20 GRASS GIS wrote:

> I would like to put the attached file {{{__init__.py}}} into a new
> pygrass directory "lib/python/pygrass/messenger", if there are no
> objections against it.

Personally I have no objections! :)
Well done.

Pietro

#2127: Python implementation of g.message
--------------------------+-------------------------------------------------
  Reporter: huhabla | Owner: grass-dev@…
      Type: enhancement | Status: closed
  Priority: normal | Milestone: 7.0.0
Component: Python | Version: svn-trunk
Resolution: fixed | Keywords:
  Platform: Unspecified | Cpu: All
--------------------------+-------------------------------------------------
Changes (by huhabla):

  * status: new => closed
  * resolution: => fixed

Comment:

The fast and exit-safe interface to GRASS C-library message functions is
now available in trunk revision r58201.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127#comment:5>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:2 wenzeslaus]:

> It seems to me that better option is to use `G_message()` etc through
> `ctypes`.

Using ctypes for this is overkill.

Python's standard library intentionally doesn't use ctypes, so that sites
can remove it if they so wish (it's considered a risk factor).

grass.script is supposed to be a support library for scripts, not a Python
wrapper around the GRASS libraries.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2127#comment:6>
GRASS GIS <http://grass.osgeo.org>