[GRASS-dev] New wiki page summarising GRASS APIs

Hi,

In preparation for a talk in the geospatial devroom at FOSSDEM this weekend, I've put together a wiki page on the current GRASS GIS APIs:

http://grasswiki.osgeo.org/wiki/GRASS_GIS_APIs

I still need to complete the part on the C-API, but any feedback is welcome, especially if I put any nonsense in there. It really is only supposed to be a brief synthetic overview of each API to give people an idea.

Moritz

Hi Moritz,

On Fri, Jan 30, 2015 at 5:02 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

Hi,

In preparation of a talk at the geospatial devroom at the FOSSDEM this
weekend, I've elaborated a wiki page on the current GRASS GIS APIs:

http://grasswiki.osgeo.org/wiki/GRASS_GIS_APIs

I still need to complete the part on the C-API,

Quick comment just on the C part:
I think that using system calls in C is not a good thing to show; shouldn't it
rather be these functions?

http://grass.osgeo.org/programming7/get__window_8c.html

Best from Brussels,
Markus

Hi Moritz,

On Fri, Jan 30, 2015 at 5:02 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

In preparation of a talk at the geospatial devroom at the FOSSDEM this
weekend, I've elaborated a wiki page on the current GRASS GIS APIs:

http://grasswiki.osgeo.org/wiki/GRASS_GIS_APIs

Thank you for the work that you have done! Some brief notes:

Why os.system?

{{{
import os
os.system('g.region rast=elevation')
}}}

and not subprocess:

{{{
import subprocess
subprocess.call('g.region rast=elevation', shell=True)

# or

subprocess.Popen('g.region rast=elevation', shell=True)
}}}

Concerning the pygrass Module API, maybe we can use the shortcut version:

{{{
>>> from grass.pygrass.modules.shortcuts import general as g
>>> gregion = g.region(flags='p')  # returns a Module class instance
projection: 99 (Lambert Conformal Conic)
zone: 0
datum: nad83
ellipsoid: a=6378137 es=0.006694380022900787
north: 318500
south: -16000
west: 124000
east: 963000
nsres: 500
ewres: 500
rows: 669
cols: 1678
cells: 1122582
>>> gregion
Module('g.region')
>>> gregion.name
'g.region'
>>> gregion.description
'Manages the boundary definitions for the geographic region.'
>>> gregion.flags.p = False  # change the previous flag to False
>>> gregion.flags.g = True  # set the g flag to True
>>> gregion.run()
n=318500
s=-16000
w=124000
e=963000
nsres=500
ewres=500
rows=669
cols=1678
cells=1122582
Module('g.region')
>>> gregion.inputs.raster = 'elevation'
>>> gregion.run()
n=318500
s=-16000
w=124000
e=963000
nsres=500
ewres=500
rows=669
cols=1678
cells=1122582
}}}

In the Census example, we should add census.close() to close the
vector map after the use, or use the with statement:

{{{
from grass.pygrass.vector import VectorTopo

# The with statement closes the vector map automatically on exit.
with VectorTopo('census', mode='r') as census:
    print('numb. areas:', census.number_of('areas'))
    for area in census.viter('areas'):
        if area.area() > 4000000:
            print(area.id, area.area())
}}}

All the best

Pietro

On 30/01/15 23:47, Markus Neteler wrote:

Hi Moritz,

On Fri, Jan 30, 2015 at 5:02 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

Hi,

In preparation of a talk at the geospatial devroom at the FOSSDEM this
weekend, I've elaborated a wiki page on the current GRASS GIS APIs:

http://grasswiki.osgeo.org/wiki/GRASS_GIS_APIs

I still need to complete the part on the C-API,

Quick comment just on the C part:
I think that using system calls in C is not a good thing to show; shouldn't it
rather be these functions?

http://grass.osgeo.org/programming7/get__window_8c.html

Well that's when you use the GRASS C-API. The idea in the first part is to show that you can actually consider GRASS as an API in itself with the modules playing a role comparable to 'functions', and that you can call these functions from other programming languages via system calls.

Moritz

Hi Pietro,

Thanks for the feedback !

On 31/01/15 11:06, Pietro wrote:

Hi Moritz,

On Fri, Jan 30, 2015 at 5:02 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

In preparation of a talk at the geospatial devroom at the FOSSDEM this
weekend, I've elaborated a wiki page on the current GRASS GIS APIs:

http://grasswiki.osgeo.org/wiki/GRASS_GIS_APIs

Thank you for the work that you have done! Some brief notes:

Why os.system?

[...]

and not subprocess:

I thought that os.system was easier to understand, but you're right, let's immediately promote the use of subprocess.

Concerning the pygrass Module API, maybe we can use the shortcut version:

I've added this as an additional example.

In the Census example, we should add census.close() to close the

Ah, yes, thank you. I've actually now replaced this example with one using schools and adding a new school to the map.

Moritz

Pietro wrote:

{{{
import os
os.system('g.region rast=elevation')
}}}

and not subprocess:

{{{
import subprocess
subprocess.call('g.region rast=elevation', shell=True)

# or

subprocess.Popen('g.region rast=elevation', shell=True)
}}}

Concerning the pygrass Module API, maybe we can use the shortcut version:

Using subprocess.call() with shell=True is no better than using
os.system(). Both should be avoided at all costs.

The grass.script module provides a number of convenience functions
which use grass.script.make_command() to generate the command's
argument list from the function's argument list. Also, they use a
version of subprocess.Popen() which has been wrapped to deal with some
of Windows' idiosyncrasies.
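
To make the idea of generating an argument list from a function's arguments concrete, here is a minimal sketch. The helper `build_module_args()` is a hypothetical stand-in written for this illustration only; it is not the grass.script implementation and it skips details such as the overwrite/quiet/verbose switches.

```python
def build_module_args(module, flags="", **options):
    """Return an argument list such as ['g.region', '-p', 'raster=elevation'].

    Hypothetical stand-in for illustration; not the grass.script code.
    """
    args = [module]
    if flags:
        # Flags are given without the leading dash, e.g. flags='p'
        args.append("-" + flags)
    for key, value in options.items():
        # Lists and tuples become comma-separated values, e.g. input=a,b
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        args.append("{}={}".format(key, value))
    return args


args = build_module_args("g.region", flags="p", raster="elevation")
print(args)
```

The resulting list can be handed to subprocess.Popen() or subprocess.call() as it is, so no shell ever has to re-parse a command string.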

--
Glynn Clements <glynn@gclements.plus.com>

On 01/02/15 12:29, Glynn Clements wrote:

Pietro wrote:

{{{
import os
os.system('g.region rast=elevation')
}}}

and not subprocess:

{{{
import subprocess
subprocess.call('g.region rast=elevation', shell=True)

# or

subprocess.Popen('g.region rast=elevation', shell=True)
}}}

Concerning the pygrass Module API, maybe we can use the shortcut version:

Using subprocess.call() with shell=True is no better than using
os.system(). Both should be avoided at all costs.

The grass.script module provides a number of convenience functions
which use grass.script.make_command() to generate the command's
argument list from the function's argument list. Also, they use a
version of subprocess.Popen() which has been wrapped to deal with some
of Windows' idiosyncrasies.

Just to be clear: the example on this wiki page shows just that it is possible to call any GRASS module from any language that allows system calls. It then goes on to introduce the scripting API to show how that eases things on the script writer since she doesn't have to deal with any idiosyncrasies.

But if this causes too much opposition I can take it out, or at least put in a big warning along the lines of "Don't use this ! Example code only."

Moritz

On Sun, Feb 1, 2015 at 2:26 PM, Moritz Lennert <mlennert@club.worldonline.be> wrote:

On 01/02/15 12:29, Glynn Clements wrote:

Pietro wrote:

{{{

import os
os.system('g.region rast=elevation')
}}}

and not subprocess:

{{{
import subprocess
subprocess.call('g.region rast=elevation', shell=True)

# or

subprocess.Popen('g.region rast=elevation', shell=True)
}}}

Concerning the pygrass Module API, maybe we can use the shortcut
version:

Using subprocess.call() with shell=True is no better than using
os.system(). Both should be avoided at all costs.

The grass.script module provides a number of convenience functions
which use grass.script.make_command() to generate the command's
argument list from the function's argument list. Also, they use a
version of subprocess.Popen() which has been wrapped to deal with some
of Windows' idiosyncrasies.

Just to be clear: the example on this wiki page shows just that it is
possible to call any GRASS module from any language that allows system
calls. It then goes on to introduce the scripting API to show how that
eases things on the script writer since she doesn't have to deal with any
idiosyncrasies.

But if this causes too much opposition I can take it out or at least put a
big warning along the lines of "Don't use this ! Example code only."

I suggest using a Bash example instead of the Python one, because this is how
it is actually (also) used.

If some warning is needed, it probably applies to all, Bash, C system calls
and Perl too.

Do you consider a section for 3rd party APIs? Particularly I'm asking about
spgrass in R (the same philosophy as grass.script for Python) and QGIS
Processing (wrapper, limited functionality?, different philosophy).

Moritz


On 01/02/15 21:22, Vaclav Petras wrote:

On Sun, Feb 1, 2015 at 2:26 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

    On 01/02/15 12:29, Glynn Clements wrote:

        Pietro wrote:

            {{{
            import os
            os.system('g.region rast=elevation')
            }}}

            and not subprocess:

            {{{
            import subprocess
            subprocess.call('g.region rast=elevation', shell=True)

            # or

            subprocess.Popen('g.region rast=elevation', shell=True)
            }}}

            Concerning the pygrass Module API, maybe we can use the
            shortcut version:

        Using subprocess.call() with shell=True is no better than using
        os.system(). Both should be avoided at all costs.

        The grass.script module provides a number of convenience functions
        which use grass.script.make_command() to generate the command's
        argument list from the function's argument list. Also, they use a
        version of subprocess.Popen() which has been wrapped to deal
        with some
        of Windows' idiosyncrasies.

    Just to be clear: the example on this wiki page shows just that it
    is possible to call any GRASS module from any language that allows
    system calls. It then goes on to introduce the scripting API to show
    how that eases things on the script writer since she doesn't have to
    deal with any idiosyncrasies.

    But if this causes too much opposition I can take it out or at least
    put in a big warning along the lines of "Don't use this ! Example code only."

I suggest using a Bash example instead of the Python one, because this is how
it is actually (also) used.

Well, there's no real syntax for a GRASS module call from Bash, you just do it, so the first example on the page can actually be considered a Bash example...

If some warning is needed, it probably applies to all, Bash, C system
calls and Perl too.

Maybe I should just erase all examples, just leaving the information that you can call GRASS modules via system calls if you know what you are doing.

For now, I've added a warning that you should know what you are doing if you want to use these calls.

Do you consider a section for 3rd party APIs? Particularly I'm asking
about spgrass in R (the same philosophy as grass.script for Python) and
QGIS Processing (wrapper, limited functionality?, different philosophy).

Good idea. I just added a section linking to these two. I don't have much time these days to add anything else on them. So anyone who does, feel free.

Moritz

Moritz Lennert wrote:

For now, I've added a warning that you should know what you are doing if
you want to use these calls.

Why not just use suitable examples, e.g. using subprocess.Popen()
(without shell=True), with the caveat that on Windows it won't work
for scripts, only compiled executables.

For C, it's somewhat harder, as there isn't a mechanism for executing
commands which is standard, simple, reliable and portable. GRASS'
G_spawn* functions mostly have the last 3 (other than not working for
scripts on Windows). fork()+exec*() isn't portable or particularly
simple, but is at least standard on POSIX systems. Windows doesn't
really have anything in that regard (you have to do the quoting
yourself, and the rules differ for executables and scripts).
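
For readers who want to see the POSIX mechanism without writing C, here is a rough sketch using Python's thin wrapper around posix_spawn() (assumptions: Python 3.8 or later on a POSIX system). A Python child process stands in for a GRASS module so the snippet runs without a GRASS session. The child's exit code is the number of arguments it received, which shows that an argument containing a space arrives as a single argv entry.

```python
import os
import sys

# Stand-in for a GRASS module (assumption: no GRASS installation here).
# The child exits with the number of arguments it received after '-c'.
argv = [
    sys.executable,
    "-c",
    "import sys; sys.exit(len(sys.argv) - 1)",
    "elevation=my map",  # contains a space, still one argument
]

# posix_spawn() takes the executable path, the argv list and the environment;
# no shell is involved, so nothing needs to be quoted.
pid = os.posix_spawn(sys.executable, argv, os.environ)

# Wait for the child and decode its exit status.
_, status = os.waitpid(pid, 0)
exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
print(exit_code)
```

This is POSIX only; as noted above, Windows has no direct equivalent.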

--
Glynn Clements <glynn@gclements.plus.com>

On 02/02/15 13:48, Glynn Clements wrote:

Moritz Lennert wrote:

For now, I've added a warning that you should know what you are doing if
you want to use these calls.

Why not just use suitable examples, e.g. using subprocess.Popen()
(without shell=True), with the caveat that on Windows it won't work
for scripts, only compiled executables.

But what exactly is the problem with using subprocess.call with shell=True ? Security issues ? Difficulties in calling shell scripts ?

AFAICT, it's just a wrapper around Popen.wait(), isn't it?

I've now replaced this with:

  subprocess.Popen(['r.watershed', 'elevation=elevation', 'threshold=10000', 'stream=raster_streams'])

or would it be better with .wait():

  import subprocess
  subprocess.Popen(['r.watershed', 'elevation=elevation', 'threshold=10000', 'stream=raster_streams']).wait()

?

I guess this depends on what the programmer wants...

For C, it's somewhat harder, as there isn't a mechanism for executing
commands which is standard, simple, reliable and portable. GRASS'
G_spawn* functions mostly have the last 3 (other than not working for
scripts on Windows). fork()+exec*() isn't portable or particularly
simple, but is at least standard on POSIX systems. Windows doesn't
really have anything in that regard (you have to do the quoting
yourself, and the rules differ for executables and scripts).

I think that's why we shouldn't fret too much about the specific examples here. The main argument of the whole section is that GRASS modules can be considered an "API" in their own right, whose "functions" (aka modules) you can call from any programming language via system calls. The exact syntax of these calls is beyond the scope of the document.

And even though within GRASS Python it should always be Popen(), maybe there are situations out there where calling a module via call() is justifiable...

Moritz

Moritz Lennert wrote:

>> For now, I've added a warning that you should know what you are doing if
>> you want to use these calls.
>
> Why not just use suitable examples, e.g. using subprocess.Popen()
> (without shell=True), with the caveat that on Windows it won't work
> for scripts, only compiled executables.

But what exactly is the problem with using subprocess.call with
shell=True ?

It requires you to construct the correct string. If the command string
is a single literal string with all argument values consisting of
alphanumerics plus "safe" punctuation (i.e. characters which have no
meaning to any shell), that's simple enough.

But if any of the arguments are variable and *could* contain
characters which are meaningful to a shell, constructing the correct
string (the one which results in the called program's argv having
the intended values) isn't straightforward (and the exact rules vary
between platforms).

It's much simpler (and safer) to just remove the shell from the equation
altogether and pass the individual arguments directly, rather than
trying to construct a string which, when deconstructed by the shell,
will produce the correct result.
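
A small sketch of the difference, using only the standard library. The map name is a made-up example, and shlex.split() merely mimics the word splitting of a POSIX shell (a real shell would also treat the ';' as a command separator, and Windows follows different rules altogether).

```python
import shlex

# Hypothetical raster name containing a space and a semicolon.
map_name = "my elevation; v2"

# Naive string building: POSIX-style word splitting breaks the value apart.
naive = "g.region raster=" + map_name
naive_words = shlex.split(naive)

# Quoting the value makes the string survive POSIX word splitting...
quoted = "g.region raster=" + shlex.quote(map_name)
quoted_words = shlex.split(quoted)

# ...but an argument list needs no quoting at all.
arg_list = ["g.region", "raster=" + map_name]

print(naive_words)
print(quoted_words == arg_list)
```

With the list form the called program's argv gets the intended values directly, which is why passing individual arguments is the simpler option.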

Security issues ? Difficulties in calling shell scripts ?

Security would be a problem, but if you're dealing with potentially
malicious input, it's the least of your problems compared to the rest
of GRASS. Calling scripts on Windows is a different problem (setting
shell=True "solves" one problem while introducing more).

AFAICT, it's just a wrapper around Popen.wait(), isn't it?

subprocess.call() is a wrapper around Popen() and Popen.wait().
Exactly the same issue applies to Popen() itself (it applies to
.call() *because* it applies to Popen() itself).
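
To illustrate the relationship, a minimal sketch follows. A Python child process with a fixed exit code stands in for a GRASS module (assumption: no GRASS session is available where this runs); in a real script the list would be something like ['r.watershed', 'elevation=elevation', ...].

```python
import subprocess
import sys

# Stand-in command that simply exits with code 3.
cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

# call() starts the process, waits for it, and returns its exit code.
rc_call = subprocess.call(cmd)

# Popen() returns immediately; wait() blocks and returns the exit code.
proc = subprocess.Popen(cmd)
rc_wait = proc.wait()

print(rc_call, rc_wait)
```

Both variants take the same argument list, so whatever applies to the arguments of one applies to the other.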

I've now replaced this with:

  subprocess.Popen(['r.watershed', 'elevation=elevation', 'threshold=10000', 'stream=raster_streams'])

or would it be better with .wait():

  import subprocess
  subprocess.Popen(['r.watershed', 'elevation=elevation', 'threshold=10000', 'stream=raster_streams']).wait()

subprocess.call() is fine. It's using shell=True which is the problem.

For simple examples, I'd suggest using subprocess.call, and referring
them to the python.org documentation for the (2.x) subprocess module
for anything else.

And even though within GRASS Python it should always be Popen(), maybe
there are situations out there where calling a module via call() is
justifiable...

Well, depending upon what you mean by "GRASS Python", it should
arguably be grass.script.run_command() or similar.

Even if the wiki page isn't the right place to introduce that, it's
probably worth mentioning that it exists before users end up
recreating the wheel (e.g. if they care about being able to run both
executables and scripts and doing so on both Unix/MacOSX and Windows,
they'll end up having to learn the same lessons we have).

--
Glynn Clements <glynn@gclements.plus.com>

On 02/02/15 19:07, Glynn Clements wrote:

Moritz Lennert wrote:

For now, I've added a warning that you should know what you are doing if
you want to use these calls.

Why not just use suitable examples, e.g. using subprocess.Popen()
(without shell=True), with the caveat that on Windows it won't work
for scripts, only compiled executables.

But what exactly is the problem with using subprocess.call with
shell=True ?

It requires you to construct the correct string. If the command string
is a single literal string with all argument values consisting of
alphanumerics plus "safe" punctuation (i.e. characters which have no
meaning to any shell), that's simple enough.

But if any of the arguments are variable and *could* contain
characters which are meaningful to a shell, constructing the correct
string (the one which results in the called program's argv having
the intended values) isn't straightforward (and the exact rules vary
between platforms).

It's much simpler (and safer) to just remove the shell from the equation
altogether and pass the individual arguments directly, rather than
trying to construct a string which, when deconstructed by the shell,
will produce the correct result.

Security issues ? Difficulties in calling shell scripts ?

Security would be a problem, but if you're dealing with potentially
malicious input, it's the least of your problems compared to the rest
of GRASS. Calling scripts on Windows is a different problem (setting
shell=True "solves" one problem while introducing more).

AFAICT, it's just a wrapper around Popen.wait(), isn't it?

subprocess.call() is a wrapper around Popen() and Popen.wait().
Exactly the same issue applies to Popen() itself (it applies to
.call() *because* it applies to Popen() itself).

Ok, thanks for all the explanations.

I've now replaced this with:

   subprocess.Popen(['r.watershed', 'elevation=elevation', 'threshold=10000', 'stream=raster_streams'])

or would it be better with .wait():

   import subprocess
   subprocess.Popen(['r.watershed', 'elevation=elevation', 'threshold=10000', 'stream=raster_streams']).wait()

subprocess.call() is fine. It's using shell=True which is the problem.

For simple examples, I'd suggest using subprocess.call, and referring
them to the python.org documentation for the (2.x) subprocess module
for anything else.

Ok, I put .call back in.

And even though within GRASS Python it should always be Popen(), maybe
there are situations out there where calling a module via call() is
justifiable...

Well, depending upon what you mean by "GRASS Python", it should
arguably be grass.script.run_command() or similar.

Even if the wiki page isn't the right place to introduce that, it's
probably worth mentioning that it exists before users end up
recreating the wheel (e.g. if they care about being able to run both
executables and scripts and doing so on both Unix/MacOSX and Windows,
they'll end up having to learn the same lessons we have).

This is what the warning before the system call examples says:

"Warning: In many cases, system calls such as these demand that you really know what you are doing. If you want to program in Python, you are encouraged to rather use the existing Python APIs explained below instead of such system calls."

And then it goes on with:

"These system calls are easy to handle when no output is expected from the GRASS module. When output needs to be collected then the programming task already becomes a little harder unless you know what you are doing. Equally, they can be tricky when introducing variable arguments and special characters. It is for this reason that the Python GRASS libraries explained in the next section were developed."

I hope this is explicit enough.

Again, the aim of this page is not to give a programming course, but rather to show that you can program with and for GRASS in different ways, from single system calls to GRASS modules to full-fledged programming with the C-API.
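
As an illustration of the "output needs to be collected" case, here is a rough sketch of what a script has to do by hand when it calls a module directly. A Python child process replays the `g.region -g` output quoted earlier in this thread (assumption: no GRASS session is available where this runs), and `parse_key_value()` is a hypothetical helper written for this example.

```python
import subprocess
import sys

# Sample text copied from the g.region output quoted earlier in this thread.
SAMPLE = "\n".join([
    "n=318500", "s=-16000", "w=124000", "e=963000",
    "nsres=500", "ewres=500", "rows=669", "cols=1678", "cells=1122582",
]) + "\n"


def parse_key_value(text):
    """Turn lines of key=value text into a dict of strings."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            result[key.strip()] = value.strip()
    return result


# Stand-in for the module call: the child just writes the sample to stdout.
child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", SAMPLE]
output = subprocess.check_output(child, universal_newlines=True)

region = parse_key_value(output)
print(region["n"], region["rows"], region["cols"])
```

Capturing and parsing module output this way is the kind of boilerplate the Python GRASS libraries are meant to spare the script writer.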

Moritz

Moritz Lennert wrote:

This is what the warning before the system call examples says:

"Warning: In many cases, system calls such as these demand that you
really know what you are doing. If you want to program in Python, you
are encouraged to rather use the existing Python APIs explained below
instead of such system calls."

And then it goes on with:

"These system calls are easy to handle when no output is expected from
the GRASS module. When output needs to be collected then the programming
task already becomes a little harder unless you know what you are doing.
Equally, they can be tricky when introducing variable arguments and
special characters. It is for this reason that the Python GRASS
libraries explained in the next section were developed."

I hope this is explicit enough.

I still think that the C version should either be removed or use e.g.
posix_spawn() or G_spawn*() instead of system() (note: the Windows
spawn*() functions aren't safe; they merely concatenate their
arguments with spaces in between, which is no better than system()).

The perl version should be changed to use the list form of system()
rather than the string form, e.g.

  @args = ("r.watershed", "elevation=elevation", "threshold=10000", "stream=raster_streams");
  system(@args);

Under no circumstances should a command be passed as a string, even in
examples. Especially in examples. Telling people not to use that
approach won't work, particularly when it's the only approach
demonstrated.

--
Glynn Clements <glynn@gclements.plus.com>

On 03/02/15 16:35, Glynn Clements wrote:

Moritz Lennert wrote:

This is what the warning before the system call examples says:

"Warning: In many cases, system calls such as these demand that you
really know what you are doing. If you want to program in Python, you
are encouraged to rather use the existing Python APIs explained below
instead of such system calls."

And then it goes on with:

"These system calls are easy to handle when no output is expected from
the GRASS module. When output needs to be collected then the programming
task already becomes a little harder unless you know what you are doing.
Equally, they can be tricky when introducing variable arguments and
special characters. It is for this reason that the Python GRASS
libraries explained in the next section were developed."

I hope this is explicit enough.

I still think that the C version should either be removed or use e.g.
posix_spawn() or G_spawn*() instead of system() (note: the Windows
spawn*() functions aren't safe; they merely concatenate their
arguments with spaces in between, which is no better than system()).

The perl version should be changed to use the list form of system()
rather than the string form, e.g.

  @args = ("r.watershed", "elevation=elevation", "threshold=10000", "stream=raster_streams");
  system(@args);

Under no circumstances should a command be passed as a string, even in
examples. Especially in examples. Telling people not to use that
approach won't work, particularly when it's the only approach
demonstrated.

Ok, I've removed the C example and replaced the perl example with your version.

Thanks !

Moritz