Dear GRASS Developers,
I am pleased to announce that the pygrass library, developed during
the Google Summer of Code 2012, is now on GRASS trunk and need to be tested.
Before to continue the announcement, if you prefer the html version of this
message please see: https://gist.github.com/gists/3864411
The pygrass library has been developed take into account three main aspects,
in the order:
1. *simplicity*, GIS it's an already complex subject, and we want to make
the API as simple/natural/intuitive as possible but at the same time
trying to respect the GRASS work-flow and nomenclature to make it easier
to find information on the GRASS developer manual;
2. *flexibility* even if pygrass want to provide an high level map
interaction, internally used some classes that work with the
GRASS API at very low level, such as Segment, Bbox, Ilist classes.
Every time that a C struct has been used by a class, to leave the user
free to implement their own logic, the ctypes pointer of the C strct
it's available under the attribute name that start with "c_*";
3. *performance* pygrass makes heavy use of the C GRASS functions
every time that this is possible.
The pygrass library is divided in three main parts: Raster, Vector, and Modules.
Below some small samples, of an interactive python shell, are reported to better
understand how users and developers could interact with the *pygrass* library.
All the following examples use the maps present in the North Carolina mapset.
Raster
All the raster classes read and write meta-data like: categories and history.
The *pygrass* interface for raster maps is divided in 4 classes that represent
different ways to interact with rasters. In order to give greater freedom of
implementation to users and developers, each class uses a different C API,
providing the tools to fit different needs.
The **RasterRow** class reads the content of the raster row by row and writes it
in a sequential mode: row after row. Read and write the same map at the same
time is not supported by the RasterRow class.
The **RasterRowIO** class implements a row cache that allows users to read and
re-read raster rows randomly.
The **RasterSegment** class divides the map into several tiles, each tile is
saved into a file.With this class it is possible to read and write the pixel
value randomly at the same time in the same map.
The **RasterNumpy** class inherits from a *numpy.memmap* class and allows
users to interact with the map as numpy matrix.
All the Raster classes shared common methods to open, read and get
raster information.
Do simple operation with the *RasterRow* class: ::
>>> from grass.pygrass.raster import RasterRow
>>> elev = RasterRow('elevation')
>>> elev.exist()
True
>>> elev.name # get the raster name
'elevation'
>>> elev.mapset
'PERMANENT'
Now open the raster in a read mode, get raster information and read the rows: ::
>>> elev.open('r') # open in read mode
>>> elev.is_open()
True
>>> elev.range
(55.578792572021484, 156.32986450195312)
>>> for row in elev[:5]: # show the first 5 rows
... print(row[:3]) # show the first 3 columns of each row
...
[ 141.99613953 141.27848816 141.37904358]
[ 142.90461731 142.39450073 142.68611145]
[ 143.81854248 143.54707336 143.83972168]
[ 144.56524658 144.58493042 144.86477661]
[ 144.99488831 145.22894287 145.57142639]
>>> for row in elev[:5]:
... print(row[:3] < 144)
...
[1 1 1]
[1 1 1]
[1 1 1]
[0 0 0]
[0 0 0]
Open a new raster map, save, rename and remove: ::
>>> new = RasterRow('new')
>>> new.exist()
False
>>> new.open('w', 'CELL') # open a new CELL map in write mode
>>> for row in elev:
... new.put_row(row < 144) # write the boolean row
...
>>> new.close()
>>> new.mapset
'user1'
>>> new.name
'new'
>>> new.name = 'new_name' # rename the map changing the attribute value
>>> new.name
'new_name'
>>> new.exist()
True
>>> new.remove() # remove the map
>>> new.exist()
False
>>> elev.close()
Almost the same operations are available for the other Raster classes,
please see the documentation:
http://www.ing.unitn.it/~zambelli/projects/pygrass/raster.html
Vector
The *pygrass* interface defines two classes to work with vector maps, the first
class **Vector** loads a vector map without the GRASS topology, while the
**VectorTopo** class supports the topology. The *pygrass* library implements the
geometry features: **Point**, **Line**, **Boundary**, **Area**, **Isle**, each
class has its method to return vector information or to compute its properties
such as the length of a line, the distance between two points, or between a
point and a line, etc.
Here just small samples to better understand how it is now possible to interact
with the GRASS vectors and features are reported: ::
>>> from grass.pygrass.vector import VectorTopo
>>> municip = VectorTopo('boundary_municp_sqlite')
>>> municip.open()
>>> municip.number_of("areas")
3579
>>> municip.number_of("islands")
2629
>>> municip.number_of('pizzas') # suggest the right vector type if wrong
Traceback (most recent call last):
...
ValueError: vtype not supported, use one of: 'areas', 'dblinks', 'faces',
'holes', 'islands', 'kernels', 'line_points', 'lines', 'nodes',
'updated_lines',
'updated_nodes', 'volumes'
Suppose that we want to select all and only the areas that have a surface bigger
than 10000 m2: ::
>>> big = [area for area in municip.viter('areas')
... if area.alive() and area.area() >= 10000]
Then it is possible to sort the areas, with: ::
>>> from operator import methodcaller as method
>>> big.sort(key = method('area'), reverse = True) # sort the list
>>> for area in big[:3]:
... print area, area.area()
...
Area(3102) 697521857.848
Area(2682) 320224369.66
Area(2552) 298356117.948
The *pygrass* library implements classes to access the Table attributes, to
manages the database connection, to make queries, to add and remove columns.
Read and write the **Link** connection with the database of a vector map: ::
>>> municip.dblinks
DBlinks([[Link(1, boundary_municp, sqlite)]])
>>> link = municip.dblinks[1]
>>> link.number
1
>>> link.name
'boundary_municp'
>>> link.table_name
'boundary_municp_sqlite'
>>> link.driver
'sqlite'
>>> link.database[-30:]
'north_carolina/user1/sqlite.db'
>>> link.key
'cat'
>>> link.name = 'boundary'
>>> link.driver = 'pg'
>>> link.database = 'host=localhost,dbname=grassdb'
>>> link.key = 'gid'
From the Link object it is possible to instantiate a **Table** object where the
user could make simple query with the **Filters** object: ::
>>> table = link.table()
>>> table.filters.select('cat', 'COUNTY',
'AREA','PERIMETER').order_by('AREA').limit(3)
Filters('SELECT cat, COUNTY, AREA, PERIMETER FROM
boundary_municp_sqlite ORDER BY AREA LIMIT 3;')
>>> cur = table.execute()
>>> for row in cur.fetchall():
... print repr(row)
... # cat, COUNTY, AREA, PERIMETER
(1, u'SURRY', 0.0, 1415.331)
(2, u'SURRY', 0.0, 48286.011)
(3, u'CASWELL', 0.0, 5750.087)
In this way it is possible to work with table properties like name/rename the
table and add/remove/rename/cast the columns of the table: ::
>>> table.name
'boundary_municp_sqlite'
>>> table.columns
Columns([(u'cat', u'integer'), ..., (u'ACRES', u'double precision')])
>>> table.columns.names()
[u'cat', u'OBJECTID', u'AREA', u'PERIMETER', ..., u'ACRES']
>>> table.columns.types()
[u'integer', u'integer', u'double precision', ..., u'double precision']
>>> table.columns.add('n_pizza', 'int4')
>>> table.columns.names()[-1]
u'n_pizzas'
>>> table.columns.rename(u'n_pizzas', u'n_pizzas_per_person')
>>> table.columns.names()[-1]
u'n_pizzas_per_person'
>>> table.columns.cast(u'n_pizzas_per_person', 'float8')
>>> table.columns.items()[-1]
(u'n_pizzas_per_person', u'float8')
>>> table.columns.drop(u'n_pizzas_per_person')
For more examples with the Vector class, please see the documentation:
http://www.ing.unitn.it/~zambelli/projects/pygrass/vector.html
Modules
The *pygrass* **Module** class is compatible with the *grass.run_command*
syntax. ::
>>> from grass.pygrass.modules import Module
>>> slope_aspect = Module("r.slope.aspect", elevation='elevation',
... slope='slp', aspect='asp',
... format='percent', overwrite=True)
But it is possible to create a run able module object, change some attributes
and run later: ::
>>> slope_aspect = Module("r.slope.aspect", elevation='elev',
... slope='slp', aspect='asp',
... format='percent', overwrite=True, run_=False)
>>> slope_aspect.inputs['elevation']
Parameter <elev> (required:yes, type:raster, multiple:no)
>>> slope_aspect.inputs["elevation"].value = "elevation"
>>> slope_aspect.inputs["format"]
Parameter <format> (required:no, type:string, multiple:no)
>>> print slope_aspect.inputs["format"].__doc__ # get help for
the input parameter
format: 'degrees', optional, string
Format for reporting the slope
Values: 'degrees', 'percent'
>>> slope_aspect.inputs["format"].value = 'percents' # manage and
check the errors
Traceback (most recent call last):
...
ValueError: The Parameter <format>, must be one of: ['degrees', 'percent']
>>> slope_aspect.inputs["format"].value = 'percent'
>>> slope_aspect.run()
Or it is possible to initialize a module, and give the parameters later, like a
python function: ::
>>> slope_aspect = Module("r.slope.aspect")
>>> slope_aspect(elevation='elevation', slope='slp', aspect='asp',
... format='percent', overwrite=True)
Moreover the *pygrass* Module allows the user to run GRASS modules in a
different process and to manage (wait/kill/terminate) the process, and to
manage/record the output of the standard error and the standard output, in a way
that is easier than using the actual Python API of GRASS. ::
>>> slope_aspect = Module('r.slope.aspect')
>>> slope_aspect(elevation='elevation', slope='slp', aspect='asp',
... overwrite=True, finish_=False)
>>> slope_aspect.popen.wait() # *.kill(), *.terminate()
0
>>> out, err = slope_aspect.popen.communicate()
>>> print err
100%
Aspect raster map <asp> complete
Slope raster map <slp> complete
For more examples with the Module class, please see the documentation:
http://www.ing.unitn.it/~zambelli/projects/pygrass/modules.html