Hi Wiley,
On Wed, Oct 16, 2013 at 6:59 AM, Wiley Bogren <wiley.bogren@gmail.com> wrote:
> hi GRASS community!
>
> I'm amazed how well this software handles vector operations - especially the
> overlay operation seems unparalleled in open source software. Thank you
> very much to everyone who has been involved in the development process!
>
> I would like to script a workflow where I apply the same set of operations
> on a few hundred sets of shapefiles, consisting of v.in.ogr, several sets of
> v.overlay, some database operations and v.out.ogr. The shapefiles are
> 20-30MB apiece, containing many polygons, each with many vertices.
>
> Is there a difference in speed or processor efficiency between the different
> scripting approaches? By which I mean python vs bash shell, and within the
> GRASS environment vs calling the functions from outside the environment
> (like via python grass.script).

Sorry for the late response...
I have imported several files in parallel using Python's multiprocessing module, with:
{{{
from multiprocessing import Queue, Process, cpu_count
from os.path import split
from subprocess import Popen

from grass.pygrass.functions import findfiles


def spawn(func):
    """Wrap func in a worker loop that reads jobs from q_in until it gets None."""
    def fun(q_in, q_out):
        while True:
            path, cmdstr = q_in.get()
            if path is None:
                break
            q_out.put(func(path, cmdstr))
    return fun


def mltp_importer(dirpath, match, cmdstr, func, nprocs=cpu_count()):
    q_in = Queue(1)
    q_out = Queue()
    procs = [Process(target=spawn(func), args=(q_in, q_out))
             for _ in range(nprocs)]
    for proc in procs:
        proc.daemon = True
        proc.start()

    # send one job per matching file
    paths = list(findfiles(dirpath, match))
    for path in paths:
        q_in.put((path, cmdstr))
    # send one stop signal per worker
    for _ in procs:
        q_in.put((None, None))
    # collect the results before joining: a worker cannot exit
    # until everything it put on q_out has been read
    results = [q_out.get() for _ in paths]
    for proc in procs:
        proc.join()
    return results


def importer(path, cmdstr):
    # layer name: file name without the directory and the '.shp' extension
    name = split(path)[-1][:-4]
    popen = Popen(cmdstr.format(path=path, name=name), shell=True)
    popen.wait()
    # the last item is True when the command exited with status 0
    return path, name, popen.returncode == 0


DIR = '/data/gis/data/Aviemore/shp'
CMD = 'v.in.ogr dsn={path} layer={name} output={name} -o --o'

processed = mltp_importer(DIR, '*.shp', CMD, importer)

# check for errors
errors = [p for p in processed if not p[2]]
if errors:
    # do something
    pass
}}}
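
Since each worker above only waits for an external command to finish, a thread pool may be enough. The following is a minimal sketch of that variant, not a tested replacement: it assumes Python 3 (`concurrent.futures`, `shlex.quote`), and `build_command`, `run_one` and `run_all` are hypothetical helper names introduced only for this illustration. The command template is the same one as above.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor
from os.path import basename, splitext

# same template as in the script above
CMD = 'v.in.ogr dsn={path} layer={name} output={name} -o --o'


def build_command(template, path):
    """Fill the template; the layer name is the file name without its extension."""
    name = splitext(basename(path))[0]
    # quote both values so that paths containing spaces survive the shell
    return template.format(path=shlex.quote(path), name=shlex.quote(name))


def run_one(template, path):
    """Run one command and return (path, succeeded)."""
    returncode = subprocess.call(build_command(template, path), shell=True)
    return path, returncode == 0


def run_all(template, paths, nworkers=4):
    """Run the commands nworkers at a time; results keep the order of paths."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        return list(pool.map(lambda p: run_one(template, p), paths))
```

Inside a running GRASS session this could be called as `run_all(CMD, list_of_shapefile_paths)`, and the failed imports are then the entries whose second item is False. The threads only start and wait for the `v.in.ogr` processes, so the actual work still runs in parallel in separate processes.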
I hope this helps.
The code is loosely based on: http://stackoverflow.com/a/16071616