Hi Pietro,
On 31/10/13 00:34, Pietro Zambelli wrote:
Hi Moritz,
I'm writing some modules (in Python) to do basically the same thing.
Great! Then I won't continue on that and will rather wait for your stuff. Do you have code yet (apart from i.segment.hierarchical)? Don't hesitate to publish early.
I think once the individual elements are there, it should be quite easy to cook up a little binding module which would allow choosing the segmentation parameters, the variables to use for polygon characterization, the classification algorithm, etc., and then launch the whole process.
I'm trying to apply an object-based classification to a quite big area
(the region has more than 14 billion cells).
At the moment I'm working with a smaller area of "only" ~1 billion
cells, but it is still quite challenging.
14 billion _is_ quite ambitious.
I guess we should focus on getting the functionality first, and then think about optimisation for size...
To speed up the segmentation process I wrote the i.segment.hierarchical
module [0]. It splits the region into several tiles, computes the
segments for each tile, patches all the tiles together, and runs
i.segment one last time using the patched map as a seed.
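Just to be sure I follow the logic, here is a very rough, serial sketch of how I picture the workflow (2x2 tiles only, made-up group and map names, and glossing over the fact that segment IDs have to be re-numbered between tiles):

import grass.script as gscript

# Split the current region into 2x2 tiles
reg = gscript.region()
n, s, e, w = reg["n"], reg["s"], reg["e"], reg["w"]
mid_ns = s + (n - s) / 2.0
mid_ew = w + (e - w) / 2.0

tiles = []
for i, (tn, ts) in enumerate([(n, mid_ns), (mid_ns, s)]):
    for j, (te, tw) in enumerate([(e, mid_ew), (mid_ew, w)]):
        tile = "seg_tile_%d_%d" % (i, j)
        gscript.run_command("g.region", n=tn, s=ts, e=te, w=tw)
        gscript.run_command("i.segment", group="imagery",
                            output=tile, threshold=0.05)
        tiles.append(tile)

# Back to the full region: patch the tiles and reseed a short final run
gscript.run_command("g.region", n=n, s=s, e=e, w=w)
gscript.run_command("r.patch", input=",".join(tiles), output="seg_patched")
gscript.run_command("i.segment", group="imagery", output="seg_final",
                    threshold=0.05, seeds="seg_patched", iterations=3)

Is that about right?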
Any reason, other than a preference for git over svn, for not putting your module into grass-addons?
For a region of 24k rows by 48k cols it required less than two hours to
run and patch all the tiles, and more than 5 hours to run the "final"
i.segment over the patched map (using only 3 iterations!).
That's still only 7 hours for the segmentation of a billion-cell image. Not bad compared to other solutions out there...
From my experience I can say that v.to.db is terribly slow if you
apply it to a vector map with more than 2.7 million areas.
So I've developed a Python function that computes the same values, but
it is much faster than the v.to.db module, and it should be possible to
split the operation into several processes for a further speed-up...
(It is still under testing.)
Does your Python module load the values into an attribute table? I would guess that that's the slow part in v.to.db. Generally, I think that's another field where optimization would be great (if possible): database interaction, notably writing to tables. IIUC, in v.to.db there is a separate update operation for each feature. I imagine that there must be a faster way to do this...
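Just thinking aloud: with the default SQLite backend, wrapping all the updates into a single transaction (one executemany instead of N autocommitted UPDATEs) should already make a big difference. A minimal sketch, with made-up table/column names:

import sqlite3

def bulk_update(db_path, table, column, values):
    """values: dict mapping category -> computed value."""
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    # One statement, many parameter sets, instead of one UPDATE per feature
    cur.executemany(
        "UPDATE %s SET %s = ? WHERE cat = ?" % (table, column),
        [(val, cat) for cat, val in values.items()],
    )
    conn.commit()   # single commit for all features
    conn.close()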
On Wednesday 30 Oct 2013 21:04:22 Moritz Lennert wrote:
> - It uses the v.class.mlpy addon module for classification, so that
> needs to be installed. Kudos to Vaclav for that module ! It currently
> only uses the DLDA classifier. The mlpy library offers many more, and I
> think it should be quite easy to add them. Obviously, one could also
> simply export the attribute table of the segments and of the training
> areas to csv files and use R to do the classification.
I've extended it to use tree/k-NN/SVM machine learning from MLPY [1]
(I've also used Parzen, but the results were not good enough) and to
work also with the scikit [2] classifiers.
You extended v.class.mlpy? Is that code available somewhere?
Scikit seems to have a larger community and should be easier to
install than MLPY, and, last but not least, it seems generally faster [3].
I don't have any preferences on that. Colleagues here use R machine learning tools.
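Just so we are talking about the same thing, this is roughly how I picture the scikit route, starting from exported attribute tables (a minimal sketch; the file names, feature layout and SVC choice are my assumptions, not your actual code):

import numpy as np
from sklearn.svm import SVC

# training.csv: one row per training segment, class label in the last column
train = np.loadtxt("training.csv", delimiter=",", skiprows=1)
X_train, y_train = train[:, :-1], train[:, -1].astype(int)

# segments.csv: one row per segment to classify, same feature columns
X_all = np.loadtxt("segments.csv", delimiter=",", skiprows=1)

clf = SVC(kernel="rbf")  # k-NN or trees would plug in the same way
clf.fit(X_train, y_train)
np.savetxt("classes.csv", clf.predict(X_all), fmt="%d")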
> - Many other variables could be calculated for the segments: other
> texture variables (possibly variables by segment, not as average of
> pixel-based variables, cf [1]), other shape variables (cf the new work
> of MarkusM on center lines and skeletons of polygons in v.voronoi), band
> indices, etc. It would be interesting to hear what most people find
> useful.
I'm also working on adding a C function to the GRASS library to compute
the barycentre and the polar second moment of area (or moment of
inertia), which returns a number that is independent of the orientation
and dimension.
Great! I guess the more the merrier.
See also [1]. Maybe it's just a small additional step to add that at the same time?
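Just to check that I understand the invariance: I assume you normalise the polar moment about the barycentre by the squared area, along the lines of this pure-Python sketch of what I imagine the C function will do (1.0 for a perfect disc, larger for less compact shapes):

import numpy as np

def normalized_polar_moment(x, y, cell_area=1.0):
    """x, y: arrays of cell-centre coordinates of one segment."""
    xc, yc = x.mean(), y.mean()            # barycentre
    r2 = (x - xc) ** 2 + (y - yc) ** 2
    J = (r2 * cell_area).sum()             # polar second moment of area
    A = len(x) * cell_area                 # segment area
    return 2.0 * np.pi * J / A ** 2        # scale- and rotation-invariant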
> - I do the step of digitizing training areas in the wxGUI digitizer
> using the attribute editing tool and filling in the 'class' attribute
> for those polygons I find representative. As already mentioned in
> previous discussions [2], I do think that it would be nice if we could
> have an attribute editing form that is independent of the vector
> digitizer.
I use i.gui.class to generate the training vector map, then use this
map to select the training areas, and export the final results into
a file (at the moment only csv and npy formats are supported).
How do you do that? Do you generate training points (or small areas) and then select the areas these points fall into?
I thought it best to select training areas among the actual polygons coming out of i.segment.
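In other words, something along these lines? (made-up map names; 'overlap' picks every segment touched by a training feature):

import grass.script as gscript

# Keep only the i.segment polygons that intersect the training map
gscript.run_command(
    "v.select",
    ainput="segments",           # polygons from i.segment + r.to.vect
    binput="training",           # points/areas digitised with i.gui.class
    output="training_segments",
    operator="overlap",
)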
Some days ago I discussed with MarkusM the idea that maybe I could do a
GSoC next year to modify the i.segment module to automatically split
the domain into tiles, run as a multiprocess, and then "patch" only the
segments that are on the borders of the tiles; this solution should be
much faster than my current solution [0].
Great idea!
Moreover, we should consider skipping the transformation of the
segments into vector for extracting the shape parameters, and instead
extract shape and other parameters (mean, median, skewness, std, etc.)
directly, as a last step before freeing the memory from the segment
structures, writing a csv/npy file.
I guess it is not absolutely necessary to go via vector. You could always leave the option to vectorize the segments, import the parameter file into a table and then link that table to the vector.
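A sketch of that option (assuming GRASS 7 parameter names, made-up map/table/file names, and that the csv carries the segment category in a 'cat' column):

import grass.script as gscript

# Import the per-segment statistics file as an attribute table
gscript.run_command("db.in.ogr", input="params.csv", output="segment_params")

# Attach the table to the vectorised segments as a second layer
gscript.run_command(
    "v.db.connect",
    map="segments",
    table="segment_params",
    key="cat",
    layer=2,
)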
Moritz
[1] https://trac.osgeo.org/grass/ticket/2122