[GRASS-dev] Introduction and GSoC

Hi everyone,

I'm relatively new to the GRASS world, and thought I would send a
quick introduction. After spending some time in the "real" world as
an engineer, I have returned to university life and have a first goal
of a master's degree in computer science. My research currently
involves data mining in an agricultural setting, and the tools include
GRASS and R.

I'm trying to decide if I've come up the learning curve far enough to
contribute new code for GRASS as part of the Google Summer of Code. I
must decide soon, since the deadline for student applications is approaching
fast :)

I've seen the list of project ideas on the wiki; there is a lot there!
Would anyone want to suggest a few that could be a good place for a
new arrival to start, and/or would tie in well to data mining? I'm
happy to discuss on the list, or you can contact me at eric dot momsen
at gmail.

Thanks,
Eric

Hi Eric!

It might help us in trying to classify you if you could tell a bit about your programming experience. What languages are you comfortable with? I see that you are interested in data mining, which is unfortunately something I know very little about… Did you see any ideas which interested you? You can of course also come up with your own project. Is there for instance some data mining module missing from GRASS? That could be interesting, especially if you prepare a good solid proposal for it.

–Wolf

grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

On Fri, Mar 23, 2012 at 2:41 PM, Wolf Bergenheim
<wolf+grass@bergenheim.net> wrote:

Hi Eric!

It might help us in trying to classify you if you could tell a bit about
your programming experience. What languages are you comfortable with? I see

So far I have been using mostly java. I wrote a small implementation
for a pattern mining algorithm in R, and will be doing a little prolog
this semester. Years ago I helped set up a data transfer/validation
system ( Oracle database <-> visual basic <-> website), but those
skills are pretty rusty now. I certainly expect to learn more
languages.

that you are interested in data mining, which is unfortunately something I
know very little about... Did you see any ideas which interested you? You

The first project for animating raster/(and possibly vector) time
series data was interesting, as well as a number of the imagery
topics.

can of course also come up with your own project. Is there for instance some
data mining module missing from GRASS? That could be interesting, especially
if you prepare a good solid proposal for it.

I was actually hoping to get farther along with my own research to be
able to (selfishly) propose something directly helpful to my work. (I
do understand the goal is the code, but if I could meet both goals...)
So far I seem to be stuck on the pre-processing side of things. I
will touch base with my adviser again next week to see if we can come
up with something interesting to propose. If we do come up with
something, should I sketch the idea here first? Is someone available
to mentor?

-Eric

On Fri, Mar 23, 2012 at 22:04, Eric Momsen <eric.momsen@gmail.com> wrote:

On Fri, Mar 23, 2012 at 2:41 PM, Wolf Bergenheim
<wolf+grass@bergenheim.net> wrote:

Hi Eric!

It might help us in trying to classify you if you could tell a bit about
your programming experience. What languages are you comfortable with? I see

So far I have been using mostly java. I wrote a small implementation
for a pattern mining algorithm in R, and will be doing a little prolog
this semester. Years ago I helped set up a data transfer/validation
system ( Oracle database ↔ visual basic ↔ website), but those
skills are pretty rusty now. I certainly expect to learn more
languages.

Hmm, if you are strongest in Java, I’d recommend looking at uDig (http://udig.refractions.net/confluence/display/HACK/Summer+of+Code), since GRASS is C and Python. It might be a better match.

that you are interested in data mining, which is unfortunately something I
know very little about… Did you see any ideas which interested you? You

The first project for animating raster/(and possibly vector) time
series data was interesting, as well as a number of the imagery
topics.

Cool. You should probably think about more than one topic and maybe prepare several applications.

can of course also come up with your own project. Is there for instance some
data mining module missing from GRASS? That could be interesting, especially
if you prepare a good solid proposal for it.

I was actually hoping to get farther along with my own research to be
able to (selfishly) propose something directly helpful to my work. (I
do understand the goal is the code, but if I could meet both goals…)
So far I seem to be stuck on the pre-processing side of things. I
will touch base with my adviser again next week to see if we can come
up with something interesting to propose. If we do come up with
something, should I sketch the idea here first? Is someone available
to mentor?

There might be someone willing to mentor. That remains to be seen. Would your adviser be interested in helping to co-mentor you, along with a GRASS mentor? It would definitely be a good idea to discuss your ideas here first; it will improve your chances of getting selected.

–Wolf

I am reading more about image classification (from
http://grass.osgeo.org/wiki/GRASS_SoC_Ideas ):

4. Implement image segmentation algorithms and tools
5. Implement region-based classification
6. Implement hierarchical classification tools (e.g. being able to
create a large class "forest", with subclasses of different types of
forests)

I see Hamish is interested in mentoring the parallelization portion of
that list. Are these other ideas orphans, or is someone available
that could discuss the background and needs of the community around
these ideas (and/or mentor...)? Thanks!

(And uDig conversation continued below...)

On Fri, Mar 23, 2012 at 5:47 PM, Wolf Bergenheim
<wolf+grass@bergenheim.net> wrote:

On Fri, Mar 23, 2012 at 22:04, Eric Momsen <eric.momsen@gmail.com> wrote:

On Fri, Mar 23, 2012 at 2:41 PM, Wolf Bergenheim
<wolf+grass@bergenheim.net> wrote:
> Hi Eric!
>
> It might help us in trying to classify you if you could tell a bit about
> your programming experience. What languages are you comfortable with? I
> see

So far I have been using mostly java. I wrote a small implementation
for a pattern mining algorithm in R, and will be doing a little prolog
this semester. Years ago I helped set up a data transfer/validation
system ( Oracle database <-> visual basic <-> website), but those
skills are pretty rusty now. I certainly expect to learn more
languages.

Hmm, if you are strongest in Java, I'd recommend looking at uDig
(http://udig.refractions.net/confluence/display/HACK/Summer+of+Code), since
GRASS is C and Python. It might be a better match.

I'm happy to learn Python; it seems the software goal is more
important than the language that will be used.

My impression is that uDig's goal is to provide an easy way for the user
to *manually* explore data from any available source. The
visualization and intuitive interface look great. But am I correct
that for doing statistics and number crunching, I should be focusing
on GRASS and R?

> that you are interested in data mining, which is unfortunately something
> I
> know very little about... Did you see any ideas which interested you?
> You

The first project for animating raster/(and possibly vector) time
series data was interesting, as well as a number of the imagery
topics.

Cool. You should probably think about more than one topic and maybe prepare
several applications.

> can of course also come up with your own project. Is there for instance
> some
> data mining module missing from GRASS? That could be interesting,
> especially
> if you prepare a good solid proposal for it.

I was actually hoping to get farther along with my own research to be
able to (selfishly) propose something directly helpful to my work. (I
do understand the goal is the code, but if I could meet both goals...)
So far I seem to be stuck on the pre-processing side of things. I
will touch base with my adviser again next week to see if we can come
up with something interesting to propose. If we do come up with
something, should I sketch the idea here first? Is someone available
to mentor?

There might be someone willing to mentor. That remains to be seen. Would
your adviser be interested in helping to co-mentor you, along with a GRASS
mentor? It would definitely be a good idea to discuss your ideas here first;
it will improve your chances of getting selected.

--Wolf

On 30/03/12 23:56, Eric Momsen wrote:

I am reading more about image classification (from
http://grass.osgeo.org/wiki/GRASS_SoC_Ideas ):

4. Implement image segmentation algorithms and tools
5. Implement region-based classification
6. Implement hierarchical classification tools (e.g. being able to
create a large class "forest", with subclasses of different types of
forests)

I see Hamish is interested in mentoring the parallelization portion of
that list. Are these other ideas orphans, or is someone available
that could discuss the background and needs of the community around
these ideas (and/or mentor...)? Thanks!

I'm the one who added these ideas to the list, as I see that the lack of such tools is one of the reasons colleagues do not adopt GRASS. However, I'm not an expert in the matter and am not sure I would be very helpful as a mentor (although I'm willing to try).

Concerning the ideas:

4. Currently GRASS does not provide any image segmentation as such. i.smap contains image segmentation in its process, but the user cannot get segmented outputs. Many algorithms exist and it's an ongoing field of research. FLOSS software that provides such algorithms includes Orfeo Toolbox (OTB), SAGA, R, Sextante (?) and probably a whole series of others. I think the implementation of a series of such algorithms could be a project on its own.

5. One of the main applications of image segmentation today is in region-based classification of very high resolution imagery. Because at current resolutions individual objects are composed of many pixels, it is often more efficient to first identify "objects" or homogeneous multi-pixel regions in the image through segmentation and then to classify these regions. OTB provides this I think, but I don't know if any other FLOSS software does. Idea 5 depends on 4, so it is only feasible if 4 is limited to the strict minimum in terms of segmentation algorithms, with the focus then put on 5. That may be a bit too ambitious.

6. In the current classification algorithms in GRASS each designated class of pixels is on the same hierarchical level as others. However, it is often interesting to provide the option to classify an image first in a rough manner into a series of base classes (built-up, vegetation, naked soils) and then to refine classification within each of these classes (e.g. built-up into high-density / low-density, vegetation into forest, grasslands, etc), but to keep the hierarchy, i.e. to allow extracting an image (and a legend) of the classification at each level.
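
One way to picture the hierarchy described in point 6 is with class codes whose leading digit identifies the base class. The sketch below is only an illustration under that assumption: the code scheme, the legend contents, and the helper names `base_class` and `legend_at_level` are hypothetical and not part of GRASS.

```shell
#!/bin/sh
# Hypothetical legend, one "code|label" pair per line.
# Codes ending in 0 are base classes; other codes are subclasses.
legend='10|built-up
11|built-up / high-density
12|built-up / low-density
20|vegetation
21|vegetation / forest
22|vegetation / grasslands
30|naked soils'

# Map a detailed class code to its base class code (integer division).
base_class() {
    echo $(( $1 / 10 * 10 ))
}

# Print the legend for one level: 1 = base classes, 2 = subclasses.
legend_at_level() {
    printf '%s\n' "$legend" | awk -F'|' -v level="$1" '
        level == 1 && $1 % 10 == 0 { print }
        level == 2 && $1 % 10 != 0 { print }'
}

legend_at_level 1
base_class 21
```

With such a scheme, the rough classification can be recovered from the detailed one at any time, which is the "keep the hierarchy" requirement in its simplest form.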

Hope this helps and maybe motivates others to join-in as mentors.

Moritz

On Mon, Apr 2, 2012 at 1:42 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:
...

4. Currently GRASS does not provide any image segmentation as such. i.smap
contains image segmentation in its process, but the user cannot get
segmented outputs.

Just a small comment - not even this one?
http://grass.osgeo.org/wiki/GRASS_AddOns#r.seg
r.seg performs image segmentation and discontinuity detection (based on
the Mumford-Shah variational model).

Perhaps it could be extended?

Markus

On 02/04/12 22:03, Markus Neteler wrote:

On Mon, Apr 2, 2012 at 1:42 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:
...

4. Currently GRASS does not provide any image segmentation as such. i.smap
contains image segmentation in its process, but the user cannot get
segmented outputs.

Just a small comment - not even this one?
http://grass.osgeo.org/wiki/GRASS_AddOns#r.seg
r.seg performs image segmentation and discontinuity detection (based on
the Mumford-Shah variational model).

Ah, another GRASS module I didn't know about. ;)

Perhaps it could be extended?

I'll have a look at it.

Moritz

On 03/04/12 10:23, Moritz Lennert wrote:

On 02/04/12 22:03, Markus Neteler wrote:

On Mon, Apr 2, 2012 at 1:42 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:
...

4. Currently GRASS does not provide any image segmentation as such.
i.smap
contains image segmentation in its process, but the user cannot get
segmented outputs.

Just a small comment - not even this one?
http://grass.osgeo.org/wiki/GRASS_AddOns#r.seg
r.seg performs image segmentation and discontinuity detection (based on
the Mumford-Shah variational model).

Ah, another GRASS module I didn't know about. ;)

Perhaps it could be extended?

I'll have a look at it.

I did a rapid test with a landsat image:

# Sweep r.seg over a grid of lambda and alpha values
for lambda in 0.01 0.1 1 10 100; do
    for alpha in 0.01 0.1 1 10 100; do
        r.seg in_g=L72199024_02420020729_B80@PERMANENT \
            out_u="test_seg_${lambda}_${alpha}" \
            out_z="test_seg_dis_${lambda}_${alpha}" \
            lambda="$lambda" alpha="$alpha" --o
        # Apply a histogram-equalized grey color table to both outputs
        r.colors -e map="test_seg_${lambda}_${alpha}" color=grey
        r.colors -e map="test_seg_dis_${lambda}_${alpha}" color=grey
    done
done

The combination lambda=100 and alpha=100 seems to give me, visually, roughly what I was thinking of, i.e. a series of regions. However, each pixel still has a different value, so I would have to go through more steps before I could try to delineate actual polygons which I could then classify.

AFAIU (and this is still a bit limited) r.seg seems to be more oriented towards visualisation than towards further treatment of identified "segments".
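
A minimal sketch of what the "more steps" mentioned above might look like, going from the smoothed r.seg output to polygons. It assumes the standard modules r.mapcalc, r.clump and r.to.vect; the bin width of 10, the output map names, and the helpers `run` and `segments_to_polygons` are hypothetical. By default the script only prints the commands (a dry run), so it can be read without a GRASS session. Note that r.to.vect takes type=area in GRASS 7 but feature=area in GRASS 6.

```shell
#!/bin/sh
# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: segments_to_polygons SMOOTH_MAP BIN_WIDTH PREFIX  (hypothetical helper)
segments_to_polygons() {
    smooth=$1
    width=$2
    prefix=$3
    # 1. Bin the continuous values so that similar pixels share one value.
    run r.mapcalc "${prefix}_bins = int(${smooth} / ${width})"
    # 2. Give each contiguous patch of equal value its own ID.
    run r.clump input="${prefix}_bins" output="${prefix}_regions"
    # 3. Convert the patches to vector areas (GRASS 6: feature=area).
    run r.to.vect input="${prefix}_regions" output="${prefix}_polygons" type=area
}

segments_to_polygons test_seg_100_100 10 seg
```

Whether simple binning gives usable regions would depend on the image; it is only meant to show where a clumping and vectorization step would fit.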

Moritz

Thanks for the input! I added some more questions below.

Also, later today I will probably post to R-Sig-Geo (or some other
list?) to ask what tools people currently use, and what areas they
would find most helpful to be integrated into GRASS. That might help
focus where I should spend the time.

On Mon, Apr 2, 2012 at 6:42 AM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

On 30/03/12 23:56, Eric Momsen wrote:

I am reading more about image classification (from
http://grass.osgeo.org/wiki/GRASS_SoC_Ideas ):

4. Implement image segmentation algorithms and tools
5. Implement region-based classification
6. Implement hierarchical classification tools (e.g. being able to
create a large class "forest", with subclasses of different types of
forests)

...

Concerning the ideas:

4. Currently GRASS does not provide any image segmentation as such. i.smap
contains image segmentation in its process, but the user cannot get
segmented outputs. Many algorithms exist and it's an ongoing field of
research. FLOSS software that provides such algorithms includes Orfeo Toolbox
(OTB), SAGA, R, Sextante (?) and probably a whole series of others. I think
the implementation of a series of such algorithms could be a project on its
own.

Does it make more sense to implement the algorithms again, or pick the
most useful that are implemented in some other FLOSS and provide an
easy integration to access them from the GRASS front end? (I'm
thinking of v.krige which uses existing R packages to do the
processing work.)

Has Sextante or OTB been tied into GRASS in this manner?
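
If the wrapper route were taken, the GRASS-side script would have to cope with the external tool not being installed. The sketch below shows one possible check, purely as an illustration: `pick_backend` is a hypothetical helper and `external_segmenter` is a placeholder name, not a real program.

```shell
#!/bin/sh
# Print the first command from the argument list that exists on PATH.
# Returns non-zero if none of them is installed.
pick_backend() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# "external_segmenter" is a placeholder, not a real executable.
if backend=$(pick_backend external_segmenter); then
    echo "would delegate segmentation to: $backend"
else
    echo "no external backend found; a native implementation is needed" >&2
fi
```

The trade-off is the one raised in this thread: a wrapper saves reimplementation effort but makes the module depend on software outside GRASS.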

5. One of the main applications of image segmentation today is in
region-based classification of very high resolution imagery. As with current
resolutions individual objects are composed of many pixels, it is often more
efficient to first identify "objects" or homogeneous multi-pixel regions in
the image through segmentation and then to classify these regions. OTB
provides this I think, but I don't know if any other FLOSS software does. 5
depends on 4, so it is only possible if 4. is limited to the strict minimum
in terms of segmentation algorithms and then focus is put on 5. Maybe a bit
too ambitious.

If I can use the OTB implementation from GRASS, then I will include
this as a stretch goal if time remains at the end of the summer.
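
To make the segment-then-classify idea from point 5 concrete, here is a toy sketch that works on plain text rather than on rasters. The input format (one "region_id pixel_value" pair per line), the threshold, the labels "bright" and "dark", and the helper name `classify_regions` are all assumptions made for illustration.

```shell
#!/bin/sh
# Read "region_id pixel_value" lines, compute the mean value per region,
# and label each region by comparing its mean with a threshold.
classify_regions() {
    awk -v threshold="${1:-100}" '
        { sum[$1] += $2; n[$1]++ }
        END {
            for (id in sum) {
                mean = sum[id] / n[id]
                label = (mean >= threshold) ? "bright" : "dark"
                printf "%s %.1f %s\n", id, mean, label
            }
        }' | sort -n
}

# Two small regions with two pixels each.
printf '1 90\n1 110\n2 40\n2 60\n' | classify_regions 100
```

The point is that the classifier sees one feature vector per region instead of one per pixel; a real implementation would use several bands and a proper classifier.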

6. In the current classification algorithms in GRASS each designated class
of pixels is on the same hierarchical level as others. However, it is often
interesting to provide the option to classify an image first in a rough
manner into a series of base classes (built-up, vegetation, naked soils) and
then to refine classification within each of these classes (e.g. built-up
into high-density / low-density, vegetation into forest, grasslands, etc),
but to keep the hierarchy, i.e. to allow extracting an image (and a legend)
of the classification at each level.

This sounds interesting as well. It seems I should propose either 4/5
or 6 for the summer work.

Hope this helps and maybe motivates others to join-in as mentors.

Moritz

On 03/04/12 16:12, Eric Momsen wrote:

Thanks for the input! I added some more questions below.

Also, later today I will probably post to R-Sig-Geo (or some other
list?) to ask what tools people currently use, and what areas they
would find most helpful to be integrated into GRASS. That might help
focus where I should spend the time.

On Mon, Apr 2, 2012 at 6:42 AM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

On 30/03/12 23:56, Eric Momsen wrote:

I am reading more about image classification (from
http://grass.osgeo.org/wiki/GRASS_SoC_Ideas ):

4. Implement image segmentation algorithms and tools
5. Implement region-based classification
6. Implement hierarchical classification tools (e.g. being able to
create a large class "forest", with subclasses of different types of
forests)

...

Concerning the ideas:

4. Currently GRASS does not provide any image segmentation as such. i.smap
contains image segmentation in its process, but the user cannot get
segmented outputs. Many algorithms exist and it's an ongoing field of
research. FLOSS software that provides such algorithms includes Orfeo Toolbox
(OTB), SAGA, R, Sextante (?) and probably a whole series of others. I think
the implementation of a series of such algorithms could be a project on its
own.

Does it make more sense to implement the algorithms again, or pick the
most useful that are implemented in some other FLOSS and provide an
easy integration to access them from the GRASS front end? (I'm
thinking of v.krige which uses existing R packages to do the
processing work.)

Has Sextante or OTB been tied into GRASS in this manner?

I personally am a bit wary of increasing dependencies between packages, but at the same time, why re-invent the wheel? Integrating the OTB algorithms into GRASS would definitely be a great plus.

This said, several FOSS4G programs out there already play the role of integrators (e.g. QGIS, gvSIG) and I'm not sure that GRASS should try to go the same direction. Generally, I'd say: let GRASS do really well what it does, and not try to integrate everything.

So, the discussion boils down to: what do we think should be an integral part of GRASS? (The same question applies to the proposal concerning a PostGIS manager.)

5. One of the main applications of image segmentation today is in
region-based classification of very high resolution imagery. As with current
resolutions individual objects are composed of many pixels, it is often more
efficient to first identify "objects" or homogeneous multi-pixel regions in
the image through segmentation and then to classify these regions. OTB
provides this I think, but I don't know if any other FLOSS software does. 5
depends on 4, so it is only possible if 4. is limited to the strict minimum
in terms of segmentation algorithms and then focus is put on 5. Maybe a bit
too ambitious.

If I can use the OTB implementation from GRASS, then I will include
this as a stretch goal if time remains at the end of the summer.

As always, we should check how many added dependencies integrating OTB would mean for GRASS. Also, OTB is C++, which AFAICT often spells trouble. Then again, there is a Python wrapper which you could use.

Don't know what others think of all this.

Moritz

On Tue, Apr 3, 2012 at 9:57 AM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

I personally am a bit wary of increasing dependencies between packages, but
at the same time, why re-invent the wheel? Integrating the OTB algorithms
into GRASS would definitely be a great plus.

This said, several FOSS4G programs out there already play the role of
integrators (e.g. QGIS, gvSIG) and I'm not sure that GRASS should try to go
the same direction. Generally, I'd say: let GRASS do really well what it
does, and not try to integrate everything.

This does sound critical to me. Being "new" to GIS, I have been
struggling to figure out what the differences (strength/weakness/etc)
are between GRASS, QGIS, etc. I started with GRASS and R and haven't
had time to try out the other programs, so am only comparing based on
what each website describes. I haven't found any feature comparison
table. So I am interested to hear what everyone has to say about what
packages should (or should not) be integrated into GRASS vs. what
packages GRASS is integrated into...but I suspect it is a bigger issue
than can be answered in a GSoC project. :)