[Geoserver-devel] New (old?) community module: importer

Hi,
as you might remember, a few years ago we briefly had an “importer” extension in community land that allowed importing and configuring several shapefiles in a row.

The module was eventually pulled due to lack of maintainership, and continued to be developed as part of the OpenGeo Suite.

Some time ago, in a GeoSolutions project, we needed mass import functionality, so we took an early version of the importer (the one that was available in the public svn repos at the time I was still working for OpenGeo) and improved it in minor ways, while also adding a major new feature: the ability to import a folder of GeoTIFF files, re-tiling them and embedding overviews into them in the process.
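
To make the overview step a bit more concrete, here is a minimal sketch of how the sizes of the embedded overviews could be computed. It assumes that each overview halves the previous level and that halving stops once the image fits in a single tile. The class name, the method name and the 512 pixel tile size are illustrative assumptions, not taken from the importer code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists overview sizes for a raster, halving at each level. */
final class OverviewLevelsSketch {

    /** Returns {width, height} for each overview, until both sides fit in one tile. */
    static List<int[]> overviewSizes(int width, int height, int tileSize) {
        if (width < 1 || height < 1 || tileSize < 1) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        List<int[]> sizes = new ArrayList<>();
        int w = width;
        int h = height;
        while (w > tileSize || h > tileSize) {
            w = (w + 1) / 2; // round up so odd sizes do not lose a column
            h = (h + 1) / 2; // same for rows
            sizes.add(new int[] {w, h});
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Assumed example raster: 4096 x 2048 pixels, 512 pixel tiles.
        for (int[] size : overviewSizes(4096, 2048, 512)) {
            System.out.println(size[0] + " x " + size[1]);
        }
    }
}
```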

We believe this bit of functionality is better made available to the public, so we would like to re-import it into GeoServer as a community module.
I believe OpenGeo continued to improve the module in a separate code base, so, if there is any interest, we’re pretty open to joint collaboration on the new community module (that is of course true in general; as usual, anyone is welcome to pitch in and improve the code).

As a personal pet project of mine, time allowing and with no set plans, I would like to fold the importer back into the main GeoServer the way I originally devised it: as a natural step in the “new layer” workflow, in which one can choose several layers to be created instead of being limited to just one.
With the directory datastore that popped up in the meantime, and a possible future “directory of GeoTIFF” store, the thing would become rather natural.
But as I said, no set plans on this, and I guess that before being merged into core the module should prove itself through the usual community → extension graduation process.

Cheers
Andrea

==
GeoServer training in Milan, 6th & 7th June 2013! Visit http://geoserver.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Hey Andrea,

I am going to bring this up with folks internally to see what the general consensus is. But I can say that there is definite interest in pushing back the code we have and opening it up to wider collaboration.

Indeed, what we call the importer now bears very little resemblance to the code you originally wrote. But the end goal is indeed the same: batch import into GeoServer.

More to come soon.

-Justin

On Mon, Jun 3, 2013 at 7:13 AM, Andrea Aime <andrea.aime@anonymised.com…> wrote:


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Ok, we had a chat about this internally. As I said, there is certainly interest in trying to push back what we have, but I think some technical discussion should probably occur before we decide on anything. So I’ll start by describing what we have now since, as I mentioned, it is quite different from the version that existed at one time in the GeoServer community space.

It is basically broken down into 3 major pieces, which are currently all bundled into a single module. The first is the core importer itself. This is essentially the engine that processes data in batch and interacts with the catalog. This piece contains all the supported formats and has a lot of code to deal with special cases of translating between formats, which I think is a pretty major deviation from the original code. It allows for what we call “direct import” vs “ingest”. Direct import maps to the original code, where we are simply doing batch configuration. Ingest refers to doing batch processing and transforming to a different format, for example importing a bunch of shapefiles into a PostGIS database.
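
As a rough illustration of the “direct import” vs “ingest” distinction, here is a small sketch that only describes what a batch run would do in each mode, without touching any data. All names in it (the enum, the `plan` method, the file and store names) are hypothetical and are not the importer's real API.

```java
import java.util.List;

/** Hypothetical model of the two modes described above; not the importer's real API. */
final class ImportModeSketch {

    /** DIRECT: configure the data where it lives. INGEST: copy it into a target store first. */
    enum Mode { DIRECT, INGEST }

    /** Describes the steps a batch run would take for each file in the given mode. */
    static String plan(Mode mode, List<String> files, String targetStore) {
        StringBuilder sb = new StringBuilder();
        for (String file : files) {
            if (mode == Mode.DIRECT) {
                sb.append("configure layer for ").append(file).append('\n');
            } else {
                sb.append("copy ").append(file).append(" into ").append(targetStore)
                  .append(", then configure layer\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> files = List.of("roads.shp", "rivers.shp"); // assumed sample files
        System.out.print(plan(Mode.DIRECT, files, null));
        System.out.print(plan(Mode.INGEST, files, "postgis"));
    }
}
```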

It also supports more specialized workflows for things like mosaics, allowing users to import a bunch of imagery and have it grouped into a single mosaic, with some support for attaching timestamps to the individual granules in the mosaic. It also processes imports asynchronously, so there is some basic job/task management going on there.
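
One possible way to attach a timestamp to a granule is to read it from the file name. The sketch below assumes, purely for illustration, that the acquisition date appears in the name as eight digits (yyyyMMdd); the message above does not say where the importer actually takes timestamps from, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: derives a granule date from its file name. */
final class GranuleTimestampSketch {

    // Assumption: the date is embedded in the file name as eight digits, yyyyMMdd.
    private static final Pattern DATE = Pattern.compile("(\\d{8})");

    /** Returns the first valid yyyyMMdd date found in the name, or empty if there is none. */
    static Optional<LocalDate> timestampOf(String fileName) {
        Matcher m = DATE.matcher(fileName);
        if (!m.find()) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(m.group(1), DateTimeFormatter.BASIC_ISO_DATE));
        } catch (DateTimeParseException e) {
            return Optional.empty(); // eight digits that are not a real calendar date
        }
    }

    public static void main(String[] args) {
        System.out.println(timestampOf("sst_20130603.tif")); // assumed sample name
        System.out.println(timestampOf("no_date_here.tif"));
    }
}
```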

The next biggest piece is a REST API that sits on top of the core engine. The requirements for this API have been driven for the most part by the two projects that are the biggest users of the importer code: MapStory and GeoNode.

The last piece is a Wicket user interface that is similar to what was originally developed, mostly with changes to make it more user friendly.

Another major difference from the code as it was before is the ability to persist imports. The use case is GeoServer going down while processing an import, then coming back up and being able to continue processing the import without forcing the client to resubmit. Persistence is achieved with an internal BDB database. However, by default no persistence is configured, and the most recent 100 imports are simply stored in memory.
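
The default, non-persistent behaviour can be pictured as a bounded in-memory store. The following is a minimal sketch, assuming an insertion-ordered map that evicts its oldest entry once more than 100 imports are held; the class and its methods are hypothetical and do not reflect how the importer actually implements this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory store that keeps only the most recent imports. */
final class RecentImportsSketch {

    static final int MAX_IMPORTS = 100; // limit mentioned in the message above

    // Insertion-ordered map that drops its eldest entry once the limit is exceeded.
    private final Map<Long, String> imports = new LinkedHashMap<Long, String>() {
        @Override
        protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
            return size() > MAX_IMPORTS;
        }
    };
    private long nextId = 0;

    /** Stores a description of an import and returns the id assigned to it. */
    synchronized long add(String description) {
        long id = nextId++;
        imports.put(id, description);
        return id;
    }

    /** Returns the stored description, or null if the import was never stored or was evicted. */
    synchronized String get(long id) {
        return imports.get(id);
    }

    synchronized int size() {
        return imports.size();
    }

    public static void main(String[] args) {
        RecentImportsSketch store = new RecentImportsSketch();
        for (int i = 0; i < 150; i++) {
            store.add("import " + i);
        }
        System.out.println("imports kept: " + store.size());
    }
}
```

A persistent variant would replace the map with a database-backed store behind the same three methods, which is one reason to keep persistence in its own module.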

So in pushing back code, one thing I think we would want to do is modularize the code that we have, breaking it up into 4 modules at a minimum.

  1. core importer engine
  2. REST API
  3. Wicket UI
  4. BDB persistence

Which isn’t a huge amount of work, but it’s work.

So, all that said, I am interested to hear your thoughts on this design / architecture and how you envision this fitting in with the code you guys have.

-Justin

On Mon, Jun 3, 2013 at 1:25 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Did you consider using Spring Batch?

http://static.springsource.org/spring-batch/

I am currently working with this framework, implementing a couple of batch jobs (e.g. importing shapefiles into DB2 tables, exporting tables to shapefiles, creating generalized geometries, …)

At the moment, it seems to work well.

Just an idea.

Christian

2013/6/4 Justin Deoliveira <jdeolive@anonymised.com>


DI Christian Mueller MSc (GIS), MSc (IT-Security)
OSS Open Source Solutions GmbH


I fully agree, Spring Batch is a great tool.

Regards Giuseppe

2013/6/5 Christian Mueller <christian.mueller@anonymised.com>


Giuseppe La Scaleia
CNR - IMAA
geoSDI
Sviluppo Software

C.da S. Loja
85050 Tito Scalo - POTENZA (PZ)
Italia

phone: +39 0971427305
fax: +39 0971 427271
mob: +39 3666373220
mail: giuseppe.lascaleia@anonymised.com
skype: glascaleia

web: http://www.geosdi.org


Interesting. I did look at Spring Batch, but this was probably over two years ago, and I found the API documentation kind of lacking. I couldn’t find anything aside from undocumented code that showed how to manage jobs programmatically, only declaratively in Spring XML.

So for the time being I decided to stick with something simpler, built straight on the Java ExecutorService API.

However, more advanced task management is something we have discussed, so perhaps I’ll have to give Spring Batch another look.
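
For readers unfamiliar with the approach, basic job management on top of `ExecutorService` can look roughly like the sketch below: tasks are submitted to a thread pool and tracked by id through their `Future`. This is only an illustration under assumed names (`ImportJobsSketch`, `submit`, `await`) and an assumed pool size of two; it is not the importer's actual code.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical job manager: runs import tasks asynchronously and tracks them by id. */
final class ImportJobsSketch {

    // Daemon worker threads so an idle pool never keeps the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "import-worker");
        t.setDaemon(true);
        return t;
    });
    private final Map<Long, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong ids = new AtomicLong();

    /** Queues a task and returns the id under which it can be looked up later. */
    long submit(Callable<String> task) {
        long id = ids.incrementAndGet();
        jobs.put(id, pool.submit(task));
        return id;
    }

    /** True if the job exists and has finished, whether normally, by failure or by cancellation. */
    boolean isDone(long id) {
        Future<String> job = jobs.get(id);
        return job != null && job.isDone();
    }

    /** Waits for the job's result, wrapping any failure or timeout in an unchecked exception. */
    String await(long id, long timeoutSeconds) {
        Future<String> job = jobs.get(id);
        if (job == null) {
            throw new IllegalArgumentException("unknown job " + id);
        }
        try {
            return job.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("job " + id + " did not complete", e);
        }
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        ImportJobsSketch jobs = new ImportJobsSketch();
        long id = jobs.submit(() -> "imported roads.shp"); // stand-in for real import work
        System.out.println(jobs.await(id, 5));
        jobs.shutdown();
    }
}
```

Retries, restartability and persisted job state are the parts this approach leaves to the application, which is where a framework like Spring Batch would come in.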

On Wed, Jun 5, 2013 at 4:28 AM, Giuseppe La Scaleia <giuseppe.lascaleia@anonymised.com> wrote:



Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

2013/6/5 Christian Mueller <christian.mueller@anonymised.com>

Did you consider using Spring Batch

http://static.springsource.org/spring-batch/

I am currently working with this framework implementing a couple of batch jobs (e.g importing shape files into DB2 tables, exporting tables to shape files, create generalized geometries,…)

At the moment, it seems to work well.

Just an idea.

Christian


How ServiceNow helps IT people transform IT departments:

  1. A cloud service to automate IT design, transition and operations
  2. Dashboards that offer high-level views of enterprise services
  3. A single system of record for all IT processes
    http://p.sf.net/sfu/servicenow-d2d-j

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Giuseppe La Scaleia
CNR - IMAA
geoSDI
Sviluppo Software

C.da S. Loja
85050 Tito Scalo - POTENZA (PZ)
Italia

phone: +39 0971427305
fax: +39 0971 427271
mob: +39 3666373220
mail: giuseppe.lascaleia@anonymised.com
skype: glascaleia

web: http://www.geosdi.org

2013/6/4 Justin Deoliveira <jdeolive@anonymised.com>

Ok, we had a chat about this internally. As I said there certainly interest in trying to push back what we have, but I think before we decide on anything some technical discussion should probably occur first. So I’ll start by describing what we have now since as i mentioned it is quite different from the version that existed at one time in the geoserver community space.

It is basically broken down into 3 major pieces, which are currently all bundled into a single module. The first is the core importer itself. This is essentially the engine that processes data in batch, and interacts with the catalog. This piece contains all the supported formats, etc… and has a lot of code to deal with special cases of translating between formats, etc… Which I think is a pretty major deviation from the original code. It allows for what we call “direct import” vs “ingest”. Direct import maps to the original code where we simply are doing batch configuration. Ingest refers to doing batch processing and transforming to a different format. Ie importing a bunch of shapefiles into a postgis database.

As well it supports more specialized workflows for things like mosaics, allowing for users to import a bunch of imagery and have it be grouped into a single mosaic. And some support for attaching timestamps to the individual granules in the mosaic. It also processes imports asynchronously so there is some basic job/task management going on there.

The next biggest piece is a rest api that sits on top of the core engine. The requirements for this api have been driven for the most part by the two projects that are the biggest users of the importer code. Which are mapstory, and geonode.

The last piece is a wicket user interface that is similar to what was originally developed, mostly with changes to make it more user friendly.

Another major difference from the code as it was before is that there is the ability to persist imports. The use case being to be able to handle the case of geoserver processing an import and then going down, but coming back up and be able to continue processing of the import without forcing the client to resubmit. Persistence is achieved with an internal bdb database. However by default no persistence is configured, with the most recent 100 imports simply being stored in memory.

So in pushing back code one thing I think we would want to do is modularize the code that we have, breaking it up into 4 modules at a minimum.

  1. core importer engine
  2. rest api
  3. wicket ui
  4. bdb persistence

Which isn’t a huge amount of work, but it’s work.

So, all that said I am interested to here your thoughts on this design / architecture and how you envision seeing this fit in with the code you guys have.

-Justin


How ServiceNow helps IT people transform IT departments:

  1. A cloud service to automate IT design, transition and operations
  2. Dashboards that offer high-level views of enterprise services
  3. A single system of record for all IT processes
    http://p.sf.net/sfu/servicenow-d2d-j

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

DI Christian Mueller MSc (GIS), MSc (IT-Security)
OSS Open Source Solutions GmbH

On Mon, Jun 3, 2013 at 1:25 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

Hey Andrea,

I am going to bring this up with folks internally to see what the general consensus is. But I can say that there is definite interest in pushing back the code we have and opening it up to wider collaboration.

Indeed what we call the importer now has very little resemblance to the code you originally wrote. But the end goal of it is indeed the same, batch import into GeoServer.

More to come soon.

-Justin


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

-------------------------------------------------------

On Tue, Jun 4, 2013 at 6:55 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

It is basically broken down into 3 major pieces, which are currently all
bundled into a single module. The first is the core importer itself. This
is essentially the engine that processes data in batch, and interacts with
the catalog. This piece contains all the supported formats, etc... and has
a lot of code to deal with special cases of translating between formats,
etc... Which I think is a pretty major deviation from the original code. It
allows for what we call "direct import" vs "ingest". Direct import maps to
the original code where we simply are doing batch configuration. Ingest
refers to doing batch processing and transforming to a different format. Ie
importing a bunch of shapefiles into a postgis database.

Right. I guess you have quite a bit of "ugly" code in there to handle the
specific issue of each and every target datastore (e.g., Oracle madness
with short and uppercase identifiers, and so on). My original intention was
to improve DataStore.createSchema() at the GeoTools level so that it
returns a mapping between what was requested, and what was actually
created... but of course it's quite a bit of work and requires a proposal.
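
As an illustration of the kind of mapping described here, the sketch below computes, for a list of requested attribute names, the names a restrictive target store might actually create. It is not GeoTools or importer code: the class SchemaNameMapper, the upper-casing rule, the length limit parameter, and the collision suffix are all assumptions chosen to mimic the Oracle-style constraints mentioned above.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper mapping requested attribute names to the names a target store would create. */
class SchemaNameMapper {

    /** Upper-cases and truncates each name, keeping the results unique. Assumes distinct input names. */
    static Map<String, String> mapNames(List<String> requested, int maxLength) {
        Map<String, String> mapping = new LinkedHashMap<>();
        Set<String> used = new HashSet<>();
        for (String name : requested) {
            String candidate = name.toUpperCase(Locale.ROOT);
            if (candidate.length() > maxLength) {
                candidate = candidate.substring(0, maxLength);
            }
            // If truncation produced a name already taken, replace the tail with a counter.
            String unique = candidate;
            int counter = 1;
            while (!used.add(unique)) {
                String suffix = String.valueOf(counter++);
                int keep = Math.max(0, Math.min(candidate.length(), maxLength - suffix.length()));
                unique = candidate.substring(0, keep) + suffix;
            }
            mapping.put(name, unique);
        }
        return mapping;
    }
}
```

The caller gets back both sides of the translation, so later steps (styling, filters, attribute remapping) can refer to the created names rather than guessing them.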

As well it supports more specialized workflows for things like mosaics,
allowing for users to import a bunch of imagery and have it be grouped into
a single mosaic. And some support for attaching timestamps to the
individual granules in the mosaic. It also processes imports asynchronously
so there is some basic job/task management going on there.

Nice. How hard do you believe it would be to add retiling/overview
embedding to that? The code we're using is the GeoTools one as far as I
remember (Simone, please correct me if I'm wrong).
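
For readers unfamiliar with the asynchronous processing mentioned above, a basic job/task manager could look roughly like the following sketch. It is an assumption about the general shape only, not the Suite importer's implementation: ImportJobQueue, its State enum, and the fixed pool of two threads are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical job manager that runs import tasks in the background and tracks their state. */
class ImportJobQueue {
    enum State { PENDING, RUNNING, COMPLETE, FAILED }

    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<Long, State> states = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Queues a task and returns immediately with an id the client can poll. */
    long submit(Runnable importTask) {
        final long id = nextId.getAndIncrement();
        states.put(id, State.PENDING);
        pool.execute(() -> {
            states.put(id, State.RUNNING);
            try {
                importTask.run();
                states.put(id, State.COMPLETE);
            } catch (RuntimeException e) {
                // A failed import is recorded rather than propagated to the worker thread.
                states.put(id, State.FAILED);
            }
        });
        return id;
    }

    State stateOf(long id) {
        return states.get(id);
    }

    /** Stops accepting new tasks and waits for queued ones; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A retiling or overview-embedding step would then be one more task submitted to the queue, with the client polling the job state until it reaches COMPLETE or FAILED.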

The next biggest piece is a rest api that sits on top of the core engine.
The requirements for this api have been driven for the most part by the two
projects that are the biggest users of the importer code. Which are
mapstory, and geonode.

A REST API is surely interesting too.

The last piece is a wicket user interface that is similar to what was
originally developed, mostly with changes to make it more user friendly.

Cool

Another major difference from the code as it was before is that there is
the ability to persist imports. The use case being to be able to handle the
case of geoserver processing an import and then going down, but coming back
up and be able to continue processing of the import without forcing the
client to resubmit. Persistence is achieved with an internal bdb database.
However by default no persistence is configured, with the most recent 100
imports simply being stored in memory.

Eh, with all the effort gone into getting rid of BDB it would be bad to get
it back but... as long as it's optional, no issues.
Question, how do you handle clusters with such persistence?

So in pushing back code one thing I think we would want to do is
modularize the code that we have, breaking it up into 4 modules at a
minimum.

1. core importer engine
2. rest api
3. wicket ui
4. bdb persistence

Which isn't a huge amount of work, but it's work.

So, all that said, I am interested to hear your thoughts on this design /
architecture and how you envision it fitting in with the code you guys
have.

Well, it does not look like it fits. I'm afraid we'll have to rewrite most
of our additions to make them fit with yours. However, I believe it's
better to start with the most complete version, functionality-wise, and
try to merge the other bit by bit over time.
How about having a look at the code?
The version we wanted to contribute is here:
https://github.com/geosolutions-it/geoserver-enterprise/tree/2.2.x/src/community/importer

Cheers
Andrea

-------------------------------------------------------

Hi Justin,
how are things going importer wise?
If OpenGeo thinks it would be hard to contribute the Suite importer that's
a pity, but not the end
of the world, we're still on board with contributing our less evolved
version to GeoServer if needs be.

Cheers
Andrea

-------------------------------------------------------

Hey Andrea,

Sorry this has been slow on our side talking through everything. But I am happy to report that yes, we have decided we want to contribute the importer code we have. However, there are some wrinkles. Read on, but long story short, it is going to take us a bit of time to get mapstory caught up to the latest developments we have made to the importer, and before that happens we are not comfortable contributing what we have, at the risk of having to maintain two code bases. We think this will be on the order of a couple of weeks to a month.

Recently we did a pretty big refactor of the importer code base to try and simplify the model and the REST API. We have succeeded and the code is currently on a branch that has yet to be merged into master and consumed by downstream projects like mapstory. Ian has been working on that but he has many fish to fry at the moment :)

We would also like to (at least for the first while) be designated as the primary maintainers of the module. Rationale being that as we have a number of downstream projects depending on this code it helps us manage risk. Secondly we want to be conservative when it comes to backporting functionality to the stable series. Again to manage risk.

So, I leave it up to you. If you can work with all this then we are happy to contribute the code. However if you want to proceed with the code you have then we are happy to step aside and let you do that.

-Justin

-------------------------------------------------------

Sure, waiting up to a month is not a problem, and neither is having you
designated as the main "maintainer", whatever that means for a
community module... do you plan to push it to extension status in a short
time? That's what we wanted to do with our version of the importer, as
surely there will be interest from the community.

Maybe when we try to merge the extras we added in our version we can be
marked as module developers as well.

Cheers
Andrea

-------------------------------------------------------

Hi Justin,
how are things with the importer module? It would be nice to have it for
2.4.0

Cheers
Andrea

-------------------------------------------------------

Sorry for lagging on this Andrea. But I finally have the go ahead to push the module in so I will get this done by end of week (famous last words :) ). It will take me a bit of time to reorganize the module as we discussed and rename the packages, etc… but that is mostly mechanical stuff.

-------------------------------------------------------

That's great, thanks for the good news Justin!

Cheers
Andrea

-------------------------------------------------------

And here it is.

  https://github.com/geoserver/geoserver/pull/285

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.