[Geoserver-devel] Improving shape-zip output format: customization and traceability

Hi,
I'm looking into making some changes to the shape-zip output format to improve
traceability and output customization, and I'd like to run them by the
community before going ahead and implementing them.

Traceability of the outputs is important in some circles: people want
the files to contain information about where the files come from, when
they were generated, and how.

To satisfy these requirements we want to make the zip file name
configurable so that it can contain a company name and the ISO date of
the request. The same goes for the single shapefiles in it: they would
have to contain the company name, the ISO date of the request and of
course the name of the layer.

In order to support traceability, instead, I want to add a new file that
contains the actual WFS request: as a URL in case of a GET request, as an
XML dump in case of a POST request.
This new file will be added to the set of files contained in the zip file.
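
As a rough illustration of the request dump described above, here is a minimal Python sketch (GeoServer itself is Java, so this is not its code). The function names `request_description` and `zip_with_request` and the sample file names are hypothetical; `request.txt` is the default name proposed in this message.

```python
import io
import zipfile


def request_description(method, url=None, body=None):
    """Return the text to store: the URL for a GET, the XML body for a POST."""
    method = method.upper()
    if method == "GET":
        return url
    if method == "POST":
        return body
    raise ValueError(f"unsupported method: {method}")


def zip_with_request(shapefile_parts, description, name="request.txt"):
    """Build an in-memory zip holding the shapefile parts plus the request file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for part_name, data in shapefile_parts.items():
            archive.writestr(part_name, data)
        # The request description travels with the data it describes.
        archive.writestr(name, description)
    return buffer.getvalue()
```

The point of the sketch is only that the description is one more entry in the same archive, whatever its final name turns out to be.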

To satisfy the above and keep a general approach I want to roll out a
handful of new FreeMarker templates:
- shapezip.ftl: will generate the zip file name given the request
  time and the list of layers requested
- shapefile.ftl: will generate the shapefile name given the request
  time and the layer name
- shapedesc.ftl: will generate the request description file name

People can then place those FreeMarker templates in the usual workspace
hierarchy (for shapezip.ftl we'll use the first layer as a reference
in case of a multilayer request).

If no templates are found the code will just use the current naming
strategies and name the request description file request.txt.
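
The fallback rule above can be sketched as follows, in Python and with `string.Template` standing in for FreeMarker (an assumption made to keep the example self-contained). The variable names `company`, `date` and `layer`, the helper names, and the compact timestamp format are all hypothetical; only the `request.txt` default comes from this proposal.

```python
from datetime import datetime, timezone
from string import Template

DEFAULT_REQUEST_NAME = "request.txt"


def iso_stamp(moment):
    """Format an aware datetime as a compact ISO 8601 UTC stamp (no colons, file-name safe)."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def render_name(template_text, default, **variables):
    """Render a naming template, or return the default when no template was found."""
    if template_text is None:
        return default
    # substitute() raises KeyError if the template uses an unknown variable.
    return Template(template_text.strip()).substitute(variables)
```

For instance, `render_name(None, DEFAULT_REQUEST_NAME)` keeps the default name, while a template such as `${company}-${date}-${layer}` yields a name built from the request time and layer.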

Opinions?

Cheers
Andrea

-----------------------------------------------------
Ing. Andrea Aime
Senior Software Engineer

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584962313
fax: +39 0584962313

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf

-----------------------------------------------------

I like the idea. It would be nice, however, imo, not to explode the number of templates. Could you pull it off with a single template? I guess the name of the zip file and the name of the shapefile won’t contain newlines, so you could use the first two lines for those, and the remaining lines for the description. Just a suggestion.

On Mon, Oct 25, 2010 at 5:00 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

[...]

Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Mon, Oct 25, 2010 at 4:21 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

I like the idea. It would be nice, however, imo, not to explode the number
of templates. Could you pull it off with a single template? I guess the name
of the zip file and the name of the shapefile won't contain newlines, so
you could use the first two lines for those, and the remaining lines for
the description. Just a suggestion.

Ah, a single template with three subsequent lines... may look like this:
<company>-<date>
<company>-<date>-<layer>
<company>-<date>

first line for the zip file, second for the single shapefile, third
for the request dump?

May actually work. I like the fact that you have just one file to deal
with. I'm a bit worried about users wondering how to just specify, for
example, the shapefile name (you'd have to leave the first line empty or
something like that).

Or maybe it could be a templated property file:
zip=<company>-<date>
shp=<company>-<date>-<layer>
req=<company>-<date>

That way we give a name to each line.
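
A minimal sketch of how such a templated property file might be read, in Python, assuming the `<name>` placeholder syntax shown above. The function name `parse_naming_properties` and the sample values are hypothetical.

```python
import re

# Matches placeholders such as <company> or <layer>.
PLACEHOLDER = re.compile(r"<(\w+)>")


def parse_naming_properties(text, **variables):
    """Expand each key=pattern line; keys missing from the file are simply absent."""
    names = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, pattern = line.partition("=")
        # Raises KeyError if the pattern uses a placeholder that was not supplied.
        names[key.strip()] = PLACEHOLDER.sub(
            lambda match: variables[match.group(1)], pattern.strip()
        )
    return names
```

Because every entry is keyed, a file containing only the `shp=` line would customize the shapefile name and leave the caller free to fall back to defaults for `zip` and `req`, which is the case the three-line variant handles awkwardly.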

Hmmm.... not sure. Having three files does not look so bad to me.
Having recently seen people struggle with ftl files, I am a bit wary of
giving them more meaning or structure than usual.

Cheers
Andrea

Sounds good to me - I’m quite into automatic creation of metadata.

One thing we could consider is making a shapefile.xml - the ESRI metadata format thing. I think we should still have the request.txt, but could also auto populate a metadata file with some of the same information.

Ultimately what I’d be interested in is a way to more effectively populate a shapefile.xml. In GeoNode we store a lot of metadata and give users a way to edit it all through the web, along with extracting from their user profile. We use GeoServer for all our shapefile generation, so it’d be nice to be able to get it to also produce a shapefile.xml.

But that’s obviously not immediately relevant to this, so +1 on these changes, though I agree it would be nicer to have it all in one file.

On Tue, Oct 26, 2010 at 4:24 AM, Chris Holmes <cholmes@anonymised.com> wrote:

Sounds good to me - I'm quite into automatic creation of metadata.

One thing we could consider is making a shapefile.xml - the ESRI metadata
format thing. I think we should still have the request.txt, but could also
auto populate a metadata file with some of the same information.

Ultimately what I'd be interested in is a way to more effectively populate a
shapefile.xml. In GeoNode we store a lot of metadata and give users a way
to edit it all through the web, along with extracting from their user
profile. We use GeoServer for all our shapefile generation, so it'd be nice
to be able to get it to also produce a shapefile.xml.

Yeah, I guess it would be possible to leverage each feature type's title,
abstract, keywords, srs and bbox to generate a little xml file. But we would
be stuck using metadata known to GeoServer, wouldn't we? Unless someone adds
extra params to point to external metadata sources.
I found this about the contents of the .shp.xml file:
http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=Metadata_standards_and_the_ArcGIS_metadata_format
which seems to imply there are many possible structures for this file.
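
To make the idea concrete, here is a small Python sketch that turns the fields listed above into an XML document. The element names (`metadata`, `title`, `bbox`, and so on) are invented for illustration and do not follow any of the ESRI metadata structures; the function name is hypothetical as well.

```python
import xml.etree.ElementTree as ET


def layer_metadata_xml(title, abstract, keywords, srs, bbox):
    """Serialize basic layer metadata to XML (made-up element names, not an ESRI schema)."""
    root = ET.Element("metadata")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "abstract").text = abstract
    keyword_list = ET.SubElement(root, "keywords")
    for keyword in keywords:
        ET.SubElement(keyword_list, "keyword").text = keyword
    ET.SubElement(root, "srs").text = srs
    minx, miny, maxx, maxy = bbox
    ET.SubElement(
        root, "bbox",
        minx=str(minx), miny=str(miny), maxx=str(maxx), maxy=str(maxy),
    )
    return ET.tostring(root, encoding="unicode")
```

A real implementation would have to pick one of the structures ESRI documents and map these fields onto it.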

But that's obviously not immediately relevant to this, so +1 on these
changes, though I agree it would be nicer to have it all in one file.

Cool

Cheers
Andrea

MassGIS does add extra params to point to external metadata sources. :)
Here's what we do: In the keywords section we have one metadata URL and
zero or more extract doc URLs. Our main viewer client software OLIVER is programmed to grab the content at these URLs and bundle it with the shapefile extracted from GeoServer.

For example, our layer massgis:GISDATA.SCHOOLS_PT has these keywords:

MassgisMetadataUrl=http://www.mass.gov/mgis/hospitals.htm
ExtractDoc=http://maps.massgis.state.ma.us/metadata/GISDATA.HOSPITALS_PT.xml
ExtractDoc=http://maps.massgis.state.ma.us/avls/hosp_er.avl
ExtractDoc=http://maps.massgis.state.ma.us/lyrs/Hospitals.lyr

The .htm is easy human readable metadata, the .xml is a shapefile XML metadata exported from SDE, the .avl is an old ESRI ArcView 3.x symbolization file (probably no one wants these anymore but I haven't taken the time to strip the keywords out), and the .lyr is an ESRI ArcMap symbolization file.
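
For readers who want to consume keywords laid out this way, a minimal Python sketch follows. It is not the OLIVER code; the function name `split_keywords` is hypothetical, and only the `MassgisMetadataUrl` and `ExtractDoc` keys come from the example above.

```python
def split_keywords(keywords):
    """Return (metadata_url, extract_doc_urls) from key=value layer keywords."""
    metadata_url = None
    extract_docs = []
    for keyword in keywords:
        key, separator, value = keyword.partition("=")
        if not separator:
            continue  # an ordinary keyword, not a key=value pair
        if key == "MassgisMetadataUrl":
            metadata_url = value
        elif key == "ExtractDoc":
            extract_docs.append(value)
    return metadata_url, extract_docs
```

A client would then fetch each returned URL and add the content to the bundle alongside the extracted shapefile.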

We wanted to do this because we dislike distributing data without any metadata. We could expand later to other content (Word or PDF docs, or videos) if we feel it is essential to "go with" the dataset.

I think GEOS-4199 is great, but I'd be concerned about a name conflict with the ESRI XML metadata file. I think we'd like to keep layer-name-in-sde.xml, because I think ESRI ArcMap automatically looks for that naming scheme. Maybe the GeoServer metadata version could be called something like GS.GISDATA.HOSPITALS_PT.xml? GS prefix for GeoServer? Just a thought.

On Tue, Oct 26, 2010 at 3:03 PM, Freeman, Aleda (EEA)
<Aleda.Freeman@anonymised.com> wrote:

I think GEOS-4199 is great, but I'd be concerned about a name conflict with the ESRI XML metadata file. I think we'd like to keep layer-name-in-sde.xml, because I think ESRI ArcMap automatically looks for that naming scheme. Maybe the GeoServer metadata version could be called something like GS.GISDATA.HOSPITALS_PT.xml? GS prefix for GeoServer? Just a thought.

GEOS-4199 is not going to add any metadata, we're just chatting about
possible extensions to the traceability concept.
If/when someone implements that, another jira will be opened, I guess. I
have no need for it at the moment, however anyone interested in this extra
functionality is welcome to join the party and add a separate patch :)

Btw, I thought the ESRI metadata naming convention was .shp.xml, but
you say it's plain .xml instead?

Cheers
Andrea

You are correct - it is .shp.xml. Our keywords are pointing to the SDE XML files and you remind me that I actually am supposed to repoint the links to these shapefile XML files: ftp://data.massgis.state.ma.us/pub/metadata_shp/