[Geoserver-devel] Adding format parameters to WFS GetFeature

Hi,
so I have this request to allow specification of the encoding for the
shapefiles we produce, and I would like to discuss the how with the other devs.

The problem boils down to the shapefile data store using the platform's
default encoding when writing shapefiles, and this may not be suitable
for the data at hand. For example, the platform may be
set up to use UTF-8, but if you need to encode any non-ASCII character,
that would create a shapefile that very few programs can read.

For Chinese you'll need to encode it in GB2312, for most Western European
languages in ISO-8859-15, and so on. Depending on the server, you may
have to serve data in various languages. And nothing prevents one
from setting up a server that needs to encode different shapefiles
with different charsets.
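
To make the problem concrete, here is a small Python sketch (an illustration only, not GeoServer code; the sample strings and the helper can_encode are made up for this example). It shows that the same attribute value produces different bytes under different charsets, and that a reader assuming the wrong charset gets garbage without any error being raised:

```python
# Illustration only: the sample values and can_encode() are hypothetical.
text = "Città"  # attribute value with one non-ASCII character

utf8_bytes = text.encode("utf-8")          # 'à' takes two bytes
latin9_bytes = text.encode("iso-8859-15")  # 'à' takes one byte

# A client that assumes ISO-8859-15 but is handed UTF-8 bytes
# decodes them without error, yet shows the wrong characters.
misread = utf8_bytes.decode("iso-8859-15")


def can_encode(value: str, charset: str) -> bool:
    """Return True if every character of value exists in charset."""
    try:
        value.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(len(utf8_bytes), len(latin9_bytes))
print(can_encode("中文", "gb2312"), can_encode("中文", "iso-8859-15"))
```

No single legacy charset covers every language, which is why the charset would have to be chosen per request rather than per server.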

So we'd need a parameter telling the encoder which charset to use.
I was thinking of porting format_options handling to WFS. Opinions?

The main issue I'm seeing is that the request object, GetFeatureType, is
derived from an EMF model, so we'd need to change the ecore and
regenerate the model (hoping the current EMF is backwards compatible;
the current model was generated before Eclipse 3.4).

Suggestions?
Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Hi Andrea,

porting format_options seems good, as long as it's advertised in the caps.
BTW, regenerating the code from the EMF model in 3.4 _seems to be_ backwards
compatible. At least I had no problem doing so for the wfs and ows
bindings in geotools trunk.
Gabriel

On Tuesday 18 November 2008 12:12:10 pm Andrea Aime wrote:

[...]

Gabriel Roldan wrote:

porting format_options seems good, as long as it's advertised in the caps.

Gah, more work. AFAIK we're not advertising a single extra param in the
caps, neither in WFS nor in WMS.

BTW, regenerating the code from the EMF model in 3.4 _seems to be_ backwards compatible. At least I had no problem doing so for the wfs and ows bindings in geotools trunk.

Cool, this time we got lucky (I asked because last time with WCS 1.1 we
had to redo the models, there were changes in the base classes).

Cheers
Andrea


Sounds like a good idea. One question though: do we plan to support this via POST requests? Adding extensions via GET is easy enough, but once you start doing it with XML you have to worry about schemas and whatnot... it can get messy.

About the model, it should be backwards compatible. Part of the EMF model specifies a "compatibility" version, so as long as it's set properly there should be no problems. I just did a quick test regenerating the GetFeatureType interface and impl class and had no problems.

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Justin Deoliveira wrote:

Sounds like a good idea. One question though: do we plan to support this via POST requests? Adding extensions via GET is easy enough, but once you start doing it with XML you have to worry about schemas and whatnot... it can get messy.

Oh yeah, I was not planning to add explicit changes to the XML POST.
If the user wants to use this feature, they can post to
ows?format_options=shapefileEncoding:ISO-8859-1

and the dispatcher will take care of that.
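
A rough sketch of what handling this parameter could look like, in Python for brevity. It assumes the same key:value;key:value syntax that WMS format_options uses. The names parse_format_options and resolve_shapefile_charset, and the ISO-8859-1 fallback, are hypothetical choices for this example, not existing GeoServer code:

```python
import codecs

DEFAULT_CHARSET = "ISO-8859-1"  # assumed fallback, for illustration only


def parse_format_options(raw: str) -> dict:
    """Split 'k1:v1;k2:v2' into a dict, ignoring empty or malformed items."""
    options = {}
    for item in raw.split(";"):
        key, sep, value = item.partition(":")
        if sep and key.strip():
            options[key.strip()] = value.strip()
    return options


def resolve_shapefile_charset(raw: str) -> str:
    """Return the requested charset name, rejecting unknown ones."""
    name = parse_format_options(raw).get("shapefileEncoding", DEFAULT_CHARSET)
    try:
        codecs.lookup(name)  # raises LookupError for unknown charsets
    except LookupError:
        raise ValueError(f"Unsupported shapefile encoding: {name}")
    return name


print(resolve_shapefile_charset("shapefileEncoding:ISO-8859-1"))
```

Rejecting an unknown charset up front, rather than silently falling back, is one possible design choice; it makes a typo in the request visible to the client.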

Mumble... another option that might work is to include the parameter
in the output format, but there would be no way to advertise it:
outputFormat=SHAPE-ZIP;encoding=ISO-8859-1

That would work cleanly with POST as well though.
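
A sketch of how such an output format string could be split into the base format and its parameters, again in Python. The helper parse_output_format is made up for this example, and treating parameter names as case-insensitive is an assumption:

```python
def parse_output_format(value: str):
    """Split 'SHAPE-ZIP;encoding=ISO-8859-1' into ('SHAPE-ZIP', {...})."""
    base, *params = [part.strip() for part in value.split(";")]
    options = {}
    for param in params:
        key, sep, val = param.partition("=")
        if sep and key:
            # Parameter names are lower-cased (an assumption of this sketch).
            options[key.strip().lower()] = val.strip()
    return base, options


fmt, opts = parse_output_format("SHAPE-ZIP;encoding=ISO-8859-1")
print(fmt, opts.get("encoding"))
```

Since the whole string travels as the outputFormat value, the same parsing would apply whether the request arrives as GET key-value pairs or as an XML POST.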

Cheers
Andrea

PS: the point of advertising the param is quite moot imho: what
software can actually use advertised params?
We're talking about an extension to a non-standard format; no
automatic software can do anything intelligent with it
anyway, afaik.


Andrea Aime wrote:

Oh yeah, I was not planning to add explicit changes to the XML POST.
If the user wants to use this feature, they can post to
ows?format_options=shapefileEncoding:ISO-8859-1

and the dispatcher will take care of that.

Oh yeah, go dispatcher go! :)

PS: the point of advertising the param is quite moot imho: what
software can actually use advertised params?
We're talking about an extension to a non-standard format; no
automatic software can do anything intelligent with it
anyway, afaik.

I tend to agree. I mean, there is barely any software that does WFS properly, let alone takes advantage of these options. Declaring stuff like this seems like splitting hairs until there is some sort of agreement among clients.

--
Justin Deoliveira

Tend to agree too, but then I wonder if we can at least advertise them as different outputFormat entries, like
SHAPE-ZIP
SHAPE-ZIP;encoding=UTF-16
SHAPE-ZIP;encoding=ISO-8859-1
SHAPE-ZIP;encoding=....
Not that we need an entry for each and every JVM-supported charset... we could just add the most used ones in the Spring applicationContext, if the shape output format bean supports a charset parameter. The downside is that with format_options you get all the available encodings, while this way you limit them to a handful...
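
As a rough illustration of this idea (Python, hypothetical names throughout, not the actual output format bean): given a short configured list of charset names, build the outputFormat entries to advertise, skipping any name the runtime does not know.

```python
import codecs


def advertised_formats(charsets, base="SHAPE-ZIP"):
    """Return the base format plus one entry per known charset."""
    entries = [base]
    for name in charsets:
        name = name.strip()
        try:
            codecs.lookup(name)  # raises LookupError for unknown charsets
        except LookupError:
            continue  # skip names the runtime cannot encode with
        entries.append(f"{base};encoding={name}")
    return entries


print(advertised_formats(["UTF-16", "ISO-8859-1", "no-such-charset"]))
```

The configured list here is a plain Python list; where it would really live (Spring context, web.xml, or elsewhere) is left open.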

just a thought though

Gabriel

Yeah, that would look nice and would be usable by most clients...
the issue is that the list of encodings should be user customizable,
there are lots out there. Not sure where we could put that... in a
parameter within web.xml maybe?

Cheers
Andrea


Hmmm... adding another thing to web.xml for something that probably will not be used seems like make-work to me. I mean, if clients start asking for it, by all means... but let's keep things simple first, adding complexity if we need to.

--
Justin Deoliveira

Well, I believe that the simplest thing is to actually advertise
just SHAPE-ZIP but recognize SHAPE-ZIP;encoding=.... in the requests.

Using format_options or allowing a comma-separated list of items
in web.xml for the supported encodings has more or less the same
cost in my mind.

I would go for the first, as this issue actually impacts a limited
number of users.

Cheers
Andrea
