[GeoNetwork-devel] Proposal to add mime type calculation and indexing?

Hi All,

I've created a proposal to add mime type calculation and indexing to GeoNetwork trunk (2.5 Unstable) at:

http://trac.osgeo.org/geonetwork/wiki/MimeTypeCalculationIndexing

A patch is attached to the proposal.

Some later work remains: the other mime type calculation code in the BinaryFile class in Jeeves should be replaced, and the metadata XSLTs (eg. metadata-iso19139.xsl) need some easy changes to use the calculated mime type (see the iso19139Brief template in metadata-iso19139.xsl for an example). This can be done as part of this proposal, but it isn't in the patch.

I'd like to get the proposal up for 2.5, but there really isn't a place to put the mime type in an ISO metadata record: it doesn't seem to fit anywhere in the onlineResource section. That means it needs to be recalculated, or retrieved from the index, when required, and it wouldn't be present in any record returned from a search (a bit like a GeoNetwork category). At present the patch just leaves it out (BlueNetMEST bends an attribute of the parent CI_OnlineResource a little too far). Any thoughts?
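
To illustrate what recalculating the type on demand could look like, here is a minimal Python sketch. It is not the code in the patch (GeoNetwork is Java and the patch may well work differently); it only assumes the type is guessed from the file extension in the online resource link. The function name `guess_mime_type` and the fallback value are made up for the example.

```python
import mimetypes
from urllib.parse import urlparse


def guess_mime_type(link, default="application/octet-stream"):
    """Guess a mime type from the path part of an online resource link.

    Hypothetical helper: uses only the file extension, so links without
    a recognisable extension (eg. service endpoints) get the default.
    """
    path = urlparse(link).path          # drop scheme, host and query string
    mime, _encoding = mimetypes.guess_type(path)
    return mime or default
```

For example, `guess_mime_type("http://example.org/data/report.pdf")` gives `application/pdf`, while a link such as `http://example.org/wfs?service=WFS` falls back to the default.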

Cheers,
Simon

Hello Simon,

When working on WFS indexing, I thought about having a background
indexing task which takes care of improving the index content.
You could register processes which analyse resources
related to the metadata record (eg. document mime type or content,
WFS, Shapefile content) and add the results to the index (for full
text search or specific parameters).
On every update, we could add the record to a list, and the task could
update the index on a regular basis so that it does not slow down the
actual indexing step on updates. We would then probably have to update
the index instead of doing the replace action we do now.
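
The idea above can be sketched roughly as follows. This is a toy Python model only, not GeoNetwork code: the class `BackgroundIndexer`, its methods, and the dict standing in for the index are all hypothetical names chosen for illustration.

```python
class BackgroundIndexer:
    """Toy model: fast indexing on update, slower enrichment later."""

    def __init__(self):
        self.analysers = []  # callables: (record_id, doc) -> dict of extra fields
        self.pending = []    # record ids waiting for enrichment, in arrival order
        self.index = {}      # stand-in for the real index: record id -> fields

    def register(self, analyser):
        """Register a process that analyses resources related to a record."""
        self.analysers.append(analyser)

    def index_record(self, record_id, fields):
        """Normal indexing step on update: replace the document, queue the id."""
        self.index[record_id] = dict(fields)
        if record_id not in self.pending:
            self.pending.append(record_id)

    def run_once(self):
        """What the periodic task would do: update (not replace) queued documents."""
        done = 0
        while self.pending:
            record_id = self.pending.pop(0)
            doc = self.index.get(record_id)
            if doc is None:  # record was removed in the meantime
                continue
            for analyser in self.analysers:
                doc.update(analyser(record_id, doc))  # merge extra fields
            done += 1
        return done
```

In this sketch `index_record` stays cheap, and the analysers (mime type detection, WFS or Shapefile inspection) only run when `run_once` is called by whatever scheduler is used.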

Instead of using the uuid (which does not look like the right place),
I would suggest 2 options:
* use the protocol instead - more or less like we do with the list
* use the gmx:MimeFileType element (maybe a better option in order to
be displayed in edit / view mode and to be indexed)

Then you could write something like
<protocol>
  <MimeFileType xmlns="http://www.isotc211.org/2005/gmx" type="text/html"/>
</protocol>

To turn this on in the editor, you could use the following in
schema-suggestion.xml :
  <field name="gmd:protocol">
    <suggest name="gco:CharacterString"/>
    <suggest name="gmx:MimeFileType"/>
  </field>
  
  <field name="gmx:MimeFileType">
    <suggest name="type"/>
  </field>

and remove the gmd:protocol suggestion from gmd:CI_OnlineResource.

Then you have the choice in the editor. You just need a template to
handle gmx:MimeFileType.
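
As a rough illustration of what such a template (or the indexing step) would need to pull out, here is a small Python sketch that reads the type attribute from gmx:MimeFileType under the protocol element. It assumes the protocol element is in the gmd namespace and the record fragment is well-formed; the function name `protocol_mime_types` is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used by ISO 19139 records.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gmx": "http://www.isotc211.org/2005/gmx",
}


def protocol_mime_types(xml_text):
    """Return the type attribute of every gmd:protocol/gmx:MimeFileType.

    Hypothetical helper: elements without a type attribute are skipped.
    """
    root = ET.fromstring(xml_text)
    found = root.findall(".//gmd:protocol/gmx:MimeFileType", NS)
    return [el.get("type") for el in found if el.get("type")]


SAMPLE = """
<gmd:CI_OnlineResource
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gmx="http://www.isotc211.org/2005/gmx">
  <gmd:linkage><gmd:URL>http://example.org/index.html</gmd:URL></gmd:linkage>
  <gmd:protocol><gmx:MimeFileType type="text/html"/></gmd:protocol>
</gmd:CI_OnlineResource>
"""
```

With the sample above, `protocol_mime_types(SAMPLE)` returns a list containing `text/html`.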

Ciao.

Francois

It looks to me like it makes more sense to put the mime file type in CI_OnlineResource/gmd:name/gmx:MimeFileType. Then the element value is the file name, and the mime type is the type for that online resource. Function should be 'download' in this case, and protocol would be 'http' or 'ftp', I would think. Too bad the ISO standard is so muddled on this...
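
To make that layout concrete, here is a hedged Python sketch that builds such an online resource element. It only shows the shape being suggested, not anything GeoNetwork does today; the function name `download_resource` is hypothetical, and the codeList attribute that the schema expects on the function code is left out for brevity.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
GMX = "http://www.isotc211.org/2005/gmx"
for prefix, uri in (("gmd", GMD), ("gco", GCO), ("gmx", GMX)):
    ET.register_namespace(prefix, uri)  # keep readable prefixes when serialising


def q(ns, tag):
    """Build a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"


def download_resource(url, file_name, mime_type, protocol="http"):
    """Build a CI_OnlineResource with the mime type carried on gmd:name."""
    res = ET.Element(q(GMD, "CI_OnlineResource"))
    linkage = ET.SubElement(res, q(GMD, "linkage"))
    ET.SubElement(linkage, q(GMD, "URL")).text = url
    proto = ET.SubElement(res, q(GMD, "protocol"))
    ET.SubElement(proto, q(GCO, "CharacterString")).text = protocol
    name = ET.SubElement(res, q(GMD, "name"))
    mime = ET.SubElement(name, q(GMX, "MimeFileType"), {"type": mime_type})
    mime.text = file_name  # element value is the file name
    function = ET.SubElement(res, q(GMD, "function"))
    # codeList attribute omitted in this sketch
    ET.SubElement(function, q(GMD, "CI_OnLineFunctionCode"),
                  {"codeListValue": "download"})
    return res
```

Calling `ET.tostring(download_resource("http://example.org/data/roads.zip", "roads.zip", "application/zip"), encoding="unicode")` serialises the element for inspection.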

steve

--
Stephen M. Richard
Section Chief, Geoinformatics
Arizona Geological Survey
416 W. Congress St., #100
Tucson, Arizona, 85701 USA

Phone:
Office: (520) 209-4127
Reception: (520) 770-3500
FAX: (520) 770-3505

email: steve.richard@anonymised.com

Thanks Steve and Francois!
