[GeoNetwork-devel] SMET (Spatial Metadata Extraction Tool)

Hi Byron,
Great to read about this! This is an area where a lot of work still has to be done. Could you give a couple of lines of description of what you are working on?


> My own efforts are currently focused on developing an automated Spatial Metadata Extraction Tool I call SMET that will integrate with GeoNetwork.

Ciao,
Jeroen

Hi Jeroen,

SMET is a tool I first started developing last summer. It started as a simple
interactive GUI tool to extract the bounding box in lat/long from a
geodataset that I could cut and paste into a GeoNetwork metadata record. It
has since grown to include other extractable geographic metadata elements
such as projection information and pixel size (for images). I also use it
to extract UNC paths to provide a data location, and create and modify dates.
The idea was to provide a tool that would automatically extract whatever
metadata it could from a geodataset and thereby speed up the capture of and
increase the accuracy of our metadata records.
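
For illustration only (this is not SMET's code): a minimal sketch of how a bounding box and pixel size can be derived for a raster, assuming the dataset exposes a GDAL-style six-coefficient affine geotransform plus its width and height in pixels. The function name `extent_and_pixel_size` is hypothetical. The box comes out in the dataset's own coordinate system, so it is only lat/long if the data is already geographic; reprojection is not shown.

```python
def extent_and_pixel_size(geotransform, width, height):
    """Return ((min_x, min_y, max_x, max_y), (pixel_width, pixel_height)).

    geotransform follows the GDAL convention:
        x = gt[0] + col * gt[1] + row * gt[2]
        y = gt[3] + col * gt[4] + row * gt[5]
    """
    x0, dx, rx, y0, ry, dy = geotransform

    def to_geo(col, row):
        # Map a pixel/line position to georeferenced coordinates.
        return (x0 + col * dx + row * rx, y0 + col * ry + row * dy)

    # Use all four corners so rotated rasters still get a correct envelope.
    corners = [to_geo(0, 0), to_geo(width, 0), to_geo(0, height), to_geo(width, height)]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]

    # abs(dx), abs(dy) is the pixel size only for north-up (unrotated) rasters.
    return (min(xs), min(ys), max(xs), max(ys)), (abs(dx), abs(dy))


# A global half-degree grid: 720 x 360 pixels, origin at the top-left corner.
bbox, pixel = extent_and_pixel_size((-180.0, 0.5, 0.0, 90.0, 0.0, -0.5), 720, 360)
```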

This tool was built in Python using ESRI GeoProcessing tools. I am now in
the process of replacing the ESRI calls with GDAL/OGR. I have secured some
outside contracting to help me clean up and restructure my code and to
migrate to the GDAL/OGR tools. This contract explicitly states that the
resulting code will be GPL. This contract concludes at the end of June, by
which time the code should be cleaned up enough to be useful to others ;-).
I have not yet determined where or how to post it.

Currently, SMET spits out the metadata to a screen from which I copy and
paste into a metadata record. The next development step will be to spit
this out as XML. This will then be used (how is yet to be
determined) to update existing or create new metadata records in GeoNetwork.
Further in the future, I have thought that SMET could be run as a
service/daemon and become something like the crawler tool that Francois
proposed last year.
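
As a rough sketch of the XML step described above, assuming the extracted values are held in a plain dictionary: the element names (`dataset`, `west`, and so on) and the function name `metadata_to_xml` are hypothetical placeholders, not an ISO 19115/19139 layout and not SMET's actual output.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(record):
    """Serialise a flat dict of extracted metadata into a small XML string."""
    root = ET.Element("dataset")  # hypothetical root element name
    for key, value in record.items():
        # One child element per extracted value; keys must be valid XML names.
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


record = {
    "west": -180.0,
    "south": -90.0,
    "east": 180.0,
    "north": 90.0,
    "projection": "EPSG:4326",
}
xml_text = metadata_to_xml(record)
```

A real record would map these values into the template of whichever metadata standard the catalogue expects.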

The main concern that led to the creation of SMET was to simplify and
decrease errors in metadata collection. I also hope that it can help ease
maintenance and update tasks. It could also be developed to scan directories
for data discovery. The biggest unknown I have right now is the
best way to integrate this with GeoNetwork.

Cheers,
Byron

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at
http://sourceforge.net/projects/geonetwork

--
View this message in context: http://www.nabble.com/SMET-(Spatial-Metadata-Extraction-Tool)-tp17338058p17354749.html
Sent from the geonetwork-devel mailing list archive at Nabble.com.

Hi Byron,
I took the liberty of including some other people who have done, or are doing, work on just the same thing. Both Martin and Tyler developed a Python-based script that extracts metadata using GDAL/OGR to generate records. Amit just joined us at FAO and will look into the scripts over the coming weeks to come up with a working version.

I have previously written some bullet points on what I think such a tool should do to work nicely with GeoNetwork, on the GeoNetwork opensource developer website (http://trac.osgeo.org/geonetwork) under RnD. In short, I would hope to see a tool that generates metadata records in the background or from the command line. It would browse a directory tree, possibly reading default values from small text files that sit in the tree and are applied through inheritance. It would then generate the metadata XML files, possibly create thumbnails, and finally generate an MEF file. That file could be uploaded to GN, or maybe harvested if we add such a function to GN.
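
A minimal sketch of the directory-tree idea, under stated assumptions: the defaults live in a file called `defaults.txt` holding `key=value` lines (both the file name and the format are hypothetical), and a value set in a subdirectory overrides the one inherited from its parent. Metadata extraction, thumbnails and MEF generation are left out.

```python
import os

DEFAULTS_FILE = "defaults.txt"  # hypothetical name for the per-directory defaults


def read_defaults(path):
    """Parse simple key=value lines, skipping blanks and # comments."""
    defaults = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                defaults[key.strip()] = value.strip()
    return defaults


def walk_with_defaults(root):
    """Yield (file_path, effective_defaults) for every file below root."""
    root = os.path.normpath(root)
    effective = {}  # directory path -> defaults in force for that directory
    # os.walk is top-down by default, so a parent is always seen before its children.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        defaults = dict(effective.get(os.path.dirname(dirpath), {}))
        if DEFAULTS_FILE in filenames:
            # Local values override whatever was inherited from above.
            defaults.update(read_defaults(os.path.join(dirpath, DEFAULTS_FILE)))
        effective[dirpath] = defaults
        for name in sorted(filenames):
            if name != DEFAULTS_FILE:
                yield os.path.join(dirpath, name), dict(defaults)
```

Each yielded pair could then be handed to whatever extracts the spatial metadata, with the inherited defaults filling in fields such as owner or licence that cannot be read from the data itself.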

This would really benefit from an SVN repository and discussion, because too many people have now worked on this separately using Python. Maybe it could be one of the add-ons to GN, with a separate section in the GN SVN?

Ciao,
Jeroen

All: Just so that you know, our gvSIG-metadata project is scheduled to
continue to a 3rd phase beginning (I hope) in September.

One of the tasks on our to-do list is to work with a diverse team of
contributors to try to define the MEF 2.0 format/functionality. Jeroen and I
spoke about this at a November '07 event here in Valencia.

By the way, the next gvSIG conference will coincide *directly* with the OGC
TC meeting (1-5 December 2008). We are hosting both at the Valencia Congress
Centre. Should be fun!

Cheers,
Mike

Dear Byron, Jeroen, all,
On Tue, May 20, 2008 at 07:27:35PM -0700, ByronC wrote:

> This tool was built in python using ESRI GeoProcessing tools. I am now in
> the process of replacing the ESRI calls with gdal/ogr.

Congratulations! :-)

> Currently, SMET spits out the metadata to a screen from which I copy and
> paste into a metadata record. The next development step will be to spit
> this out into xml . This will then be used to (somehow yet to be
> determined) update existing or create new metadata records in GeoNetwork.
> Further in the future I have thought that SMET would be able to be run as a
> service/daemon and become something like the crawler tool that Francois
> proposed last year.

I've written something like this in the past, but it never really got
past the prototype stage. I got bogged down in edge cases, such as
what to do when running across data from tilesets before finding the
index to the tileset, and ran out of time.

What did survive from that project was a set of libraries to help spit
out metadata records into XML templates for "common" formats: the ISO 19115
representation used by GeoNetwork, ISO 19139, FGDC, RDF. I'm setting up a
Google Code project and can point you at it, if you might find it useful.
I'd definitely rather use your work than rebuild the same thing :-)

> The main concern that led to the creation of SMET was to simplify and
> decrease errors in metadata collection. I also hope that it can help ease
> maintenance and update task. It could also be developed to scan directories
> for data discovery. The biggest unknown I have right now is what is the
> best way to integrate this with GeoNetwork?

Last year when I talked about this with Jeroen, he reckoned the best
way was to use the MEF format and the mef.import / mef.export methods
on GeoNetwork: essentially POSTing a .zip file to an HTTP address.
The zip contains XML data in one of the formats listed above, optional
thumbnails, and some meta-metadata in a format that GN defines.
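
To make the packaging step concrete, here is a small sketch that assembles such a zip in memory. The entry names (`metadata.xml`, `info.xml`, a `public/` folder for thumbnails) are my assumption about the MEF layout and should be checked against the GeoNetwork MEF documentation; the function name `build_mef` is hypothetical, and the `info.xml` content passed in below is only a placeholder, not the real GN-defined structure. The HTTP POST to mef.import is deliberately not shown.

```python
import io
import zipfile


def build_mef(metadata_xml, info_xml, thumbnails=None):
    """Return the bytes of a MEF-like zip archive (layout assumed, see above)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("metadata.xml", metadata_xml)  # the metadata record itself
        archive.writestr("info.xml", info_xml)          # GN-specific meta-metadata
        for name, data in (thumbnails or {}).items():
            archive.writestr("public/" + name, data)    # optional thumbnails
    return buffer.getvalue()


# Placeholder documents; real ones would come from the extraction step.
mef_bytes = build_mef("<metadata/>", "<info/>", {"thumb.png": b"\x89PNG"})
```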

Again, I made a start on a simple MEF Python package but stalled
because I didn't have a really clear use case for it at the time.
Another reason was that there's a lot of very GN-specific meta-metadata in the
info.xml included in each MEF, and at the time there seemed to be
potential for MEF to become a more widely used format for data
interchange. For example, the gvSIG Java desktop client was planning to use
MEF to send GN combinations of auto-extracted and hand-annotated
metadata, and they wanted to revise or formalise the "spec" a bit.
Michael Gould was organising this and it has been a long while since
we talked about it! It would be good to know what has changed.

jo
--

Hi all,

On Wed, 2008-05-21 at 10:06 +0200, michael gould wrote:

> Among other tasks on our to-do list, is to work with a diverse team of
> contributors to try to define MEF 2.0 format/functionality. Jeroen and I
> spoke about this at a November'07 event here in Valencia.

We're also going to work on that for GeoSource in France. It would be
nice to have a web page on the Trac to discuss a MEF 2.0 format. One
requirement we have, for example, is to allow more than one metadata
record in a MEF file. Have you already defined any spec?

> Further in the future I have thought that SMET would be able to be
> run as a
> service/daemon and become something like the crawler tool that
> Francois
> proposed last year.

For the Talend Spatial project we also added components to
create metadata in XML format, create a MEF file, and publish to GeoNetwork:
http://www.talendforge.org/wiki/doku.php?id=sdi:geocomponentslist#metadata_convert_create_publish_metadata

A use case for a PDF map metadata archive published in GeoNetwork:
http://www.talendforge.org/wiki/doku.php?id=sdi:examples#generate_iso19139_metadata_for_a_pdf_map_archive_and_publish_to_the_catalogue

It is nice to have more and more tools able to publish to and interact
with the catalogue. That makes metadata more "exciting" than just filling
in metadata in an editor :-)

Ciao. Francois

Our idea is that it be a consensus format...we do not plan to do it
ourselves.

Now, we need to be careful, because consensus formats/specs often grow and
grow... until they become useless :-)

Using a wiki or similar is a good idea.

I would like to hear Jeroen's opinion on how best to move ahead.

Mike
www.geoinfo.uji.es

On 21-May-08, at 6:46 AM, Francois-Xavier Prunayre wrote:

> On Wed, 2008-05-21 at 10:06 +0200, michael gould wrote:
>
> > Among other tasks on our to-do list, is to work with a diverse team of
> > contributors to try to define MEF 2.0 format/functionality. Jeroen and I
> > spoke about this at a November'07 event here in Valencia.
>
> We're also going to work on that for GeoSource in France. It could be
> nice to have a web page on the trac to discuss a MEF2.0 format ? One
> requirement we have for example is to have more than one metadata in a
> MEF file ... Do you already define any spec?

I definitely am interested in implementing the MEF spec. I also wouldn't mind seeing a collection of tools being managed in the GN SVN somehow, so I don't have to maintain my Python crawler:

http://code.google.com/p/spatialguru/source/browse/trunk/nme/cat/

Output example:
http://code.google.com/p/spatialguru/source/browse/trunk/nme/cat/sample_output.xml

I also wonder if perhaps we need an extension to existing metadata specifications to handle some of the lower-level attributes about datasets - items beyond the bounding box and ownership info. For example field types, feature counts, file system info, etc. I've been putting these into an XML output already, but not following any particular spec since I haven't found one. These are items that I see would be very helpful for programmatic access to datasets.
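
A rough sketch of what such a lower-level record might look like, for discussion only: it is not the output of the crawler linked above and follows no existing spec. The element names and the function name `describe_dataset` are hypothetical, and the field types and feature count are passed in by the caller (in practice they would come from OGR or a similar reader), so that only the file system details are gathered here.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def describe_dataset(path, field_types, feature_count):
    """Build an XML element with file system info plus caller-supplied attributes."""
    stat = os.stat(path)
    root = ET.Element("dataset", name=os.path.basename(path))

    # File system details read directly from the operating system.
    filesystem = ET.SubElement(root, "filesystem")
    ET.SubElement(filesystem, "size_bytes").text = str(stat.st_size)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    ET.SubElement(filesystem, "modified").text = modified.isoformat()

    # Attribute-level details supplied by whatever read the dataset.
    ET.SubElement(root, "feature_count").text = str(feature_count)
    fields = ET.SubElement(root, "fields")
    for field_name, type_name in field_types.items():
        ET.SubElement(fields, "field", name=field_name, type=type_name)
    return root
```

Calling `ET.tostring(describe_dataset(...), encoding="unicode")` gives the serialised form.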

Tyler

Tyler and all,

Our gvSIG metadata manager (v1, to appear some day) does indeed contemplate
both internal metadata and the more general external metadata i.e. 19115 for
discovery.

And we agree that MEF 2.0 should allow storage of both the general external
MD and also specific internal MD. The internal packages might end up
illegible for other clients/users, but that's ok, and something for the
market to figure out.

Another angle we'll investigate is how MEF and KML might interact, as the
latter is (geo)indexable in Google.

Mike

Hi All,

I am very new to GeoNetwork. I started working in GI science last
year, when I did my Master's project on OGC web services and
service-oriented science/architecture. At present I work at HP on storage
area networks, but I am still actively engaged in research with my
professor in the same area as my Master's project.

I have little understanding of the details of GeoNetwork, since I have
not yet explored it completely, nor have I gone through the code in
much detail (this is because I am busy for the next month).

What interests me most about this particular discussion is the
automatic extraction of metadata. In one of my journal papers, which
essentially discusses a service-oriented architecture for an SDI, I
advocated automatic metadata generation. I named the component in
our prototype MetedataGenerator; its basic functionality was to
extract metadata and put it into an XML file according to the XML
schema. The metadata was not structured according to ISO 19115, because
I did not have access to that standard when I started working on it.

The fact is that, India being a developing country, we sometimes have
funding problems and cannot always buy these standards whenever
we want. Anyway, that does not stop me from coming up
with new ideas and trying to implement them.

I am very excited about all the discussion happening here, and I am
trying to understand the GeoNetwork code in more detail, as well as
MEF files, gvSIG and related topics. After a month (I have an exam,
and I am looking for a PhD fellowship) I would like to get more
involved in this activity and to try to contribute as well.

I apologise that, compared with the people discussing this topic, I am
very new and have little understanding of GeoNetwork, but
everyone has to start somewhere, so for me this can be a really good
starting point.

I would appreciate your encouragement and help in building a sound
understanding of the details of GeoNetwork.

Thanks and Regards,
Hiren C Bhatt
