[GeoNetwork-users] [GeoNetwork-devel] Updating Records en-mass

Hi David,

Yes, I expect we will soon be in a similar situation to yours, i.e. a
need for bulk updates of particular records and bits of information.

It would be good if there were some practical guides on
how to do such bulk updates. I recall some talk about this
previously on the GN users or devel list. Perhaps worth digging out those
emails.

Andrew

----- Original Message ----- From: "David Sampson" <david.sampson@anonymised.com>
To: <geonetwork-devel@lists.sourceforge.net>
Sent: Wednesday, October 27, 2010 23:34
Subject: [GeoNetwork-devel] Updating Records en-mass

Hey Folks,

I have been using GN for a number of years now and have helped many
people get up and started for their individual metadata management
needs. Now GN is being put to task on managing metadata at a corporate
level. For this I require some advice and guidance.

I am looking for some guidance to help metadata editors edit their
records en masse. Our current organization has many independent installs
of GN. This is fine when edits involve 5-10 records.

There is resistance, however, to installing a central GN to manage all
the metadata and hosting it on enterprise hardware, due to a few key
missing tools / capabilities. Perhaps this is an issue of perception or
lack of know-how.

Please find below some use cases that are offered up as key reasons not
to move forward with more GN installs or choosing GN as a centralized
catalogue. I have also heard of similar needs from many other Government
departments. So maybe we can start understanding the issues a bit more.

Here are a few use cases:

1. The name, address, phone number, or email of the metadata contact
changes. The organization needs to edit 200 records to update this
information.

2. An FTP site is going to be moved to a new domain. 1000 associated
records need to be edited to reflect the changed domain for the FTP in
the metadata.

3. A collection of 20,000 products all start with the same metadata. A
process was later developed to provide better descriptive metadata. This
would require replacing all 20,000 records with a new purpose, title,
and abstract while maintaining the rest of the record. In effect, one
record duplicated 20,000 times must be updated to become 20,000 separate
and unique metadata records.

4. A new keyword list is introduced to an organization and the 100,000
associated metadata records need to be updated with new and additional
keywords. First, keywords need to be found via free-text search. Then a
keyword crosswalk identifies other trigger words to be mapped to
keywords. Finally, new keywords need to be associated with relevant
records.

5. A user wishes to use the interface to perform a search for all their
metadata records. After returning an exact search result, they want to
systematically update all the records with the same information.
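
To make use case 2 concrete, here is a minimal sketch of the per-record
edit. It assumes the records are ISO 19139 XML and that the FTP address
sits in gmd:URL elements; the function name replace_ftp_host and the
example hosts are illustrative only and not part of GN. Fetching and
saving the records is deliberately left out.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# ISO 19139 namespaces; registering them keeps the gmd:/gco: prefixes on output.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def replace_ftp_host(record_xml: str, old_host: str, new_host: str) -> Tuple[str, int]:
    """Return the record with old_host swapped for new_host in every gmd:URL,
    plus the number of URL elements that were changed."""
    root = ET.fromstring(record_xml)
    changed = 0
    for url in root.iter("{%s}URL" % GMD):
        if url.text and old_host in url.text:
            url.text = url.text.replace(old_host, new_host)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical record fragment and host names, for illustration only.
sample = (
    '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">'
    "<gmd:linkage><gmd:URL>ftp://ftp.old.example.org/data/a.zip</gmd:URL></gmd:linkage>"
    "</gmd:MD_Metadata>"
)
new_xml, count = replace_ftp_host(sample, "ftp.old.example.org", "ftp.new.example.org")
```

Only gmd:URL elements are touched, so a host name that happens to
appear in an abstract or title is left alone; whether that is the right
choice depends on the records.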

Requirements:

* This must be done in a programmatic way (e.g. PHP, Java, Python)
* The records with their original UUID are preferred over creating new
records. Thus an update is required.
* This should be able to be done on a local workstation or server side.
* This process should be governed with a permission structure so not
just anyone can edit these records. Either controlled by authentication
or IP.
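
As a rough illustration of these requirements, the sketch below shows a
driver loop that updates records in place by UUID. The names
bulk_update, fetch, transform and store are hypothetical: fetch and
store stand in for whatever authenticated service ends up being used,
and nothing here is an existing GN API.

```python
from typing import Callable, Iterable, List, Tuple


def bulk_update(
    uuids: Iterable[str],
    fetch: Callable[[str], str],
    transform: Callable[[str], str],
    store: Callable[[str, str], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Apply transform to each record, writing it back under the same UUID.

    Returns (updated UUIDs, [(failed UUID, error message), ...]).
    Records that transform leaves unchanged are not written back.
    """
    updated: List[str] = []
    failed: List[Tuple[str, str]] = []
    for uuid in uuids:
        try:
            original = fetch(uuid)
            revised = transform(original)
            if revised != original:
                store(uuid, revised)  # same UUID, so this is an update, not a new record
                updated.append(uuid)
        except Exception as exc:  # keep going; report failures at the end
            failed.append((uuid, str(exc)))
    return updated, failed


# In-memory stand-in for a catalogue, for illustration only.
catalogue = {"uuid-1": "contact: old@example.org", "uuid-2": "contact: other@example.org"}
done, errors = bulk_update(
    ["uuid-1", "uuid-2", "uuid-missing"],
    fetch=lambda u: catalogue[u],
    transform=lambda xml: xml.replace("old@example.org", "new@example.org"),
    store=catalogue.__setitem__,
)
```

Because the service calls are passed in, the same loop could run from a
local workstation or server side, with permissions enforced by whatever
fetch and store authenticate against.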

Resources:

* I am aware of the CSW spec (as we promote OGC specs), but have never
used the transaction functions. Do these need to be opened up manually
on GN
(http://geonetwork-opensource.org/stable/developers/xml_services/csw_services.html#transaction)?
It is unclear how the sample POST with filter will replace a given
value. For instance, what will it replace the "Eurasia" titled record
with? Will it replace all instances or occurrences, or is this a full
title search?

* The XML services might be another candidate for this:
http://geonetwork-opensource.org/latest/developers/xml_services/index.html

* The update XML service looks promising
(http://geonetwork-opensource.org/latest/developers/xml_services/metadata_xml_services.html#metadata-editing).
However, it looks like it does one record at a time. So for the examples
above, does this mean you have to repeat one request for each record?
Will this slow down the system with 100,000+ records?
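
For reference on the CSW question, here is a sketch of how a
Transaction/Update body could be assembled. It follows my reading of the
CSW 2.0.2 schema, where an Update carries either a full replacement
record or csw:RecordProperty name/value pairs, and the csw:Constraint
filter selects which records are affected. Whether a given GN version
accepts the RecordProperty form is an assumption that would need
testing; build_csw_update and the property names used are illustrative.

```python
import xml.etree.ElementTree as ET

CSW = "http://www.opengis.net/cat/csw/2.0.2"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("csw", CSW)
ET.register_namespace("ogc", OGC)


def build_csw_update(prop_name: str, new_value: str,
                     filter_prop: str, filter_value: str) -> str:
    """Build a csw:Transaction that sets prop_name to new_value on every
    record whose filter_prop equals filter_value."""
    tx = ET.Element("{%s}Transaction" % CSW, {"service": "CSW", "version": "2.0.2"})
    update = ET.SubElement(tx, "{%s}Update" % CSW)

    # What to change: one property name/value pair.
    prop = ET.SubElement(update, "{%s}RecordProperty" % CSW)
    ET.SubElement(prop, "{%s}Name" % CSW).text = prop_name
    ET.SubElement(prop, "{%s}Value" % CSW).text = new_value

    # Which records to change: an OGC filter on an exact property match.
    constraint = ET.SubElement(update, "{%s}Constraint" % CSW, {"version": "1.1.0"})
    flt = ET.SubElement(constraint, "{%s}Filter" % OGC)
    equal = ET.SubElement(flt, "{%s}PropertyIsEqualTo" % OGC)
    ET.SubElement(equal, "{%s}PropertyName" % OGC).text = filter_prop
    ET.SubElement(equal, "{%s}Literal" % OGC).text = filter_value

    return ET.tostring(tx, encoding="unicode")


# Illustrative values only: retitle the record whose title is exactly "Eurasia".
body = build_csw_update("title", "Eurasia (revised)", "title", "Eurasia")
```

Under this reading, the filter decides which records match (here an
exact match on the title property), and the update part decides what
they are changed to, so a single request could in principle touch many
records at once.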

Cheers

--
Dave Sampson <david.sampson@anonymised.com>

Natural Resources Canada
Earth Sciences Sector (http://ess.nrcan.gc.ca)
Mapping Information Branch
Geospatial Systems and Applications
CGDI Technology Analyst
601 Booth Street, Ottawa, Ontario, Canada

Please consider the environment before printing this e-mail

_______________________________________________
GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork