[GeoNetwork-devel] MEF creation and import with Python?

Has anybody attempted creating and loading MEF files into GeoNetwork 2.4.2 (GN) with Python?
I would love to see your code and learn from it.

I am going to follow Simon Pigot's suggestion and use MEF files to load our ISO 19139 metadata records into GN instead of CSW insert/update transactions, for the following reasons:
1) There appears to be no easy way to automatically assign GN ownership/privileges to records inserted via CSW transactions (HTTP POST).
2) GN freezes after ca. 4,500 CSW insert transactions. I do not know what the problem is, but I can replicate it on both our Windows and Linux machines.
3) I hope that SFTPing MEF files to the server and loading them locally will be less bandwidth- and processing-intensive than CSW transactions.
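
To frame the question a bit, this is the sort of minimal sketch I have in mind - standard library only, one record per MEF. The info.xml skeleton (uuid, dates, schema, privileges) is my guess at what GN 2.4.2 expects; the safest reference is probably to export a MEF from GN itself and mirror whatever it contains.

# Minimal sketch of packaging one ISO 19139 record as a MEF (zip) archive.
# The info.xml layout is an assumption based on MEF 1.0 exports - compare it
# against a MEF exported from your own GN 2.4.2 before trusting it.
import uuid
import zipfile
from datetime import datetime

INFO_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<info version="1.0">
  <general>
    <uuid>%(uuid)s</uuid>
    <createDate>%(date)s</createDate>
    <changeDate>%(date)s</changeDate>
    <schema>iso19139</schema>
    <isTemplate>false</isTemplate>
    <format>simple</format>
  </general>
  <categories/>
  <privileges>
    <group name="all">
      <operation name="view"/>
      <operation name="download"/>
    </group>
  </privileges>
  <public/>
  <private/>
</info>
"""

def build_mef(md_xml_path, mef_path, record_uuid=None):
    record_uuid = record_uuid or str(uuid.uuid4())
    now = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%S")
    with zipfile.ZipFile(mef_path, "w", zipfile.ZIP_DEFLATED) as z:
        z.write(md_xml_path, "metadata.xml")   # the ISO 19139 record itself
        z.writestr("info.xml", INFO_TEMPLATE % {"uuid": record_uuid, "date": now})
    return record_uuid

The idea would then be to SFTP the resulting MEF files to the server and load them through the local import, so the ownership/privilege settings travel inside info.xml rather than having to be set per CSW transaction.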

Ciao, Wolfgang

--
_______________________________
Wolfgang Grunberg
Arizona Geological Survey
wgrunberg@anonymised.com
520-770-3500

Wolfgang Grunberg wrote:

<snip>

2) GN freezes after ca. 4,500 CSW insert transactions. I do not know what the problem is, but I can replicate it on both our Windows and Linux machines.

<snip>

Hi Wolfgang,

I've happily inserted nearly 20,000 records using CSW insert transactions without freezes. My test script inserts records with randomly generated bounding boxes using curl. The first few thousand certainly insert faster than the rest; my (not surprising) observation is that performance degrades due to indexing and database costs (this needs more investigation to work out whether it can be improved).
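
Since you mentioned doing this from Python: the curl loop boils down to something like the sketch below (untested in this form - the endpoint URL and the bare csw:Transaction envelope are the parts you would need to adapt to your install, and authentication is left out here):

# Rough Python equivalent of the curl-based insert loop (a sketch, not a
# drop-in script). CSW_URL is an assumption - point it at your own GN 2.4.2.
import urllib.request

CSW_URL = "http://localhost:8080/geonetwork/srv/en/csw"

TRANSACTION = """<?xml version="1.0" encoding="UTF-8"?>
<csw:Transaction service="CSW" version="2.0.2"
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2">
  <csw:Insert>
    {record}
  </csw:Insert>
</csw:Transaction>"""

def csw_insert(record_xml):
    body = TRANSACTION.format(record=record_xml).encode("utf-8")
    req = urllib.request.Request(CSW_URL, data=body,
                                 headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # csw:TransactionResponse - check totalInserted

Whether you loop in Python or in curl shouldn't matter much; the indexing and database cost on the GN side is what dominates once you get past the first few thousand records.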

That was with GeoNetwork 2.4.2, a MySQL 5 database, 64-bit Linux (Ubuntu) and an XFS filesystem. What database are you using?

Cheers,
Simon

Hi Simon,

I am also still using McKoi on both of our servers and should probably try it with our PostgreSQL RDBMS instead.
- Test server: GN 2.4.2, Windows XP, Tomcat 5.5, NTFS file system, 3 GB RAM, Core 2 Duo;
- Production environment: GN 2.4.2, Debian Etch, Tomcat 5.5, Amazon EC2 small instance (1.7 GB RAM, 1 virtual core).

I sign in and out for each insert transaction while loading ca. 5,800 real metadata records. Could all those authentication requests cause problems?
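
If the repeated logins could be part of the problem, I could switch to logging in once and pushing every transaction through the same session cookie - roughly like this sketch (the xml.user.login path and the csw endpoint are my assumptions for 2.4.2 and would need checking against the XML services documentation):

# Sketch: log in once, keep the JSESSIONID cookie, reuse it for all inserts.
# BASE and the xml.user.login service path are assumptions for GN 2.4.2.
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

BASE = "http://localhost:8080/geonetwork/srv/en"

def make_session(username, password):
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    creds = urllib.parse.urlencode(
        {"username": username, "password": password}).encode("utf-8")
    opener.open(BASE + "/xml.user.login", data=creds)  # cookie is kept by the opener
    return opener

def insert_with_session(opener, transaction_xml):
    req = urllib.request.Request(BASE + "/csw",
                                 data=transaction_xml.encode("utf-8"),
                                 headers={"Content-Type": "application/xml"})
    return opener.open(req).read()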

Is there a way to remotely request a Lucene index rebuild, and will GN return a message when it is done? If so, I could rebuild the index every 1,000 records or so.

Ciao, Wolfgang

_______________________________
Wolfgang Grunberg
Arizona Geological Survey
wgrunberg@anonymised.com
520-770-3500


Hi Simon,

I noticed that a GeoNetwork (GN) harvest (Harvesting Management | OGC WxS) fails while I am inserting CSW records.
However, running a harvest does not appear to interfere with the CSW insert transactions themselves.

Now I am testing the following possible causes of my past CSW insert failures:

  1. Multiple CSW insert runs (6,000 records) followed by deleting all records, without re-indexing.
  2. Inserting at various GN log levels. My failed CSW insert runs used the "debug" log level; so far I have had no errors with the "inform" log level.

Ciao, Wolfgang

Hi Wolfgang,

I suspect the failure of harvesting during transaction inserts is due to using McKoi as your db - presumably the harvest was failing with a concurrent serializable transaction exception? If so, this is the same error Doug was turning up in ticket #163. I don't think McKoi should be used for anything serious or in anger.

Good to hear that you can insert more than 4,500 records! I altered my test script to log in with each insert so that its behaviour was similar to yours, but that didn't have any unexpected consequences - I was still able to insert the 20,000 test records.

Cheers,
Simon



_______________________________________________
GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork