[GeoNetwork-devel] Lucene index rebuild failure and metadata corruption - how to debug?

Hello,

According to a pop-up message, the most recent index rebuild on my GN 2.4.2 installation (Ubuntu with Tomcat 6 and MySQL) failed.
The geonetwork log (DEBUG level) only tells me:
    "2009-11-12 19:36:18,031 ERROR [geonetwork.search] - Rebuilding lucene index"
However, Tomcat is still working hard and MySQL is called once in a while.

When I restart Tomcat, GN takes about 32 minutes to start up because it "ignores" a couple thousand metadata records due to "corruption":
    "2009-11-12 17:53:11,555 ERROR [geonetwork.datamanager] - The metadata with id=8381 is corrupt/invalid - ignoring it"
Strangely, all OGC WxS metadata records that GN harvests now trigger the metadata corruption error too. They worked before.
I have spot-checked some of the "corrupt" ISO 19139 metadata records (loaded through MEF imports) in MySQL, and they are schema valid.
I have also loaded the MEF files in question onto my GN test server on Windows and rebuilt the index there without any errors.
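
To spot-check more systematically, the ids of the affected records could be pulled out of the log. This is only a minimal sketch, assuming every such log line has the same layout as the datamanager message quoted above; the function name `corrupt_ids` is a hypothetical helper, not part of GN:

```python
import re

# Matches the datamanager message quoted above. The exact log layout is an
# assumption based on that single sample line.
CORRUPT_RE = re.compile(
    r"ERROR \[geonetwork\.datamanager\] - "
    r"The metadata with id=(\d+) is corrupt/invalid"
)


def corrupt_ids(lines):
    """Return the sorted, de-duplicated metadata ids reported as corrupt/invalid."""
    ids = set()
    for line in lines:
        match = CORRUPT_RE.search(line)
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)
```

Passing an open log file to `corrupt_ids` yields the list of ids, which can then be compared against the records in MySQL or between the Ubuntu and Windows servers.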

Could somebody give me suggestions on how to debug Lucene and what could cause the metadata corruption error?

Thanks!
WG

--
_______________________________
Wolfgang Grunberg
Arizona Geological Survey
wgrunberg@anonymised.com
520-770-3500

Apparently, GN kept rebuilding the index despite all those Lucene errors. Once Tomcat calmed down, I restarted it and the "corrupted" metadata records showed up again.

Still, it would be nice to figure out why I am getting the error messages.

Ciao, WG
