I am hoping for some community insight.
I reported an apparent harvester bug on Feb 7 (#3552). GeoNetwork (GN) harvests most, but not all, of the filtered records from a remote server. Records that fail to harvest are logged by the harvester with what appears to be a spurious error message (often an invalid 0x0 character at a certain line, but examining that region of the file with a hex editor reveals no such character). The remote records being harvested had all been validated.
I recently revisited the problem, using the metadata record "Strait of Georgia Synoptic Bottom Trawl Survey, 2012-2015" from the remote server http://soggy.zoology.ubc.ca:8080/geonetwork, with the harvester's title filter set to "synoptic; trawl" (without the quotes). The relevant portion of the harvester log file states:
ERROR [psfDataCentre] - Error occurred while trying to load an xml file: /d0e11093-8990-4552-8616-a1169e5c50ee/metadata/metadata.xml: Error on line 999: An invalid XML character (Unicode: 0x0) was found in the element content of the document.
org.jdom.JDOMException: Error occurred while trying to load an xml file: /d0e11093-8990-4552-8616-a1169e5c50ee/metadata/metadata.xml: Error on line 999: An invalid XML character (Unicode: 0x0) was found in the element content of the document.
If I download the same record as a MEF file, it also fails to import directly into my local GN, with the same error message.
However, if I download the metadata as XML, or if I extract metadata.xml from the MEF that would not load, the record loads into my local GN without problems. So the problem might lie with the MEF packaging rather than the metadata itself, but I cannot identify anything wrong with it.
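For anyone who wants to repeat this check on a MEF file, here is a minimal sketch in Python. It assumes only that a MEF file is a ZIP archive; the function name `check_mef` is my own and is not part of GeoNetwork. It reads every XML entry straight from the archive, counts NUL bytes, and tries to parse it. Python's parser is not the one GN uses, so a clean result here does not prove that GN will accept the file.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_mef(mef):
    """Return {entry_name: (nul_count, parse_error_or_None)} for each XML entry.

    `mef` is a path or a binary file-like object. check_mef is a
    hypothetical helper name, not a GeoNetwork API.
    """
    report = {}
    with zipfile.ZipFile(mef) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue  # skip thumbnails, attached data files, directories
            data = archive.read(name)  # raw bytes as stored in the archive
            try:
                ET.fromstring(data)
                error = None
            except ET.ParseError as exc:
                error = str(exc)
            report[name] = (data.count(b"\x00"), error)
    return report
```

Calling `check_mef("record.zip")` (the file name is a placeholder) lists every XML entry in the archive, so it also covers entries other than metadata.xml.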
I have examined both the downloaded and the extracted metadata record with a hex editor, and there is no 0x0 byte near line 999, or anywhere else in the file. The metadata record also validates without errors.
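The hex-editor check can also be scripted. The sketch below reports the line and column of any character that falls outside the XML 1.0 "Char" range, which covers 0x0 as well as other control characters. The function name `find_invalid_xml_chars` is made up for this post, and the default of UTF-8 is an assumption about how the records are encoded.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(data, encoding="utf-8"):
    """Return a list of (line, column, codepoint_hex) for invalid characters.

    Lines and columns are 1-based. The encoding default is an assumption;
    undecodable bytes are replaced rather than reported.
    """
    text = data.decode(encoding, errors="replace")
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in _INVALID_XML.finditer(line):
            hits.append((line_no, match.start() + 1, hex(ord(match.group()))))
    return hits
```

Usage would be along the lines of `find_invalid_xml_chars(open("metadata.xml", "rb").read())`; an empty list means no invalid characters were found in the decoded text.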
I have another dozen or more records (about 3% of about 650 records) that behave similarly. All the rest harvest normally.
The remote GN server is version 3.4.3, and my local server is version 3.8.1.
I would appreciate any suggestions.