[GeoNetwork-users] Unable to harvest some metadata from a GeoNetwork instance

Hi list,

I'm trying to harvest metadata from a GeoNetwork 3.8 instance into a 3.10.1
one, but most of the records are skipped with the message:
<<Skipped unretrievable metadata (maybe has been removed) >>

On the source GeoNetwork instance the metadata is there, but it has some
errors, sometimes against the ISO rules and sometimes against the schema
validation.

I expected validation to be skipped, as specified in the harvester
configuration:
Validate records before import: Accept all metadata without validation

Then the log (see below) reports:
<< Missing scheme >>

Log:
-------------------------
2020-02-26 11:08:08,763 DEBUG [rkp_Fao_Maps] - - Skipped unretrievable metadata (maybe has been removed) with uuid:f85144f0-88fd-11da-a88f-000d939bc5d8
2020-02-26 11:08:08,764 ERROR [rkp_Fao_Maps] - Missing scheme
java.lang.IllegalArgumentException: Missing scheme
    at java.nio.file.Paths.get(Paths.java:134)
    at com.sun.nio.zipfs.ZipFileSystemProvider.uriToPath(ZipFileSystemProvider.java:85)
    at com.sun.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:166)
    at java.nio.file.FileSystems.getFileSystem(FileSystems.java:221)
    at org.fao.geonet.ZipUtil.getOrCreateZipFs(ZipUtil.java:88)
    at org.fao.geonet.ZipUtil.openZipFs(ZipUtil.java:81)
    at org.fao.geonet.kernel.mef.MEFLib.getMEFVersion(MEFLib.java:169)
    at org.fao.geonet.kernel.harvest.harvester.geonet.Aligner.addMetadata(Aligner.java:382)
    at org.fao.geonet.kernel.harvest.harvester.geonet.Aligner.align(Aligner.java:223)
    at org.fao.geonet.kernel.harvest.harvester.geonet.Harvester.harvest(Harvester.java:230)
    at org.fao.geonet.kernel.harvest.harvester.geonet.GeonetHarvester.doHarvest(GeonetHarvester.java:95)
    at org.fao.geonet.kernel.harvest.harvester.AbstractHarvester$HarvestWithIndexProcessor.process(AbstractHarvester.java:568)
    at org.fao.geonet.kernel.harvest.harvester.AbstractHarvester.harvest(AbstractHarvester.java:639)
    at org.fao.geonet.kernel.harvest.harvester.HarvesterJob.execute(HarvesterJob.java:69)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
---------------------

Can someone clarify what is happening, and why I can't import the metadata
while ignoring validation errors?

Cheers,
Carlo

Dear All,

Could someone help me understand the problem, or what I'm missing?

Thank you,
C.

--
Carlo Cancellieri
*skype*: ccancellieri
*Twitter*: @cancellieric
*LinkedIn*: http://it.linkedin.com/in/ccancellieri/

Hi Carlo

Not really sure, but from the log snippet it seems the metadata you're trying
to import is not recognised:

2020-02-26 11:08:08,763 DEBUG [rkp_Fao_Maps] - - Skipped unretrievable
metadata (maybe has been removed) with
uuid:f85144f0-88fd-11da-a88f-000d939bc5d8
2020-02-26 11:08:08,764 ERROR [rkp_Fao_Maps] - Missing scheme
java.lang.IllegalArgumentException: Missing scheme
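
For context, the "Missing scheme" text at the top of the trace is the message the JDK's java.nio.file.Paths.get(URI) uses when the URI it receives has no scheme component (no "file:", "jar:", and so on). The sketch below only reproduces that JDK behaviour in isolation. The class name, the helper tryResolve and the file name record.mef are hypothetical, and it makes no claim about which URI GeoNetwork actually builds during harvesting.

```java
import java.net.URI;
import java.nio.file.Paths;

public class MissingSchemeDemo {

    /**
     * Calls Paths.get(URI) and returns "ok" on success, or the message of
     * the IllegalArgumentException that the call raised.
     */
    public static String tryResolve(String rawUri) {
        try {
            Paths.get(URI.create(rawUri));
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A bare path parses as a URI, but it carries no scheme.
        System.out.println(tryResolve("/tmp/record.mef"));

        // The same kind of location expressed as a file: URI.
        String withScheme = Paths.get("record.mef").toUri().toString();
        System.out.println(tryResolve(withScheme));
    }
}
```

In the trace above the call is reached through ZipFileSystemProvider.uriToPath, so the URI handed to the zip file system is a reasonable first thing to inspect.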

Which harvester are you using, CSW or GeoNetwork harvester?

What is the format of the metadata?

Regards,
Jose García

_______________________________________________
GeoNetwork-users mailing list
GeoNetwork-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-users
GeoNetwork OpenSource is maintained at
http://sourceforge.net/projects/geonetwork

--

Vriendelijke groeten / Kind regards,
Jose García
<http://www.geocat.net/>
Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664
Please consider the environment before printing this email.

Dear Jose,
thanks for your interest.
(please, check my response inline)

> Not really sure, but from the log snippet it seems the metadata you're trying
> to import is not recognised:

Interesting, so I need the metadata template locally even if I don't need
to validate the imported data?

> Which harvester are you using, CSW or GeoNetwork harvester?

I'm using the GeoNetwork one; with CSW I got better results (with the same
configuration, see below).

> What is the format of the metadata?

We have 9417 metadata records in different formats, and with the GeoNetwork
harvester I get the following report:
-------------------------------
*9417* record(s) harvested in *858* seconds

4 hours ago

   - added: 46
   - datasetUuidExist: 1
   - total: 9417
   - unretrievable: 9371
   - updated: 1

  -------------------------------

For instance, the metadata record in the log is based on the following template:

ISO 19115:2003/19139

I've also tried the CSW protocol, which works quite well, but I can only get
the public metadata (about 2500 out of 9400) from the target GeoNetwork, even
though I configured authentication, which apparently is not working at all
(no log reports this problem).

Best regards,
Carlo

Hi Carlo

See feedback inline.

Regards,
Jose García

On Mon, Mar 9, 2020 at 11:38 AM carlo cancellieri <
geo.ccancellieri@anonymised.com> wrote:

> > Not really sure, but from the log snippet it seems the metadata you're trying
> > to import is not recognised:
>
> Interesting, so I need the metadata template locally even if I don't need
> to validate the imported data?

You need to have the schemas; you don't need to load the templates of the
schemas. GeoNetwork already contains several schemas: iso19139, Dublin
Core and iso19115-3.2018 (this one since 3.10).

Apparently GeoNetwork is not able to match the metadata that is harvested
against any of the installed schemas.
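
To illustrate what matching a record against a schema can mean, here is a minimal sketch that guesses a schema identifier from the namespace of the record's root element. It is not GeoNetwork's detection code. The class name, the guessSchema helper and the one-entry lookup table are assumptions made for the example, and the only mapping it relies on is that ISO 19139 records use a root element in the http://www.isotc211.org/2005/gmd namespace.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class SchemaGuess {

    // Illustrative lookup table only; extend it with the schemas you expect.
    private static final Map<String, String> NAMESPACE_TO_SCHEMA = new HashMap<>();
    static {
        NAMESPACE_TO_SCHEMA.put("http://www.isotc211.org/2005/gmd", "iso19139");
    }

    /** Returns the schema mapped to the root element's namespace, or "unknown". */
    public static String guessSchema(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI()
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            String schema = NAMESPACE_TO_SCHEMA.get(root.getNamespaceURI());
            return schema != null ? schema : "unknown";
        } catch (Exception e) {
            throw new IllegalArgumentException("Not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        String record =
            "<gmd:MD_Metadata xmlns:gmd=\"http://www.isotc211.org/2005/gmd\"/>";
        System.out.println(guessSchema(record));
        System.out.println(guessSchema("<somethingElse/>"));
    }
}
```

Running a check like this on a record exported from the source catalogue is one way to see whether its root element is what the target catalogue would expect.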

Can you provide the url of the server being harvested?
