[GeoNetwork-devel] Question about harvests

Hi All,

This is a question about a possible enhancement, but also to check how things should currently work.

The scenario: I have two geonetwork servers, one on an intranet and one accessible to the outside world. The metadata is created on the intranet server, and then the external server harvests some but not all of the records from the intranet server, using the geonetwork node option. data.gov.uk then harvests from the CSW endpoint for the external server.

My question is around the embedded URLs in the metadata. These are currently picked up from the URL and port setting for the internal server. However this is not accessible to the outside world. I expected that the harvesting process onto the external server would automatically update these links to use the URL and port setting for the external server but this doesn’t seem to be the case.

I know I can change the URLs using an XSL transformation as part of the harvesting process, but this seems quite a basic requirement, so I was wondering whether what I am seeing is expected behaviour or a bug, and whether this would be a good enhancement request?

I’ve got a second question too, about the port setting. I would expect that if no port was set for either protocol, e.g. the values in the settings table are blank, the internal URLs would not contain a ":". However, I think they still do. Is this also expected behaviour or a bug?

Thanks

Jo

Jo Cook
t:+44 7930 524 155/twitter:@archaeogeek
Please note that currently I do not work on Friday afternoons. For urgent responses at that time, please visit support.astuntechnology.com or phone our office on 01372 744009

Hi Jo

See feedback inline.

Regards,
Jose García

On Tue, Jul 17, 2018 at 12:58 PM, Jo Cook <jocook@anonymised.com> wrote:

Hi All,

This is a question about a possible enhancement, but also to check how things should currently work.

The scenario: I have two geonetwork servers, one on an intranet and one accessible to the outside world. The metadata is created on the intranet server, and then the external server harvests some but not all of the records from the intranet server, using the geonetwork node option. data.gov.uk then harvests from the CSW endpoint for the external server.

My question is around the embedded URLs in the metadata. These are currently picked up from the URL and port setting for the internal server. However this is not accessible to the outside world. I expected that the harvesting process onto the external server would automatically update these links to use the URL and port setting for the external server but this doesn’t seem to be the case.

I know I can change the URLs using an XSL transformation as part of the harvesting process, but this seems quite a basic requirement, so I was wondering whether what I am seeing is expected behaviour or a bug, and whether this would be a good enhancement request?

Which harvester are you using from the internal to the external instance? The GeoNetwork harvester should update the URLs if they relate to resources uploaded to the metadata (at least it used to work like that in previous versions, as far as I know), but the CSW harvester does not.

I’ve got a second question too, about the port setting. I would expect that if no port was set for either protocol, e.g. the values in the settings table are blank, the internal URLs would not contain a ":". However, I think they still do. Is this also expected behaviour or a bug?

The values should be filled in. Currently, empty values are not replaced with defaults. This could be added as an improvement.
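
To illustrate the improvement being discussed, here is a minimal sketch of building a base URL that leaves out the ":" when the port is blank or is the protocol default. It is not GeoNetwork code: the function name `build_base_url`, the `DEFAULT_PORTS` table and the example host names are assumptions made for illustration only.

```python
from typing import Optional

# Standard default ports; a URL on the default port needs no explicit ":port".
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_base_url(protocol: str, host: str, port: Optional[str]) -> str:
    """Return protocol://host[:port] (hypothetical helper, not a GeoNetwork API).

    The port is omitted when the setting is blank/None or equals the
    default port for the protocol, so no dangling ":" is produced.
    """
    port = (port or "").strip()
    if not port or int(port) == DEFAULT_PORTS.get(protocol):
        return f"{protocol}://{host}"
    return f"{protocol}://{host}:{port}"


# Example with made-up host names: a blank port and an explicit port.
print(build_base_url("http", "intranet.example.org", ""))
print(build_base_url("http", "intranet.example.org", "8080"))
```

The idea is simply to treat an empty setting the same way as the default port, rather than always appending a separator.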


Astun Technology Ltd, The Coach House, 17 West Street, Epsom, Surrey, KT18 7RL, UK
t:+44 1372 744 009 w: astuntechnology.com twitter:@astuntech

iShare - enterprise geographic intelligence platform
GeoServer, PostGIS and QGIS training
Helpdesk and customer portal

Company registration no. 5410695. Registered in England and Wales. Registered office: 120 Manor Green Road, Epsom, Surrey, KT19 8LN VAT no. 864201149.



GeoNetwork-devel mailing list
GeoNetwork-devel@…537…sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork

Vriendelijke groeten / Kind regards,

Jose García


Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664

Please consider the environment before printing this email.

Hi Jose

It’s the GeoNetwork harvester. Both servers are on 3.4.3. The URLs that I have noticed are not changed are the GEMET thesaurus anchors, but I haven’t checked any others.

Thanks, I’ll add an enhancement request.

Jo


Hi

I just checked the code for the GeoNetwork harvester and it doesn’t handle the links. However, I noticed that in a custom project I added code to the harvester to update the links. I’ll look into making a pull request for this work during the week.

Regards,
Jose García
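
As a rough illustration of what updating the links at harvest time could involve, the sketch below rewrites every attribute value and element text in a metadata record that contains the internal server's base URL. It is not the code from the custom project or from the GeoNetwork harvester: the function name `rewrite_links`, the sample record and both base URLs are made up for the example.

```python
import xml.etree.ElementTree as ET


def rewrite_links(xml_text: str, internal_base: str, external_base: str) -> str:
    """Replace internal_base with external_base in element text and attributes.

    Hypothetical helper for illustration. Text that follows a child element
    (the "tail") is not touched, and ElementTree may rename namespace
    prefixes when it serialises the result.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.text and internal_base in element.text:
            element.text = element.text.replace(internal_base, external_base)
        # Copy the items first so the attributes can be updated safely.
        for name, value in list(element.attrib.items()):
            if internal_base in value:
                element.set(name, value.replace(internal_base, external_base))
    return ET.tostring(root, encoding="unicode")


# Made-up record with an anchor that points at the intranet server.
record = (
    '<record xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<anchor xlink:href="http://intranet.example.org:8080/geonetwork/thesaurus/x">term</anchor>'
    "</record>"
)
print(rewrite_links(record, "http://intranet.example.org:8080", "https://catalogue.example.org"))
```

Matching on the full base URL (protocol, host and port) rather than on the host name alone keeps the replacement from touching unrelated links that merely mention the same host.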

On Tue, Jul 17, 2018 at 2:37 PM, Jo Cook <jocook@anonymised.com> wrote:

Hi Jose


It’s the GeoNetwork harvester. Both servers are on 3.4.3. The URLs that I have noticed are not changed are the GEMET thesaurus anchors, but I haven’t checked any others.

Thanks, I’ll add an enhancement request.

Jo
