[GeoNetwork-devel] Forward slashes in metadata identifier

Hello,

I am working with an institution that is considering migrating its INSPIRE metadata catalogue to GeoNetwork. As part of this migration, all current metadata records (gmd:MD_Metadata XML files) must be imported into GeoNetwork with all data preserved, most importantly the gmd:fileIdentifier. The issue is that one of the resource providers uses a custom scheme for this fileIdentifier: the fileIdentifier is the URL of the logical location of that file on their infrastructure (e.g. http://some.domain.com/code_space/resource_code/resource_type).

As fileIdentifiers are used as UUIDs, the forward slashes in the identifier break pretty much all GeoNetwork functionality that interacts with those metadata records (download, edit, validation, etc.).

After tinkering with GeoNetwork for a couple of days, I have managed to restore much of the functionality by URI-encoding the identifier. This required modifying the Tomcat configuration, the URL rewrite rules in GeoNetwork (they were rewriting encoded links) and the servlet configuration (luckily, no Java code). I also went through the web UI, wrapping every apparent use of the UUID in a URL in a call to encodeURIComponent and applying the filter of the same name to every use of the UUID in Angular fields.
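To make the encoding step concrete, here is a minimal sketch of treating the whole identifier as a single path segment. It is written in Python for illustration only; `quote(..., safe="")` stands in for the `encodeURIComponent` call used in the web UI, and `record_url` and `API_BASE` are hypothetical names, not part of GeoNetwork.

```python
from urllib.parse import quote, unquote

# Assumed base URL, copied from the example request in this thread.
API_BASE = "http://localhost:8080/geonetwork/srv/api/records"


def record_url(identifier: str, suffix: str = "") -> str:
    """Build a record URL with the identifier as one encoded path segment."""
    # safe="" makes quote() percent-encode "/" and ":" as well,
    # so the identifier cannot be mistaken for several path segments.
    encoded = quote(identifier, safe="")
    return f"{API_BASE}/{encoded}{suffix}"


identifier = "http://some.domain.com/codespace/code/MD"
url = record_url(identifier, "/formatters/xsl-view")
print(url)

# Decoding the segment gives back the original fileIdentifier.
encoded_segment = url[len(API_BASE) + 1:].split("/")[0]
print(unquote(encoded_segment))
```

Whether the encoded slashes survive depends on every layer in between (servlet container, rewrite rules, framework), which is why the configuration changes listed above were needed.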

Unfortunately, I was not able to get all tests passing on an unmodified 3.4.4 tag (I tried both Windows 10 and Ubuntu Server), so I am unable to check whether my changes break any functionality. Suffice it to say that I’m not comfortable mixing forward slashes with encoded forward slashes in the URL path. Most browsers decode them anyway, so it’s hard to check visually whether the HTTP request is made to http://localhost:8080/geonetwork/srv/api/records/http%3A%2F%2Fsome.domain.com%2Fcodespace%2Fcode%2FMD/formatters/xsl-view?root=div&output=pdf or
http://localhost:8080/geonetwork/srv/api/records/http://some.domain.com/codespace/code/MD/formatters/xsl-view?root=div&output=pdf ).
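The difference between the two requests can be illustrated by splitting each path the way a path-based router might. This is a sketch under that assumption, not a description of how GeoNetwork or Tomcat actually route requests; `segments_after_records` is a hypothetical helper.

```python
from typing import List
from urllib.parse import unquote, urlsplit

PREFIX = "/geonetwork/srv/api/records/"


def segments_after_records(url: str) -> List[str]:
    """Split the path after the records prefix, then decode each segment."""
    path = urlsplit(url).path  # urlsplit keeps percent-encoding in the path
    if not path.startswith(PREFIX):
        raise ValueError("not a records URL")
    return [unquote(segment) for segment in path[len(PREFIX):].split("/")]


encoded = ("http://localhost:8080/geonetwork/srv/api/records/"
           "http%3A%2F%2Fsome.domain.com%2Fcodespace%2Fcode%2FMD"
           "/formatters/xsl-view?root=div&output=pdf")
raw = ("http://localhost:8080/geonetwork/srv/api/records/"
       "http://some.domain.com/codespace/code/MD"
       "/formatters/xsl-view?root=div&output=pdf")

# Encoded: the identifier stays one segment, followed by the formatter path.
print(segments_after_records(encoded))
# Raw: the identifier is broken up at every "/", so the router cannot tell
# where the identifier ends and "formatters" begins.
print(segments_after_records(raw))
```

In this sketch the encoded request yields three segments, while the raw one yields eight, including an empty segment from the "//" after the scheme.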

Is there any plan to change the way fileIdentifiers are handled in GeoNetwork (e.g. moving them to query parameters, as is already done in some cases)? Is there any other way to allow slashes in fileIdentifiers without extensive changes to the code base (maybe decoupling the fileIdentifier from the UUID, while still preserving the original file identifier in the UI and in the XML metadata files served by GeoNetwork)? Would my changes be of any use? Since I’m unable to test all functionality, I am not inclined to open a pull request; I’ll commit the changes to my fork in the next few days, once I fully understand where exactly in the sources I should change the configuration file.
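As a rough illustration of the decoupling idea, an internal URL-safe key could be derived deterministically from the original fileIdentifier, with a lookup kept to recover the original. This is purely a sketch of the concept: GeoNetwork does not work this way as far as this thread states, and `internal_uuid` and `original_by_uuid` are hypothetical names.

```python
import uuid

# Hypothetical lookup from internal key back to the original fileIdentifier.
original_by_uuid = {}


def internal_uuid(file_identifier: str) -> str:
    """Derive a stable, URL-safe key from a fileIdentifier (name-based UUID v5)."""
    key = str(uuid.uuid5(uuid.NAMESPACE_URL, file_identifier))
    original_by_uuid[key] = file_identifier
    return key


file_identifier = "http://some.domain.com/code_space/resource_code/resource_type"
key = internal_uuid(file_identifier)

# The key contains only hex digits and hyphens, so it is safe in a URL path,
# and the same fileIdentifier always maps to the same key.
print(key)
print(original_by_uuid[key])
```

The served XML would still carry the original gmd:fileIdentifier; only URLs and internal references would use the derived key.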

Daniel Urda

Hello, it looks like this is a requirement we should take care of. Last week I also had a question about support for “&” in UUIDs, which causes the same type of issue. So if you have a PR that makes progress on this, that would be good. As you mentioned, encoding the UUID in the web UI is the main point. There are probably also some cases where links are built in XSLT, and maybe on the Java side when harvesting records. Those are probably the main points to fix. We will not move back to UUIDs as URL parameters (which did not support “&” or “=” either), because they are now part of the API path structure.
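For illustration, here is a small Python sketch of why an unencoded “&” or “=” in a UUID also breaks the query-parameter form, and how encoding fixes it. The parameter name `uuid` and the sample value are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode

uuid_value = "a&b=c"  # hypothetical identifier containing "&" and "="

# Unencoded: the identifier is split into two separate parameters.
raw_query = "uuid=" + uuid_value
raw_parsed = parse_qs(raw_query)
print(raw_parsed)

# Encoded: the identifier round-trips as a single parameter value.
encoded_query = urlencode({"uuid": uuid_value})
encoded_parsed = parse_qs(encoded_query)
print(encoded_query)
print(encoded_parsed)
```

The same encoding requirement applies whether the UUID sits in the path or in the query string; only the set of problematic characters differs.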

Looking forward to your PR (you should target the master branch). Thanks

Francois

In reply to the message from Daniel Urda <daniel.urda.ct@anonymised.com> of Thu, 25 Oct 2018, 18:09.



GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork