[GeoNetwork-devel] Id number of download URL gets out of sync. with data folder Id

Hi Developers,

When a MEF file is uploaded, the URLs in its metadata may contain an internal id number.
If the folder for that id number is already in use in the data directory,
a new folder/id no. is allocated. As a result, the id no. referred to in the
online resource URL of the 'data file for download' gets out of
sync with the id number folder allocated on upload.

For example, say we have already used up id nos. 1 to 10 in the data folder
and then upload a MEF file containing a metadata.xml
with a URL referencing id no. 10, i.e.:

         <gmd:onLine>
            <gmd:CI_OnlineResource>
              <gmd:linkage>
                <gmd:URL>http://localhost:8080/geonetwork/srv/en/resources.get?id=10&amp;fname=basins.zip&amp;access=private</gmd:URL>
              </gmd:linkage>
              <gmd:protocol>
                <gco:CharacterString>WWW:DOWNLOAD-1.0-http--download</gco:CharacterString>
              </gmd:protocol>
              <gmd:name>
                <gco:CharacterString>basins.zip</gco:CharacterString>
              </gmd:name>
              <gmd:description>
                <gco:CharacterString>Hydrological basins in Africa (Shapefile Format)</gco:CharacterString>
              </gmd:description>
            </gmd:CI_OnlineResource>
          </gmd:onLine>

That metadata and its resource (data file for download = basins.zip) end up being placed into
the next available data folder, i.e. 11/private/basins.zip. The URL:

http://localhost:8080/geonetwork/srv/en/resources.get?id=10&fname=basins.zip&access=private

is now wrong because it refers to id 10 when it should be 11, and we end up with a 'resource not found' error.
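
To make the mismatch concrete, here is a minimal Python sketch of how one might detect it. It is only an illustration, not GeoNetwork code: the function names download_id and is_out_of_sync are hypothetical, and the allocated id (11) is simply passed in by hand.

```python
from urllib.parse import urlsplit, parse_qs


def download_id(url):
    """Return the numeric id parameter of a resources.get URL, or None."""
    parts = urlsplit(url)
    if not parts.path.endswith("/resources.get"):
        return None  # not a local download URL
    values = parse_qs(parts.query).get("id")
    if not values or not values[0].isdigit():
        return None
    return int(values[0])


def is_out_of_sync(url, allocated_id):
    """True if the URL points at a different id than the one allocated on import."""
    found = download_id(url)
    return found is not None and found != allocated_id


url = ("http://localhost:8080/geonetwork/srv/en/resources.get"
       "?id=10&fname=basins.zip&access=private")
print(is_out_of_sync(url, 11))  # the record was stored under id 11
```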

So how do we solve this? Here are some options:

1) Run the update-fixed-info.xsl script after the MEF upload. This will
get the id nos. back in sync, as Simon (thanks, Simon!) has suggested
(see our discussion in ticket #1080, http://trac.osgeo.org/geonetwork/ticket/1080).
Note that this xsl might need to run in batch mode, because you may have uploaded
many MEF files that got out of sync.

2) Run an xsl (call it re-synch-id-no.xsl) after MEF import. This would adjust the id no. in the gmd:URL
to the id allocated by the GN catalog. Perhaps implement it as a check box on the single file and
batch import forms, named 'sync. id no.'

I prefer option 2): it fixes only what you need, whereas option 1) goes and changes lots of
other things that you may not need changed.
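
As a rough illustration of what option 2) would do, here is a small Python sketch (not the proposed XSL itself). It rewrites only the id parameter of resources.get URLs found in gmd:URL elements and leaves everything else alone. The function name resync_ids and the regular expression are my own assumptions, and the new id is supplied by the caller rather than looked up in the catalog.

```python
import re
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
ET.register_namespace("gmd", GMD)  # keep the gmd: prefix when serialising

# Matches the id parameter of a resources.get URL, wherever it sits in the query.
ID_PARAM = re.compile(r"(resources\.get\?(?:[^#]*&)?id=)\d+")


def resync_ids(metadata_xml, new_id):
    """Point every local download URL in the record at new_id."""
    root = ET.fromstring(metadata_xml)
    for url in root.iter(f"{{{GMD}}}URL"):
        if url.text:
            url.text = ID_PARAM.sub(lambda m: m.group(1) + str(new_id), url.text)
    return ET.tostring(root, encoding="unicode")


sample = (
    f'<gmd:onLine xmlns:gmd="{GMD}"><gmd:CI_OnlineResource><gmd:linkage>'
    "<gmd:URL>http://localhost:8080/geonetwork/srv/en/resources.get"
    "?id=10&amp;fname=basins.zip&amp;access=private</gmd:URL>"
    "</gmd:linkage></gmd:CI_OnlineResource></gmd:onLine>"
)
print(resync_ids(sample, 11))
```

URLs that do not point at resources.get are not matched by the pattern, so external links in the record stay untouched.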

Any more ideas and thoughts on this issue?

Regards,
Andrew

PS: Simon has raised the point that we should remove the dependency on the database id and replace it with
the metadata uuid in these local resource URLs. A uuid-based URL would be more robust. Perhaps this is
an enhancement for a future version.

Hi Andrew,

Actually, it looks like the uuid option you mentioned in the PS of your email may already be implemented in the download services. That would be simple to use for 2.8: we could just modify the update-fixed-info.xsl for each schema to build the download URL with the uuid, rebuild the sample data records to use it, and the problem should go away. I'll look into it later today to see whether everything still works with this change. If so, we should be able to commit it to 2.8.x, as both the uuid option and the old id option will continue to work.
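
For illustration only, a uuid-based download URL could be derived from an id-based one along these lines. This is a Python sketch rather than the XSL change itself; it assumes the download service accepts a parameter named uuid (still to be confirmed), and the function name and the uuid value in the example are made up.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def to_uuid_url(url, metadata_uuid):
    """Replace the id parameter with a uuid parameter (parameter name is an assumption)."""
    parts = urlsplit(url)
    params = [("uuid", metadata_uuid) if key == "id" else (key, value)
              for key, value in parse_qsl(parts.query)]
    return urlunsplit(parts._replace(query=urlencode(params)))


old_url = ("http://localhost:8080/geonetwork/srv/en/resources.get"
           "?id=10&fname=basins.zip&access=private")
# Hypothetical uuid, used only for this example.
print(to_uuid_url(old_url, "5df54bf0-3a7d-44bf-9abf-84d772da8df1"))
```

Because the uuid travels with the record inside the MEF file, a URL built this way would not depend on which id number the importing catalog happens to allocate.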

Cheers,
Simon
________________________________________
From: andrew walsh [awalsh@anonymised.com]
Sent: Thursday, 18 October 2012 2:43 PM
To: geonetwork-devel@lists.sourceforge.net
Subject: [GeoNetwork-devel] Id number of download URL gets out of sync. with data folder Id

