[Geoserver-devel] Direct download links from CSW

Hi,
in this mail I would like to discuss a “mini” proposal for a new CSW feature (tell me if you believe
it’s big enough to warrant a formal proposal): direct download links for file-based layers.

Direct downloads are a request we keep seeing in the Metoc/EO world, where layers are based on complex NetCDF/GRIB files. The protocols we have to output them, WCS in particular, are set up to slice, rescale and reproject data, not to preserve the original data as is.

Certainly the data could be provided via FTP too, but it’s cumbersome to have to publish data twice, and the data can be somewhat hard to locate, as FTP provides no metadata at all.

So, for these layers, we would like to provide a direct download of the raw data from CSW. That
makes the downloads searchable, and the protocol already has room for direct download links,
via the “term-references” entry in Dublin Core metadata and OnlineResource in ISO.

The idea would be that we create a link pointing back to the CSW service, to a vendor request that will assemble the files for the requested resource, zip them, and send the result back to the requestor, something like:

http://host:port/geoserver/ows?service=CSW&version=2.0.1&request=DirectDownload&resourceId=
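As a rough illustration of how such a link could be assembled, here is a minimal Python sketch. The function name and the "workspace:layer" identifier are assumptions made for the example only; the mail does not define the resourceId format.

```python
from urllib.parse import urlencode


def direct_download_url(base_url, resource_id):
    """Build a link to the proposed DirectDownload vendor request."""
    params = {
        "service": "CSW",
        "version": "2.0.1",
        "request": "DirectDownload",
        # Hypothetical identifier: the actual resourceId format is not specified.
        "resourceId": resource_id,
    }
    # urlencode percent-escapes reserved characters in the values.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(direct_download_url("http://host:port/geoserver/ows", "workspace:layer"))
```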

In order to locate the files we’d be using the information provided by FileResourceInfo/FileServiceInfo, as discussed recently on geotools-devel.
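Once the files for a resource are known, the "assemble and zip" step could look roughly like the Python sketch below. It is only meant to show the shape of the operation: the real lookup would go through FileResourceInfo/FileServiceInfo on the Java side, which is not modelled here, and zip_files is a hypothetical helper. A real implementation would likely stream the archive instead of building it in memory, since NetCDF/GRIB files can be large.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_files(paths):
    """Pack the given files, unmodified, into a zip archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            path = Path(path)
            # Store each file under its bare file name, content untouched.
            archive.write(path, arcname=path.name)
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.nc"
        sample.write_bytes(b"placeholder bytes, not a real NetCDF file")
        data = zip_files([sample])
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            print(archive.namelist())
```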

Of course we would not create those links automatically: there will be a configuration at the CSW level, plus a layer-level override, to control whether direct download is available.
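The service-level setting with a per-layer override could resolve along these lines. This is a sketch under assumptions: the function and parameter names are invented for illustration, and None stands for "the layer does not override the CSW setting".

```python
from typing import Optional


def direct_download_enabled(csw_default: bool, layer_override: Optional[bool]) -> bool:
    """Return whether direct download links should be generated for a layer.

    csw_default: the (hypothetical) CSW-wide setting.
    layer_override: the (hypothetical) layer-level setting, or None when the
    layer simply inherits the CSW-wide value.
    """
    if layer_override is None:
        return csw_default
    return layer_override


if __name__ == "__main__":
    # Links disabled globally, but explicitly enabled for one layer.
    print(direct_download_enabled(False, True))
```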

Multidimensional image mosaics are also of interest, and they add an interesting twist: they are an aggregate resource, and we already expose the granules of the mosaic in WCS-EO extended describe calls.
However, for direct downloads we are looking not for the granules but for the files, and when a mosaic is made of NetCDF/GRIB files, each file contains many granules.
So the idea is to have many download links associated with the mosaic, one per file contained in it.

Now, there is a catch… how does the user find out which dimension ranges are contained in each file? In ISO OnlineResource we could maybe abuse the description field to stick in some JSON payload describing the ranges, but it would be ugly, and would not work for Dublin Core anyway.
So here is the idea: the link to the direct download will contain extra parameters, not used by the DirectDownload request, that describe the ranges covered by the file, something like:

http://host:port/geoserver/ows?service=CSW&version=2.0.1&request=DirectDownload&resourceId=:&time=from/to&elevation=from/to&bbox=xmin,ymin,xmax,ymax&custDim=from/to
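To make the idea concrete, here is a small Python sketch of both sides: building one such link for a file of a mosaic, and a client reading the ranges back out of the link without downloading anything. All function names, the "workspace:layer" identifier and the sample values are assumptions for illustration; only the from/to and xmin,ymin,xmax,ymax layouts come from the example link above.

```python
from urllib.parse import parse_qs, urlencode

# Parameters that are part of the request itself or are not from/to ranges.
NON_RANGE_PARAMS = {"service", "version", "request", "resourceid", "bbox"}


def file_download_url(base_url, resource_id, ranges, bbox=None):
    """Build a DirectDownload link carrying the ranges covered by one file."""
    params = {
        "service": "CSW",
        "version": "2.0.1",
        "request": "DirectDownload",
        "resourceId": resource_id,  # hypothetical identifier format
    }
    for name, (low, high) in ranges.items():
        params[name] = f"{low}/{high}"
    if bbox is not None:
        params["bbox"] = ",".join(str(v) for v in bbox)
    # Keep '/', ':' and ',' literal so the ranges stay readable in the link.
    return base_url + "?" + urlencode(params, safe="/:,")


def ranges_from_url(url):
    """Recover the dimension ranges advertised in a link, as (from, to) strings."""
    query = url.split("?", 1)[1]
    ranges = {}
    for name, values in parse_qs(query).items():
        if name.lower() in NON_RANGE_PARAMS:
            continue
        low, high = values[0].split("/", 1)
        ranges[name] = (low, high)
    return ranges


if __name__ == "__main__":
    url = file_download_url(
        "http://host:port/geoserver/ows",
        "workspace:layer",
        {"time": ("2015-01-01T00:00:00Z", "2015-01-02T00:00:00Z"),
         "elevation": ("0", "100")},
        bbox=(-180, -90, 180, 90),
    )
    print(url)
    print(ranges_from_url(url))
```

Since the extra parameters are ignored by the request itself, a client that does not understand them can still follow the link as is.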

The link will represent dimensions as ranges to keep it brief: we cannot build a URL that contains every single value, it would get too long.

For completeness' sake, we also thought about generating a standalone metadata record for each file in the case of mosaics. It would have looked good, but there is a serious catch: it would have badly broken record paging. In order to discover what’s at page 10 of the record list, one would have to locate all resources that might have generated extra records in the first 9 pages and expand them, making paging over CSW unbearably slow.
So the idea, while nice from other points of view, got canned.

Anyways… opinions?

Cheers
Andrea


Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it
