[Geoserver-devel] Improving how importer deals with raster data

Hi,
I would like to improve the importer so that it supports a very common
image publication workflow that is currently not supported, and discuss
the direction of the changes to be made.

The workflow is more or less as follows:

  1. The client indicates that one or more “golden” images need to be imported
    into GeoServer for WMS/WCS usage. The original files are big, and must
    not be touched, so ideally the client would just tell GeoServer where the
    files are (disk shares) and GeoServer should copy them autonomously.
  2. Once the images are in GeoServer’s possession, the code should run
    gdal_translate to adjust the image structure (tiling, compression, bit depth,
    you name it) and gdaladdo to add overviews.
  3. The resulting files should be configured as new layers, or as a replacement
    for existing target layers.

All three points require some changes/improvements to the importer.

Point 1) could be solved by HTTP upload, but given the size of the files, and given
they are already available on a network-accessible share, it would be best
to have them copied instead.
I guess this could be done by adding a "copy": true directive to the import data?
E.g.:

"data": {
  "type": "directory",
  "location": "/mnt/share/myImage.tif",
  "copy": true
}

As an alternative, the copy step could be an ImportTransform, but the current
transforms have no way to alter the ImportData they receive.
We could either make ImportData subclasses mutable, but I’m worried they
are immutable for a reason (storage, replay), or allow a pre-transform to return
a different ImportData, e.g.:

public interface PreTransform extends Transform {

    ImportData apply(ImportTask task, ImportData data) throws Exception;
}

Wondering what people would like best here?
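
To make the second option a bit more concrete, here is a minimal sketch of a copy step written as a pre-transform that returns a new ImportData. The `ImportTask` and `ImportData` classes below are simplified stand-ins, not the real importer classes, the `Transform` super-interface is left out, and `CopyTransform` is a hypothetical name:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Simplified stand-ins for the importer's ImportTask/ImportData, for illustration only
final class ImportTask {}

final class ImportData {
    private final Path location;

    ImportData(Path location) {
        this.location = location;
    }

    Path getLocation() {
        return location;
    }
}

interface PreTransform {
    ImportData apply(ImportTask task, ImportData data) throws Exception;
}

/** Hypothetical transform: copies the data locally and hands back a new ImportData. */
final class CopyTransform implements PreTransform {
    private final Path targetDir;

    CopyTransform(Path targetDir) {
        this.targetDir = targetDir;
    }

    @Override
    public ImportData apply(ImportTask task, ImportData data) throws IOException {
        Files.createDirectories(targetDir);
        Path copy = targetDir.resolve(data.getLocation().getFileName());
        // The original file is only read, never modified
        Files.copy(data.getLocation(), copy, StandardCopyOption.REPLACE_EXISTING);
        // The input instance stays untouched; later steps work on the returned one
        return new ImportData(copy);
    }
}
```

With this shape the original ImportData can still be stored or replayed as-is, since only the returned instance points at the local copy.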

Going to point 2), the current RasterTransformChain/RasterTransform
interfaces are dead code, there are no implementations, so I guess
we have a blank sheet here.
I was thinking that the calls to gdal_translate/gdaladdo would be pre-transform
types. Param-wise, we can either try to set up a limited set of well-known params,
or have the caller just give us the straight parameters they want us to use
(besides the file names, that is). I’m leaning towards the latter, it’s more flexible
and allows using the full set of abilities of the GDAL installation on the server.
E.g., we cannot predict in advance if people just need to compress, which
compression params they want to use, if they need to reorder or shave off
bands, reduce the bit depth, expand from palette to RGB, and so on.
The same goes for gdaladdo, the common params are … common, but
if you start dealing with overviews compression, you have extra params there
too.
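
As a rough sketch of the "straight parameters" approach, the caller-supplied options could simply be spliced between the command name and the file names, which the importer fills in itself. The `GdalCommands` class and its method names are hypothetical; the argument order follows the usual `gdal_translate [options] input output` and `gdaladdo [options] file levels` usage:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper turning caller-supplied options into command lines. */
final class GdalCommands {

    /** Builds: gdal_translate [options] input output */
    static List<String> translate(List<String> options, String input, String output) {
        List<String> command = new ArrayList<>();
        command.add("gdal_translate");
        command.addAll(options);
        command.add(input);
        command.add(output);
        return command;
    }

    /** Builds: gdaladdo [options] file level1 level2 ... */
    static List<String> addo(List<String> options, String file, List<Integer> levels) {
        List<String> command = new ArrayList<>();
        command.add("gdaladdo");
        command.addAll(options);
        command.add(file);
        for (Integer level : levels) {
            command.add(String.valueOf(level));
        }
        return command;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(command)`, which passes each entry as a separate argument without going through a shell, so the options need no quoting or escaping.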

At this point we could get to point 3), and here things get interesting.
If the target store does not exist, we can just configure a new one, done.
If anything, we’d have to decide where to put the files.
For a direct import, that’s easy, we could just have the copy mode upload
them in their final position.

But what if we need to update an existing layer instead? It means we have
to do the processing (which might take time) leaving the target files alone
(this would be an indirect import with “replace” mode in the task I guess?)
In any case, we probably need to ask the store where its files are, to perform
the replacement, something we do not have today.

I was thinking of allowing stores and readers to implement a FileSource interface,
that would have a single method:

List getFiles();

which would give us the location of the files used by the store, we would
then delete those, and copy over the new ones.

Since we are dealing with files that OGC requests might be using, and
since we try to have GeoServer work on Windows, the above might not be a straight
file operation, but more something like:

  • Disable the resource/layer so that no new OGC requests can use it
  • Try to perform the replacement (with retries, as on Windows the files might still
    be locked by OGC requests that were already working on the files)
  • Re-enable the layer once done

I guess this would form the core of the raster indirect import (for simple rasters
at least, structured grid coverage readers might have to follow a different path,
e.g., harvest the new granules in an existing mosaic).
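
The disable / replace-with-retries / re-enable sequence above could be sketched roughly as follows. Everything here is hypothetical: `LayerSwitch` stands in for whatever toggles the layer in the catalog, and the attempt count and wait time are arbitrary parameters:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of the disable / replace-with-retries / re-enable sequence. */
final class FileReplacer {

    /** Stand-in for whatever enables/disables the layer in the catalog. */
    interface LayerSwitch {
        void setEnabled(boolean enabled);
    }

    static void replace(LayerSwitch layer, Path newFile, Path target, int maxAttempts, long waitMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        layer.setEnabled(false); // no new OGC requests can reach the files
        try {
            IOException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    Files.move(newFile, target, StandardCopyOption.REPLACE_EXISTING);
                    return;
                } catch (IOException e) {
                    // e.g. on Windows the target may still be locked by an in-flight request
                    last = e;
                    if (attempt < maxAttempts) {
                        Thread.sleep(waitMillis);
                    }
                }
            }
            throw last;
        } finally {
            // Re-enable in any case; after a failed replacement the layer keeps serving the old file
            layer.setEnabled(true);
        }
    }
}
```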

Ok, that’s all I had in mind… opinions, feedback welcomed.

Cheers
Andrea


==

GeoServer Professional Services from the experts! Visit
http://goo.gl/NWWaa2 for more information.

==

Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

AVVERTENZE AI SENSI DEL D.Lgs. 196/2003

Le informazioni contenute in questo messaggio di posta elettronica e/o nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il loro utilizzo è consentito esclusivamente al destinatario del messaggio, per le finalità indicate nel messaggio stesso. Qualora riceviate questo messaggio senza esserne il destinatario, Vi preghiamo cortesemente di darcene notizia via e-mail e di procedere alla distruzione del messaggio stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso, divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità diverse, costituisce comportamento contrario ai principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for the attention and use of the named addressee(s) and may be confidential or proprietary in nature or covered by the provisions of privacy act (Legislative Decree June, 30 2003, no.196 - Italy’s New Data Protection Code).Any use not in accord with its purpose, any disclosure, reproduction, copying, distribution, or either dissemination, either whole or partial, is strictly forbidden except previous formal approval of the named addressee(s). If you are not the intended recipient, please contact immediately the sender by telephone, fax or e-mail and delete the information in this message that has been received in error. The sender does not give any warranty or accept liability as the content, accuracy or completeness of sent messages and accepts no responsibility for changes made after they were sent or for other risks which arise as a result of e-mail transmission, viruses, etc.


Hi all,
following up on this topic, I’ve implemented support for gdal_translate and gdaladdo in
the importer, generalizing the transformation chains a bit, and providing support for other
command line invocations, should we want to add more in the future.

You can look at the work here:
https://github.com/aaime/geoserver/commit/fefb4da90259a1e14e7a57d786ff4c57bbdacd9f

The transforms are to be used in a direct import, and are available only from the REST API
(they would not be the first REST-only feature, there are others already).

Say you want to upload a tiff file and then retile/add overviews to it, you’d use the following commands:

  1. Create import:

curl -u admin:geoserver -XPOST -H "Content-type: application/json" -d @import.json "http://localhost:8080/geoserver/rest/imports"

Where import.json is:

{
  "import": {
    "targetWorkspace": {
      "workspace": {
        "name": "sf"
      }
    }
  }
}

  2. Upload the tiff file:

curl -u admin:geoserver -F name=test -F filedata=@/home/aaime/devel/gisData/rain.tif "http://localhost:8080/geoserver/rest/imports/0/tasks"

  3. Append the transformations:

curl -u admin:geoserver -XPOST -H "Content-type: application/json" -d @gtx.json "http://localhost:8080/geoserver/rest/imports/0/tasks/0/transforms"
curl -u admin:geoserver -XPOST -H "Content-type: application/json" -d @gad.json "http://localhost:8080/geoserver/rest/imports/0/tasks/0/transforms"

Where gtx.json is:

{
  "type": "GdalTranslateTransform",
  "options": [ "-co", "TILED=YES", "-co", "BLOCKXSIZE=512", "-co", "BLOCKYSIZE=512"]
}

and gad.json is:

{
  "type": "GdalAddoTransform",
  "options": [ "-r", "average"],
  "levels" : [2, 4, 8, 16]
}

  4. Run the import:

curl -u admin:geoserver -XPOST "http://localhost:8080/geoserver/rest/imports/0"
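
For Java clients, the "append the transformations" call could be sketched with the JDK 11+ `java.net.http` API as below. This is only an illustration: `TransformRequests` and its method are hypothetical names, and only the endpoint path, content type and credentials style are taken from the curl example:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical client-side helper mirroring the curl call that appends a transform. */
final class TransformRequests {

    static HttpRequest appendTransform(String baseUrl, int importId, int taskId,
                                       String json, String user, String password) {
        // Same basic authentication curl sends for -u user:password
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        URI uri = URI.create(baseUrl + "/rest/imports/" + importId
                + "/tasks/" + taskId + "/transforms");
        return HttpRequest.newBuilder(uri)
                .header("Content-type", "application/json")
                .header("Authorization", "Basic " + credentials)
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, passing the contents of gtx.json or gad.json as the `json` argument.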

If people have no objections, I’ll use the above example in the docs, and turn the GitHub link above into a pull request.

Not sure if you have feedback on the other two topics in my first mail in this thread (having the server fetch files from
shares, indirect raster import). If I don’t hear anything I’ll pick my choice next week and move on to implement those
as well.

Cheers
Andrea

On Fri, Apr 24, 2015 at 4:07 PM, Andrea Aime <andrea.aime@anonymised.com>
wrote:

> Hi all,
> following up on this topic, I've implemented support for gdal_translate and gdaladdo in
> the importer, generalizing the transformation chains a bit, and providing support for other
> command line invocations, should we want to add more in the future.

Oh, forgot one thing, currently the code assumes gdal_translate/gdaladdo
are in the path.
Fair assumption for Linux, may not be so for Windows.

I was thinking of where to configure the location of the commands, the importer
currently has no configuration at all.
Shall we go for the usual property file? $DATA_DIR/importer/gdal.properties?
I was also thinking of a $DATA_DIR/importer.xml, based on an ImporterInfo bean,
which could have a <gdal> section inside.
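
A minimal sketch of the property file option, assuming one `<command>.path` key per executable and a fallback to the bare command name (i.e. a PATH lookup) when the key is missing. The key names and the `GdalConfig` class are assumptions, nothing like this exists in the importer yet:

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

/** Hypothetical reader for a gdal.properties file; the key names are assumptions. */
final class GdalConfig {
    private final Properties props = new Properties();

    GdalConfig(Reader source) throws IOException {
        props.load(source);
    }

    /** Returns the configured executable, or the bare command name to be resolved on the PATH. */
    String executable(String command) {
        return props.getProperty(command + ".path", command);
    }
}
```

On Windows the paths in such a file would need forward slashes or doubled backslashes, since java.util.Properties treats a single backslash as an escape character.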

Cheers
Andrea


-------------------------------------------------------

Hi all,
pull request adding gdaladdo/gdalwarp/gdal_translate here:

https://github.com/geoserver/geoserver/pull/1028

I did not get feedback on the copy thing. Since it’s been out a week,
I’m assuming that at least the proposed idea is not outrageously wrong,
so I’m going to implement it as proposed (a copy:true attribute in the
source data when we want GeoServer to copy the data locally by
itself instead of having to upload it or ingest it directly from the
source position).

Cheers
Andrea
