[Geoserver-devel] FilePPIO implementation - enable WPS to return any kind of file

According to the general concept of WPS, and I believe the WPS specification as well, WPS should be able to return raw data.

The most generic raw data would be an octet array (a binary object in memory) or a file (a binary object on the file system) containing custom or proprietary data structures. Such a file would have the “application/octet-stream” MIME type by default.

The MIME type and extension should be set when the object is created, according to the actual file structure and content (for example .doc, .xls or .pdf). At the moment only particular file formats can be returned (such as png, jpeg, tiff, arcgrid). The GeoTiffPPIO implementation is rather specific: it inspects the file structure, checking the CoordinateReferenceSystem or the file extension. ImagePPIO uses the com.sun.media.jai.codecimpl decoders and encoders. From my point of view such detailed inspection of the file structure, or a file-specific implementation, is not necessary and perhaps not even possible.

I would like to return a .pdf file as a WPS output. The expected file size is around 5 MB. The file is created with the JasperReports library and I have no insight into its structure; basically I call “pdfBytes = JasperRunManager.runReportToPdf(jasperReport, parameters, datasource);” and get a byte array representing the PDF file (there is no need to write it to the file system). In general JasperReports can produce many different output formats (such as HTML, Excel, OpenOffice and Word), and I don’t see any reason why these file types shouldn’t be supported as WPS output.

The abstract class BinaryPPIO.java already exists (package org.geoserver.wps.ppio); only an implementation for the “generic” file is missing. I suggest implementing FilePPIO and a ByteArray class that would allow any file to be served as a WPS output. A Java array (including a byte array) can hold at most Integer.MAX_VALUE (2^31-1) elements, which allows files of up to about 2 GB. Since this could be overkill for GeoServer, an arbitrary limit can be set on the file size (say 15 MB; Apache Commons IO has BoundedInputStream(InputStream, long)). The file type can either be determined when the FilePPIO is created (by passing the MIME type and the appropriate file extension), or FilePPIO can be an abstract class with nested subclasses for PDF, Word, Excel and similar (following the ImagePPIO example).
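To make the first option concrete, here is a minimal sketch using only the JDK. It is not GeoServer code: the class name BoundedRawEncoder, its constructor and its encode method are hypothetical, and a real implementation would extend BinaryPPIO instead. It only illustrates passing the MIME type and extension at creation time and enforcing a size limit while copying.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GeoServer: it carries the MIME type and
// file extension chosen at construction time and copies raw bytes up to a limit.
final class BoundedRawEncoder {
    private final String mimeType;
    private final String fileExtension;
    private final long maxBytes;

    BoundedRawEncoder(String mimeType, String fileExtension, long maxBytes) {
        this.mimeType = mimeType;
        this.fileExtension = fileExtension;
        this.maxBytes = maxBytes;
    }

    String getMimeType() { return mimeType; }

    String getFileExtension() { return fileExtension; }

    // Copies the input to the output without inspecting the content.
    // Throws IOException as soon as more than maxBytes have been read.
    long encode(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("Output exceeds limit of " + maxBytes + " bytes");
            }
            out.write(buf, 0, n);
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // 15 MB limit, as suggested above; the payload is a stand-in, not a real PDF.
        BoundedRawEncoder pdf = new BoundedRawEncoder("application/pdf", "pdf", 15L * 1024 * 1024);
        byte[] payload = "%PDF-stand-in".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long written = pdf.encode(new ByteArrayInputStream(payload), out);
        System.out.println(pdf.getMimeType() + " (." + pdf.getFileExtension() + "): " + written + " bytes");
    }
}
```

Note that in this sketch earlier chunks may already have been written when the limit is hit, so checking the size before writing (File.length() or the byte array length) would be safer where it is known up front.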

Since there is no content inspection, every file or byte array is transferred in the same way. There are different approaches, from a manual copy from the source to the output stream to third-party utilities such as org.apache.commons.io.IOUtils (already used in GeoTiffPPIO) or sun.misc.IOUtils. I am not really sure which is the most efficient.

Here are a few possible encoder implementations.

    // Variant 1: java.nio.file.Files (Java 7+) copies the file straight to the stream.
    public void encodeFile(Object file, OutputStream os) throws Exception {
        Files.copy(((File) file).toPath(), os);
    }

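    // Variant 2: manual copy through an 8 KB buffer.
    public void encodeFile(Object file, OutputStream os) throws Exception {
        byte[] buf = new byte[8192];
        // try-with-resources closes the input stream even if a write fails
        try (InputStream is = new FileInputStream((File) file)) {
            int c;
            while ((c = is.read(buf, 0, buf.length)) != -1) {
                os.write(buf, 0, c);
            }
            os.flush();
        }
        // os is deliberately not closed here
    }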

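    // Variant 3: org.apache.commons.io.IOUtils does the buffered copy.
    public void encodeFile2(Object file, OutputStream os) throws Exception {
        // IOUtils.copy does not close either stream, so close the input here
        try (InputStream is = new FileInputStream((File) file)) {
            IOUtils.copy(is, os);
        }
    }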
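For a result that already lives in memory, such as the byte array returned by runReportToPdf, no file or intermediate stream is needed at all. A possible sketch follows; the class ByteArrayEncodeSketch and the method encodeBytes are hypothetical names, and the payload in main is only a stand-in for real report output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not GeoServer code: encodes an in-memory byte array.
final class ByteArrayEncodeSketch {

    // Writes the whole array to the output stream in one call.
    // The stream is flushed but not closed here.
    static void encodeBytes(Object value, OutputStream os) throws IOException {
        if (!(value instanceof byte[])) {
            throw new IllegalArgumentException("Expected a byte[] value");
        }
        os.write((byte[]) value);
        os.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes a report library would return.
        byte[] pdfBytes = "%PDF-stand-in".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        encodeBytes(pdfBytes, out);
        System.out.println(out.size() + " bytes written");
    }
}
```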

Regards, Krunoslav