[Geoserver-users] Large NetCDF file errors

Hi all,

I'm trying to use GeoServer with large NetCDF files and have been encountering some issues.

When working with large rasters, indexing takes a long time and causes the front end and REST API to become unresponsive. Some of the climate data I work with is ~500 x 1000 x 55000 values x 32 bits x 3 variables (150 years of downscaled daily data) = ~330GB per file. After ~3 hours of indexing, however, it stops processing and errors out:

20 Jul 15:38:16 WARN [netcdf.NetCDFFormat] - Unable to connect
org.geotools.data.DataSourceException: Unable to connect
Caused by: org.geotools.data.DataSourceException: java.io.IOException: Error occured on rollback
Caused by: java.lang.RuntimeException: java.io.IOException: Error occured on rollback
Caused by: java.io.IOException: Error occured on rollback
Caused by: org.h2.jdbc.JdbcSQLException: IO Exception: java.io.IOException: Negative seek offset; /var/lib/tomcat7/webapps-available/geoserver2711/geoserver_data/climate-data/.pr+tasmax+tasmin_day_BCCAQ+ANUSPLIN300+ACCESS1-0_historical+rcp45_r1i1p1_19500101-21001231_57daab961b0813f95a47bf4e3312d0f79cde766b/pr+tasmax+tasmin_day_BCCAQ+ANUSPLIN300+ACCESS1-0_historical+rcp45_r1i1p1_19500101-21001231.7212500018817963839.temp.db; SQL statement:
ROLLBACK [90031-119]
Caused by: java.io.IOException: Negative seek offset

20 Jul 15:38:16 INFO [geoserver.web] - Getting list of coverages for saved store /var/lib/tomcat7/webapps-available/geoserver2711/geoserver_data/climate-data/pr+tasmax+tasmin_day_BCCAQ+ANUSPLIN300+ACCESS1-0_historical+rcp45_r1i1p1_19500101-21001231.nc
java.lang.RuntimeException: Could not list layers for this store, an error occurred retrieving them: Failed to create reader from /var/lib/tomcat7/webapps-available/geoserver2711/geoserver_data/climate-data/pr+tasmax+tasmin_day_BCCAQ+ANUSPLIN300+ACCESS1-0_historical+rcp45_r1i1p1_19500101-21001231.nc and hints null
Caused by: java.io.IOException: Failed to create reader from /var/lib/tomcat7/webapps-available/geoserver2711/geoserver_data/climate-data/pr+tasmax+tasmin_day_BCCAQ+ANUSPLIN300+ACCESS1-0_historical+rcp45_r1i1p1_19500101-21001231.nc and hints null

At that point it had created 78 <fname>.<fnum>.log.db files of 33MB each, except for the last one (fnum=78) at 26MB, and a single ~2.3GB *.temp.db file.
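
A quick check of those on-disk sizes, as a sketch only. It assumes the reported sizes are decimal megabytes and gigabytes and that 77 of the 78 log files are a full 33MB. The comparison with the signed 32-bit limit is a guess at how a seek offset could end up negative once a file passes 2GiB; nothing in this thread confirms that this is what happens inside H2. The helper name wrap_int32 is made up for illustration.

```python
# Sizes as reported in the message (assumed to be decimal units).
MB = 10**6
GB = 10**9

log_total = 77 * 33 * MB + 26 * MB   # 77 full log files plus the smaller last one
temp_db = 2300 * MB                  # approximate size of the *.temp.db file

INT32_MAX = 2**31 - 1                # largest value a signed 32-bit integer can hold


def wrap_int32(n):
    """Reinterpret n as a signed 32-bit integer (two's complement wrap)."""
    return (n + 2**31) % 2**32 - 2**31


print(log_total / GB)            # combined size of the log files in GB
print(temp_db > INT32_MAX)       # True: the file is past what a 32-bit offset can address
print(wrap_int32(temp_db) < 0)   # True: an offset that large wraps to a negative number
```

If an offset into the ~2.3GB file were held in a 32-bit integer anywhere along the way, it would wrap to a negative value like this, which would at least be consistent with the "Negative seek offset" message.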

I have measured read speeds on this source and they run ~350MB/s. At that rate, a single thread should be able to read (and index) the entire file in roughly 15 minutes.
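
The size and read-time figures follow from simple arithmetic on the numbers in this message. A minimal sketch, assuming 4-byte values, decimal megabytes for the read rate, and a purely sequential read with no indexing overhead:

```python
# Figures taken from the message; the grid dimensions are approximate.
nx, ny, nt = 500, 1000, 55000   # grid cells in x and y, number of daily time steps
bytes_per_value = 4             # 32-bit values
n_variables = 3                 # pr, tasmax, tasmin

total_bytes = nx * ny * nt * bytes_per_value * n_variables
read_rate = 350 * 10**6         # measured ~350MB/s, assumed decimal megabytes

seconds = total_bytes / read_rate
print(total_bytes / 10**9)      # file size in GB
print(seconds / 60)             # sequential read time in minutes
```

This works out to about 330GB and a little under 16 minutes, so a 3-hour indexing run is roughly an order of magnitude slower than the raw read speed would suggest.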

Does anyone else have experience using GeoServer with large NetCDF files?

GeoServer 2.7.1.1
Tomcat 7.0.42
Java 1.7.0_65-b32

NetCDF plugin from:
http://ares.opengeo.org/geoserver/2.7.x/community-latest/

Thanks,
Basil

--
Basil Veerman
Web Application Developer
Pacific Climate Impacts Consortium
http://www.pacificclimate.org/
Tel: (250) 721-6395

Dear Basil,
With the work we are doing right now, we will be able to keep the internal
index in PostGIS rather than in H2; this could help.

Can you share one of these files with us off-list so we can test with it?

Regards,
Simone Giannecchini

Ing. Simone Giannecchini
@simogeo
Founder/Director

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 333 8128928

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


On Wed, Jul 22, 2015 at 8:28 PM, Basil Veerman <bveerman@anonymised.com> wrote:

[snip]

------------------------------------------------------------------------------
_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users