Is there a way to consolidate NetCDF files with multiple parameters into a single layer? We have requirements to display large amounts of model data from multiple sources. Currently, this involves one CoverageStore/Layer per unique data type. However, this results in thousands of layers, which causes GeoServer to boot very slowly. What we’d like to do is create one layer per model (GFS/NAM/etc) and use custom dimensions to specify the variable name. It doesn’t appear this is currently possible, but I figured I’d check here in case I was just missing something. For reference, our data comes in as a single variable and time per NetCDF file, so our database would need to specify the correct file and variable name.
Alternatively, if there isn’t any way to combine the layers, is there any way to optimize the boot time? We currently have 5000+ layers and our last boot took 5673031 ms (~90 minutes). If anyone has any optimization tips or tricks I would greatly appreciate it.
On Thu, Jun 19, 2014 at 1:14 AM, Weiss, Kevin <kweiss01@anonymised.com> wrote:
> Hello all,
> Is there a way to consolidate netcdf files with multiple parameters into a
> single layer? We have requirements to display large amounts of model data
> from multiple sources. Currently, this involves one CoverageStore/Layer
> per unique data type. However this results in thousands of layers which
> causes the GeoServer boot very slowly. What we’d like to do is create one
> layer per model (GFS/NAM/etc) and use custom dimensions to specify the
> variable name. It doesn’t appear this is currently possible, but I figured
> I’d check here if I was just missing something. For reference, our data
> comes in as a single variable and time per netcdf file so our database
> would need to specify the correct file and variable name.
There are a few "nD" solutions in the codebase (they use a database to
index which file on disk to read as the requested time or elevation changes).
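
To make the idea of a database index concrete, here is a minimal sketch using Python's built-in sqlite3. The table name, column names, file names, and the find_granule helper are all hypothetical and chosen for illustration; they are not the schema GeoServer actually uses.

```python
import sqlite3

# Hypothetical index: one row per NetCDF file ("granule").
# Table and column names are illustrative, not GeoServer's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE granules (location TEXT, time TEXT, elevation REAL)")
conn.executemany(
    "INSERT INTO granules VALUES (?, ?, ?)",
    [
        ("gfs_temp_20140618T00_850.nc", "2014-06-18T00:00:00Z", 850.0),
        ("gfs_temp_20140618T06_850.nc", "2014-06-18T06:00:00Z", 850.0),
        ("gfs_temp_20140618T00_500.nc", "2014-06-18T00:00:00Z", 500.0),
    ],
)


def find_granule(conn, time, elevation):
    """Return the file indexed for the requested time and elevation, or None."""
    row = conn.execute(
        "SELECT location FROM granules WHERE time = ? AND elevation = ?",
        (time, elevation),
    ).fetchone()
    return row[0] if row else None
```

A request for a given time and elevation then resolves to a single file with find_granule(conn, "2014-06-18T06:00:00Z", 850.0). Adding a variable column to the same table would be the natural extension for the use case in the question, assuming the reader can be taught to filter on it.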
> Alternatively, if there isn’t any way to combine the layers, is there any
> way to optimize the boot time? We currently have 5000+ layers and our last
> boot took 5673031 ms (~90 minutes). If anyone has any optimization tips or
> tricks I would greatly appreciate it.
Kevin Smith has been working on optimising the JDBC catalog implementation
for large numbers of layers, similar to what you are working with. The
"catalog" is what GeoServer uses to store all the layer names and bounds.
You could ask Kevin if you can help by testing a nightly build. You may
also wish to look at some of the support <http://geoserver.org/support/>
options available.
On Wed, Jun 18, 2014 at 5:14 PM, Weiss, Kevin <kweiss01@anonymised.com> wrote:
> Hello all,
> Is there a way to consolidate netcdf files with multiple parameters into a
> single layer? We have requirements to display large amounts of model data
> from multiple sources. Currently, this involves one CoverageStore/Layer
> per unique data type. However this results in thousands of layers which
> causes the GeoServer boot very slowly. What we’d like to do is create one
> layer per model (GFS/NAM/etc) and use custom dimensions to specify the
> variable name. It doesn’t appear this is currently possible, but I figured
> I’d check here if I was just missing something. For reference, our data
> comes in as a single variable and time per netcdf file so our database
> would need to specify the correct file and variable name.
Right now, as you say, variables are shown as sub-layers. By modifying the
imagemosaic code base (not the jdbc one, which is not multidimensional
capable afaik), I believe it might be possible to turn these variables into
a dimension, assuming everything else is consistent (data type, resolution).
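
The consistency condition can be sketched as a pre-check run before treating variables as values of one dimension. This is only an illustration of the constraint described above: the Granule class, its fields, and variables_as_dimension are hypothetical names and do not correspond to anything in the imagemosaic code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Granule:
    """Hypothetical description of one NetCDF file."""
    model: str         # e.g. "GFS" or "NAM"
    variable: str      # variable name stored in the file
    dtype: str         # data type of the variable
    resolution: float  # grid spacing, in degrees


def variables_as_dimension(granules):
    """Return {model: sorted variable names}.

    Raises ValueError if the granules of one model differ in
    data type or resolution, since they could not share a layer.
    """
    shapes = defaultdict(set)
    variables = defaultdict(set)
    for g in granules:
        shapes[g.model].add((g.dtype, g.resolution))
        variables[g.model].add(g.variable)
    for model, seen in shapes.items():
        if len(seen) > 1:
            raise ValueError(f"{model}: inconsistent granules {sorted(seen)}")
    return {model: sorted(names) for model, names in variables.items()}
```

Each model would then map to one layer whose variable dimension takes the returned names as its values.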
> Alternatively, if there isn’t any way to combine the layers, is there any
> way to optimize the boot time? We currently have 5000+ layers and our last
> boot took 5673031 ms (~90 minutes). If anyone has any optimization tips or
> tricks I would greatly appreciate it.
That's a long time. Was it the first boot? The NetCDF reader needs to build
a couple of index files to speed up subsequent access to the contents. If
those are missing, it does take some time to generate them, but that
should be a one-time process.
Cheers
Andrea
--
GeoServer Professional Services from the experts! Visit http://goo.gl/NWWaa2 for more information.
Ing. Andrea Aime @geowolf
Technical Lead
GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549
On Thu, Jun 19, 2014 at 5:23 AM, Jody Garnett <jody.garnett@anonymised.com>
wrote:
> Kevin has been working on optimising the JDBC catalog implementation for
> large number of layers similar to what you are working with. The "catalog "
> is what geoserver uses to store all the layer names and bounds.
One thing that I've been wondering about jdbc is what portion of the
speedup is due to not having to scan the file system and read xml files,
and what is due to not checking if feature types/coverages are valid
(something that we should do for the file system based loader too imho,
since now we have the option to have caps documents generated in spite
of misconfigured layers/stores).
Did anybody ever measure?
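
One rough way to isolate the first of those two costs is to time a bare scan-and-parse pass over the configuration files, outside GeoServer. The sketch below is an assumption about how such a measurement could be set up, not an existing tool; time_xml_scan is a hypothetical helper, and it deliberately does no validation of stores or layers.

```python
import os
import time
import xml.etree.ElementTree as ET


def time_xml_scan(data_dir):
    """Walk data_dir, parse every .xml file, and return (file_count, seconds).

    Measures only file system scanning and XML parsing; nothing is
    validated, so the result is a lower bound for the loader's work.
    """
    start = time.perf_counter()
    count = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(".xml"):
                ET.parse(os.path.join(root, name))
                count += 1
    return count, time.perf_counter() - start
```

Comparing that figure with the full boot time on the same data directory would give a first estimate of how much of the startup is spent on validity checks rather than on reading the files.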
Cheers
Andrea
Sorry for the delayed response, but I wanted to run an actual test on our server before I reported back. We are using the database to index .nc files based on time & elevation. I was hoping to combine parameters into that single store as well, but after testing the jdbcconfig plugin we may not need it.
After the initial data load (importing all 5000+ layers into the DB), our server now boots in approximately 2 minutes, which is a monumental improvement over the original 2 hours (and is going to look good on my year-end review). This allows us to continue our approach of one store/layer per unique data type. Thanks to Kevin Smith, as well as everyone else on the project, for their work!
Kevin M. Weiss
Software Engineer
HARRIS IT Services