[Geoserver-devel] Storage for JDBC backed ResourceStore

I’ve been considering how to store the content of resources for a JDBC-backed implementation of the new ResourceStore API.

It comes down to BLOB vs BINARY/bytea. On the database side, both H2 and Postgres support in-table types that are large enough that BLOBs should not be needed, and BLOBs add complexity. However, the drivers for both DBMSs (at least the versions we’re using) require loading the entire content into memory at once rather than streaming it. The content should only be held in memory briefly before it is cached to disk, and most individual resources are fairly small.

Some styling resources like large TTFs or images might go as large as a few megabytes, but the goal is not to store any geodata.

Kevin Smith

Junior Software Engineer | Boundless

ksmith@anonymised.com

+1-778-785-7459

@boundlessgeo
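
To illustrate the in-table option described above, here is a minimal sketch, not the actual GeoServer code. It assumes a hypothetical `resources` table with a `path` key and a `content` column of type BINARY (H2) or bytea (Postgres); the class and method names are made up for illustration. `ResultSet.getBytes` materialises the whole value in memory, and the array is then written straight to a disk cache.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical sketch: in-table (BINARY/bytea) storage read through to a disk cache. */
class InTableResourceSketch {

    // Assumed schema: resources(path VARCHAR PRIMARY KEY, content BINARY or bytea)
    static final String SELECT_CONTENT = "SELECT content FROM resources WHERE path = ?";

    /** Loads the whole column value into memory; returns null if there is no such row. */
    static byte[] loadContent(Connection conn, String path) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SELECT_CONTENT)) {
            ps.setString(1, path);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes(1) : null;
            }
        }
    }

    /** Writes the bytes to a file under cacheDir so the array can be released afterwards. */
    static Path cacheToDisk(Path cacheDir, String name, byte[] content) {
        try {
            Path target = cacheDir.resolve(name);
            // Resource names may contain directories, e.g. "styles/font.ttf"
            Files.createDirectories(target.getParent());
            return Files.write(target, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With resources of at most a few megabytes, the byte array only lives for the duration of the two calls; later reads can be served from the cached file.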

Hi Kevin,
for the files we have today I agree this should not be a concern… in the future, nobody
knows what new modules might need.
I'm a bit concerned about the table size limits: Oracle might have issues with creating a table
that is too large without falling back on blobs (which are often stored outside of the table record).

Why can’t we use a blob and provide a stream? Just because it’s more complex?
Or would we not be able to use streams with blobs regardless?

Cheers
Andrea
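
For comparison, the BLOB route asked about here could look roughly like the following at the JDBC level. This is a sketch under assumptions: the class and method names are hypothetical, and whether `Blob.getBinaryStream` really streams or buffers the whole value internally depends on the driver and its version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.sql.Blob;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical sketch: copy a BLOB column to a cache file without building a byte[]. */
class BlobStreamSketch {

    /** Copies the stream to target, closes it, and returns the number of bytes written. */
    static long streamToFile(InputStream in, Path target) {
        try (InputStream src = in) {
            return Files.copy(src, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Streams the BLOB in the given column of the current row; returns -1 for SQL NULL. */
    static long cacheBlob(ResultSet rs, int column, Path target) throws SQLException {
        Blob blob = rs.getBlob(column);
        if (blob == null) {
            return -1;
        }
        try {
            return streamToFile(blob.getBinaryStream(), target);
        } finally {
            blob.free(); // release any driver-side resources held by the Blob
        }
    }
}
```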

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

==
Meet us at GEO Business 2014! in London! Visit http://goo.gl/fES3aK
for more information.

Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Complexity is one thing, and I also ran into some trouble getting blobs to stream using H2, at least with the old version of H2 we’re using. That was one of the reasons I was interested in updating H2 earlier.

At worst that could be handled as a dialect-specific thing, or I might build it so that the choice is part of the dialect support class even if I go with in-table storage for my own H2 and Postgres implementations. A BLOB-based Postgres or Oracle implementation could then be added later.
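
One way the dialect idea could be shaped is sketched below. All type names (`ContentDialect`, `InTableDialect`, and so on) and the `resources` table are hypothetical, not part of the ResourceStore API; the sketch only shows how in-table and BLOB storage could sit behind the same interface so that a BLOB-based dialect can be added later.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.sql.Blob;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical dialect hook: each database decides how resource content is stored. */
interface ContentDialect {
    /** SQL type used for the content column. */
    String contentColumnType();

    /** Binds the content to a statement parameter. */
    void bindContent(PreparedStatement ps, int index, byte[] content) throws SQLException;

    /** Opens the content column of the current row; null for SQL NULL. */
    InputStream openContent(ResultSet rs, int index) throws SQLException;
}

/** In-table storage: the value is fully loaded into memory as a byte[]. */
abstract class InTableDialect implements ContentDialect {
    @Override
    public void bindContent(PreparedStatement ps, int index, byte[] content) throws SQLException {
        ps.setBytes(index, content);
    }

    @Override
    public InputStream openContent(ResultSet rs, int index) throws SQLException {
        byte[] bytes = rs.getBytes(index);
        return bytes == null ? null : new ByteArrayInputStream(bytes);
    }
}

class H2InTableDialect extends InTableDialect {
    @Override
    public String contentColumnType() { return "BINARY"; } // type name as used in the thread
}

class PostgresInTableDialect extends InTableDialect {
    @Override
    public String contentColumnType() { return "bytea"; }
}

/** BLOB storage that could be added later, for example for Oracle. */
class OracleBlobDialect implements ContentDialect {
    @Override
    public String contentColumnType() { return "BLOB"; }

    @Override
    public void bindContent(PreparedStatement ps, int index, byte[] content) throws SQLException {
        ps.setBinaryStream(index, new ByteArrayInputStream(content), content.length);
    }

    @Override
    public InputStream openContent(ResultSet rs, int index) throws SQLException {
        Blob blob = rs.getBlob(index);
        return blob == null ? null : blob.getBinaryStream();
    }
}

class DialectSketch {
    /** Builds the DDL for the assumed resources table using the dialect's column type. */
    static String createTableSql(ContentDialect dialect) {
        return "CREATE TABLE resources (path VARCHAR(255) PRIMARY KEY, content "
                + dialect.contentColumnType() + ")";
    }
}
```

The rest of the store would only talk to `ContentDialect`, so swapping in-table storage for BLOBs becomes a per-database decision rather than a change to the store itself.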

On Wed, Apr 30, 2014 at 7:15 PM, Kevin Smith <ksmith@anonymised.com> wrote:

Complexity is one thing, and I ran into some trouble getting blobs to
stream using H2, at least the old version of H2 we're using. That was one
of the reasons I was interested in updating H2 earlier.

At worst that could be handled as a dialect specific thing, or I might
build it so that the choice is there as part of the dialect support class
even if I go with in table storage for my own implementations of H2 and
Postgres. A BLOB based Postgres or Oracle implementation could then be
added later.

Sounds like a good idea

Cheers
Andrea


-------------------------------------------------------

Kevin, with the upgrade to Java 7 almost ready, are we in a position to upgrade H2 now?

Jody Garnett


On Thu, May 1, 2014 at 1:53 AM, Jody Garnett <jody.garnett@anonymised.com> wrote:

Kevin with the upgrade to Java 7 almost ready - are we in position to
upgrade h2 now?

The H2 upgrade is not blocked by the Java version. The reasons are:
* we need to upgrade geotools/geoserver and geowebcache all at the same time
* we need to ensure a working upgrade path for the H2 databases that people
already have on disk

There are threads discussing the upgrade issues in more detail; search the
archives.

Cheers
Andrea


-------------------------------------------------------