[Geoserver-devel] Chipping away at H2 v 1.x dependencies

Hi all,
as you probably know, GeoServer has a core dependency on H2 database version 1.1.119.
This is slowly becoming troublesome for a few reasons:

  • The 1.x series of H2 is abandoned
  • It’s accumulating CVEs. None of them seems to affect our current usage, but they still show up in all dependency reports.

So it would be a good idea to upgrade. H2 version 2 is supported, but it poses serious upgrade problems: the format of the database changed, and H2 2.x cannot read 1.x databases. The only way to upgrade is to dump SQL using the older version, and then restore with the newer one.
In addition, H2 retained the same package names, meaning we cannot have both in the same classpath (unless we use shading).
To make things worse, H2 1.x did not have spatial data types, so we have WKB stored in the database as binary columns, with spatial indexing provided by extra libraries that also happen to be dead (hatbox, geodb).
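
As an illustration of that dump-and-restore path, here is a minimal Java sketch, not a tested migration tool. It assumes H2's command-line tools org.h2.tools.Script (dump) and org.h2.tools.RunScript (restore), and runs each in its own JVM so that the 1.x and 2.x jars never share a classpath. The jar names, URLs and file names are hypothetical placeholders, and the generated script may still need manual edits before 2.x accepts it.

```java
import java.io.IOException;
import java.util.List;

/** Sketch: build (and optionally run) the two JVM commands for an H2 1.x to 2.x dump and restore. */
final class H2DumpRestore {

    // Hypothetical jar names: point these at the actual H2 jars available locally.
    static final String OLD_H2_JAR = "h2-1.x.jar";
    static final String NEW_H2_JAR = "h2-2.x.jar";

    /** Command that dumps the old database to a SQL script, using only the 1.x jar. */
    static List<String> dumpCommand(String oldUrl, String user, String scriptFile) {
        return List.of("java", "-cp", OLD_H2_JAR, "org.h2.tools.Script",
                "-url", oldUrl, "-user", user, "-script", scriptFile);
    }

    /** Command that replays the SQL script into a new database, using only the 2.x jar. */
    static List<String> restoreCommand(String newUrl, String user, String scriptFile) {
        return List.of("java", "-cp", NEW_H2_JAR, "org.h2.tools.RunScript",
                "-url", newUrl, "-user", user, "-script", scriptFile);
    }

    /** Runs one command in a separate JVM and fails on a non-zero exit code. */
    static void run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IllegalStateException("Command failed (" + exitCode + "): " + command);
        }
    }

    public static void main(String[] args) {
        // Hypothetical database locations, printed rather than executed.
        List<String> dump = dumpCommand("jdbc:h2:file:/data/old_db", "sa", "dump.sql");
        List<String> restore = restoreCommand("jdbc:h2:file:/data/new_db", "sa", "dump.sql");
        System.out.println(String.join(" ", dump));
        System.out.println(String.join(" ", restore));
    }
}
```

Calling run(...) on the dump command first and the restore command second would perform the actual migration; main only prints the commands so they can be reviewed beforehand.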

At GeoSolutions we have been trying to organize an upgrade for a while, but it turns out it’s too much work to be done in one shot (clear funding issue).

However, there is an easier path… chipping away at it a few dependencies at a time. It turns out some of the dependencies on H2 are in the core, and others are in plugins. The ones that really need a spatial database are just in plugins, while the core dependencies need only an alphanumeric database:

  • GWC disk quota mechanism
  • KML superoverlay support (is anyone still using it? :grin:)

In both cases the databases collect caches of information that can be rebuilt automatically by GeoServer as needed, so no actual migration procedure is needed: we can drop the old database, and start over with a new one.
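
The "drop and start over" step could look roughly like the following. This is a minimal sketch using only java.nio; the directory names are hypothetical, since the actual storage locations used by GWC and the KML code are not stated in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Sketch: discard a rebuildable cache database and prepare a location for its replacement. */
final class CacheDatabaseReset {

    /** Deletes the old database directory recursively; returns false if it did not exist. */
    static boolean dropOldDatabase(Path oldDbDir) throws IOException {
        if (!Files.exists(oldDbDir)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(oldDbDir)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return true;
    }

    /** Ensures a directory exists for the new embedded database. */
    static Path prepareNewDatabase(Path newDbDir) throws IOException {
        return Files.createDirectories(newDbDir);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout under a temporary directory, for demonstration only.
        Path root = Files.createTempDirectory("cache-reset-demo");
        Path oldDir = Files.createDirectories(root.resolve("cache_h2"));
        Files.createFile(oldDir.resolve("cache.db"));
        System.out.println("old database dropped: " + dropOldDatabase(oldDir));
        System.out.println("new database dir: " + prepareNewDatabase(root.resolve("cache_hsql")));
    }
}
```

Because the content is a cache, nothing is copied across: the new database simply starts empty and is filled again as needed.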

Disk quota over H2 is not recommended for production environments, but still, to keep our “ease of use” story going, having an embedded database would be a good thing.

So what we propose is easy: for these two use cases we switch the embedded database usage to HSQL, which has been modestly, but steadily, serving our CRS database needs for years, without causing trouble. It’s already in our classpath, it has been used for a long time, it’s pure Java, and it’s small (1.6MB).

Alternatives considered and discarded for the task at hand:

  • Sqlite/Geopackage: the sqlite jdbc library is a beast (12.2 MB and growing) and not part of the GWC dependencies.
  • H2 v 2.x, because it would mean shading it and right now we’d have to either shade all other usages of 1.x (can’t cover that) or shade 2.x (which is the opposite of the direction we’re going).

Removing the need for H2 in the core will also allow the possibility of running GeoServer along with the H2GIS data store (that already depends on H2 v 2.x). The other places where we need an embedded spatial database may be covered, in time, by GeoPackage, until we can wave goodbye to the last usage of H2 v 1.x, but they will be doable one by one and at their own pace: GeoFence, NetCDF store.

Other non-spatial cases that are in plugins, which might be migrated later to either GeoPackage or HSQLDB, are wps-jdbc and importer-jdbc (both depending on the DataStore interface) and jdbcstore/jdbcconfig.

One thing that will eventually have to go is the H2 datastore we offer as an extension. I don’t think it has any traction, but we use it for some tests. I’d suggest we start removing it from the downloads, though.

Feedback/opinions? We’d like to start migrating the GWC and KML subsystems as soon as possible.

And oh, I started a mail discussion rather than writing a proposal because there is one bit in here that is totally against the proposal requirements: having a fully funded plan. What we can offer is a vision/direction and a first stepping stone, but we won’t be able to cover the full elimination of H2 v1.x (as said, too much work to fund it in one shot, and a small audience of interested parties).

Cheers
Andrea

···

GeoServer Professional Services from the experts!

Visit http://bit.ly/gs-services-us for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions Group
phone: +39 0584 962313

fax: +39 0584 1660272

mob: +39 339 8844549

https://www.geosolutionsgroup.com/

http://twitter.com/geosolutions_it


With reference to the legislation on the processing of personal data (EU Reg. 2016/679 - General Data Protection Regulation “GDPR”), please note that every circumstance relating to this email (its content, any attachments, etc.) is information reserved solely for the recipient(s) indicated by the sender. If you have received this message in error, you are required to delete it; any other operation is unlawful. I would nevertheless be grateful if you could notify me.

This email is intended only for the person or entity to which it is addressed and may contain information that is privileged, confidential or otherwise protected from disclosure. We remind that - as provided by European Regulation 2016/679 “GDPR” - copying, dissemination or use of this e-mail or the information herein by anyone other than the intended recipient is prohibited. If you have received this email by mistake, please notify us immediately by telephone or e-mail

All this sounds great. My only passing thought re funding is that this is probably a better use of our OSGeo funding than removing opengis, but I don’t know if we could swap out the end goal of that funding. Is this something we could do in a weekend code sprint in, say, Kosovo?

Ian

···

Ian Turton

Eh… no. By my current estimates, opening a transparent/automated upgrade path, the type that does not involve users in complex dump
and restore operations, would take a few weeks. That’s why I’m pushing for doing things in steps: it’s easier to fund
work one week at a time, for a specific topic, rather than trying to do everything in one shot, with a counterpart that is
interested in only 20-30% of what needs to be done (e.g., NetCDF? But I don’t use it! Or… GeoFence… same…).

Besides… getting rid of the periodic complaints about org.opengis is… priceless :joy:. Don’t know about it,
but personally I’ve had enough for a whole lifetime.

Cheers
Andrea

I like the approach, makes sense. It is a bit ironic moving from H2 → HSQL but I understand why.

As for shading, since H2 1.x is abandoned… can we just grab the code and compile it with a slightly different package name? Or are the package names baked into the data format via serialization or something?

···


Jody Garnett

Ian:

The remove opengis funding is set up as a cross-project thing (not specifically allocated to a project budget). That said, if it does not meet its funding target, or if we cannot attract participants even with money, there is no obligation to proceed.

I know my own employer does not see the value (so I will not be working on the activity even with funding…). I may still help out as a volunteer; if only so Andrea can enjoy life.

···


Jody Garnett

On Mon, 15 May 2023, 18:31 Jody Garnett, <jody.garnett@anonymised.com> wrote:

Ian:

The remove opengis funding is set up as a cross-project thing (not specifically allocated to a project budget). That said, if it does not meet its funding target, or if we cannot attract participants even with money, there is no obligation to proceed.

I know my own employer does not see the value (so I will not be working on the activity even with funding…). I may still help out as a volunteer; if only so Andrea can enjoy life.

I’m currently negotiating taking 2 weeks' leave for the Bolsena sprint, so if we’re not going ahead with that, please let me know sooner rather than later, so I can plan to take Lesley somewhere sunny instead.

Ian


On Mon, May 15, 2023 at 4:03 AM Ian Turton <ijturton@anonymised.com> wrote:

On Mon, 15 May 2023 at 11:32, Andrea Aime <andrea.aime@anonymised.com> wrote:

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Ian Turton



Hi Jody,
shading 1.x or cloning the code would make little difference (I’d rather not have a full source code clone though).
We do have places where the JDBC URLs are part of the configuration (disk quota is one, GeoFence has another),
but that’s not the main problem: the real goal is not having H2 v 1.x in the classpath at all.

Silly dependency checkers might be fooled, but given the potential threat, it’s best to actually see it gone fully if possible,
or at worst, use it only for migration purposes, but not have any trace of usage from normal code paths. In other
words, just eliminate the potential for vulnerabilities to be leveraged.

Cheers
Andrea

I’m planning to be there, but for one week. You can use the other week to visit the area, I have some suggestions that I believe you might enjoy
(I’ve spent a week in the area around Bolsena and my family loved it). Let’s talk privately about it :smiley:

Cheers
Andrea
