[Geoserver-devel] GeoWebCache Config UI

Hello all,

During the last couple of weeks I've been working on configuring the
embedded GWC through the UI. It allows you to configure global caching
defaults, create and edit gridsets, add/remove the caching layers
associated with layers and layer groups, and edit the caching properties
on a per-layer basis, as well as to bulk-configure selected
layers/groups using the global defaults.

Please check the following screen shots to get a better sense of what I mean.

<http://skitch.com/groldan/g1san/01-geowebcache-settings>
<http://skitch.com/groldan/g1saj/02-administer-grid-sets>
<http://skitch.com/groldan/g1s2y/03-viewembdeddedgridset>
<http://skitch.com/groldan/g1s2d/04-create-new-grid-set>
<http://skitch.com/groldan/g1s2q/05-delete-grid-sets>
<http://skitch.com/groldan/g1s2w/06-cachedlayerspage>
<http://skitch.com/groldan/g1s2a/07-truncatewholelayer>
<http://skitch.com/groldan/g1s24/08-stopcachingselectedlayers>
<http://skitch.com/groldan/g1s29/09-bulkconfigcachedlayers>
<http://skitch.com/groldan/g1s3r/10-tilelayer-config-for-layerinfo>
<http://skitch.com/groldan/g1s3j/11-tilelayer-config-for-layergroup>

Now I need to commit that work to svn trunk. The plan is to commit to
trunk only and, when/if we get enough community testing and the work
feels solid enough, backport to 2.1.x. API-wise there would be no need
to change anything, and I actually have a build against the 2.1.x
branch that I mean to keep in sync until we're good to port to
2.1.x.

But before committing to trunk I want to ask for approval on a couple of things.
The first one is easy: it's just three small changes to
some core UI classes:

- Allow EnvelopePanel to be extended:
<https://github.com/groldan/geoserver/commit/d298b5d03028711829370cfcf013747185fe2921>
- Allow CRSPanel to be extended:
<https://github.com/groldan/geoserver/commit/ed118eca24debaa0ea760f49a1c7cf26b41be485>
- Allow setting the response page on the resource and layer group edit
pages: <https://github.com/groldan/geoserver/commit/abfc8a7e3f14f1b4aec82f0a7429404a308075d5>

The second one is about adding a new external dependency, a single 1M library:
- Add guava (google common libraries) dependency:
<https://github.com/groldan/geoserver/commit/c5197982ea20c741f2dd6afe9ddb0371e6550be4>

I've been working with this library for most of the year now
in other projects and it's an excellent, active and well supported
compendium of utilities for day-to-day work, similar in spirit to
Apache commons, but more up to date with modern Java concepts and
providing functionality missing from Apache commons-*.

Besides, I really expect it to stick with us and encourage you to take
a look at it and consider using it. Of special interest might be its
collection utilities together with functors, a fully configurable
memory cache, and a large number of io, net, and concurrency
utilities.

For instance, I'm using it for scalability reasons, in order to
provide tile layers out of the catalog layers and layer groups
dynamically, by means of a list wrapper and functor object that
creates tile layers out of internal layers on demand.

So, if you have any compelling reason not to include this dependency
on trunk right now please speak up soon. I'm also expecting to use it for
other catalog scalability work that's coming down the pipe and for
which I'll start some discussion topics on the list. But if it still
feels like we don't really want another 1M dependency right now, I
think I could get rid of it; it would just be weird to have to create
that kind of utility class myself instead of using a good existing
library.

Thanks in advance for any and all comments,

Gabriel
--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Wed, Dec 21, 2011 at 4:44 PM, Gabriel Roldan <groldan@anonymised.com> wrote:

> [...] But if it still
> feels like we don't really want another 1M dependency right now I
> think I could get rid of it, it would be just weird to have to create
> that kind of utility classes myself instead of using a good existing
> library.

Mind, on its current released version it's 1.5M. My mistake.

Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Awesome stuff Gabriel! Screen shots look awesome.

The changes to the core ui classes look ok to me, +1.

As for the new library… can you actually provide more context on how you need it / are using it? I am kind of against adding a new library dependency just for the sake of using it, since it overlaps with existing utility libraries. Now if it meets a need that is not being met then that is different. Again, more context will help.

-Justin

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Hey Justin,

On Thu, Dec 22, 2011 at 12:20 PM, Justin Deoliveira
<jdeolive@anonymised.com> wrote:

> Awesome stuff Gabriel! Screen shots look awesome.

thanks

> The changes to the core ui classes look ok to me, +1.

thanks

> As for the new library... can you actually provide more of a context of how
> you need it / are using it? I am kind of against just adding a new library
> dependency just for the sake of using it since it overlaps with existing
> utility libraries. Now if it meets a need that is not being met then that is
> different. Again more context will help.

I share the concern about adding new dependencies.
There's definitely functionality overlap between commons-* and guava,
and stuff that's in one that is not in the other. I'm not using it
just for the sake of using it, guaranteed. I actually found it while
looking for something that let me do what I needed and that I couldn't
find in our current dependencies, after having coded it myself and
deciding it was such a common pattern that it had to exist already
in a library (more info below).

There's a lot of documentation and discussion on the web about which one
is better, so I won't get into that. In my opinion guava fills a lot
of gaps in apache commons, plus provides a modern and concise api.
There are a couple of links of interest at the bottom of this message.
Suffice it to say that in the gwc integration I'm using it specifically
for stuff that is not available in our current dependencies _afaik_. For
instance, I'm using the functional programming style utilities,
although if the dependency were in there I'd use it for a lot more
things.

Current use is limited though, and as I mentioned before I could get
rid of it by coding some utility classes myself; it'd just be a waste.
For instance, I'm using the Function and Predicate interfaces together
with their integration with the collection utilities to provide a
runtime-transformed view of a union of the catalog's layer infos and
layergroups as a single list of tile layers. Otherwise I would have to
either create an in-memory collection of geoserver tile layers each time,
or implement myself the needed idioms for union and runtime adaptation
of catalog objects into GeoServerTileLayer. (There's
commons-collections LazyList, but it decorates based on index, whereas I
need to decorate an iterator/iterable.)
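
To make the "union plus runtime adaptation" idiom concrete, here is a rough sketch of the kind of utility that would have to be hand-written without the library. It uses only the JDK (Java 8+), not Guava itself, and CatalogItem, TileLayerView and LazyViews are hypothetical stand-ins rather than the real GeoServer or GWC classes. The point is only that no intermediate collection is built: each element is adapted when the iterator reaches it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical stand-in for a catalog layer or layer group. */
final class CatalogItem {
    final String name;

    CatalogItem(String name) {
        this.name = name;
    }
}

/** Hypothetical stand-in for a tile layer adapted from a catalog object. */
final class TileLayerView {
    static int created = 0; // counts adaptations, to show they happen on demand

    final String name;

    TileLayerView(CatalogItem source) {
        this.name = "tile:" + source.name;
        created++;
    }
}

final class LazyViews {
    private LazyViews() {}

    /**
     * Returns an Iterable that walks 'first' and then 'second', applying 'fn'
     * to each element only when the iterator actually reaches it.
     */
    static <F, T> Iterable<T> transformedUnion(
            Iterable<? extends F> first,
            Iterable<? extends F> second,
            Function<? super F, ? extends T> fn) {
        return () -> new Iterator<T>() {
            private Iterator<? extends F> current = first.iterator();
            private boolean onSecond = false;

            @Override
            public boolean hasNext() {
                // switch to the second source once the first is exhausted
                if (!current.hasNext() && !onSecond) {
                    current = second.iterator();
                    onSecond = true;
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return fn.apply(current.next());
            }
        };
    }
}
```

Calling LazyViews.transformedUnion(layers, groups, TileLayerView::new) creates no TileLayerView until the result is iterated, which is the property that matters when the catalog holds many layers.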

So yeah, it is not that much of a big deal, yet. But due to these
functional programming style utilities, plus some others that I fell
in love with, I'm planning on making it a dependency of gwc itself,
and it will surely be around for any scalability work that's to come.

So although it'd be easier for me to just get rid of it right now, and
push for it when usage needs are more extensive, I thought I should ask
the dev community anyway and have less code to maintain.

So I'll leave it up to a vote for now. Don't hesitate to vote against.
Just beware I'll come back :P

Gab

<http://code.google.com/p/guava-libraries/wiki/CollectionUtilitiesExplained>
<http://code.google.com/p/guava-libraries/wiki/FunctionalExplained>
<http://code.google.com/p/guava-libraries/wiki/CachesExplained>
<http://stackoverflow.com/questions/4542550/what-are-the-big-improvements-between-guava-and-apache-equivalent-libraries>

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Could you explain ‘some others that I fell in love with’? What is the functionality that you really like that it has that is lacking in commons? If we’re bringing it in it’d be really good to have others very aware of what can be done with it and have more people using it, not just you.

And would there be a possibility to cut out part or all of commons eventually? It seems like it’d be good to have fewer ways to do the same things in the GS codebase.

It may be extreme, but it seems like it might not be crazy to have a GSIP for the proposal to change, laying out the advantages and downsides. Indeed I think it could make more sense to do that on these kinds of technical decisions, so new developers can come in and see why we do things a certain way. What embedded java database we use also jumps to mind - new ones are fine, but we should be on a clear technical path to get to one tech option.

Though it would probably never be written it’d be nice to at least aim towards an imaginary developers guide that sets the standards for what libraries and technologies we use, so that a new person looking at one part of the codebase could easily grok another.

On Thu, Dec 22, 2011 at 7:46 PM, Chris Holmes <cholmes@anonymised.com> wrote:

> Could you explain ‘some others that I fell in love with’ What is the functionality that you really like that it has that is lacking in commons? If we’re bringing it in it’d be really good to have others very aware of what can be done with it and have more people using it, not just you.
>
> And would there be possibility to cut out part or all of commons eventually? It seems like it’d be good to have less ways to do the same things in the GS codebase.
>
> It may be extreme, but it seems like it might not be crazy to have a GSIP for the proposal to change, laying out the advantages and downsides. Indeed I think it could make more sense to do that on these kind of technical decisions, so new developers can come in and see why we do things a certain way. What embedded java database we use also jumps to mind - new ones are fine, but we should be on a clear technical path to get to one tech option.

Very much agreed. Large new dependencies should be agreed upon by the larger community, and a GSIP is a nice way
to get that done: it gives space to make an argument for the library.

As for getting rid of commons… I was thinking the same but we probably won’t be able to.
Not because I like them, but because we depend on libraries that in turn depend on them.

However it would be nice to have, in the proposal, something that shows some common usage of
commons in GeoServer/GeoTools and how that can be replaced with Guava code instead,
or some common bits of hand crafted code we already have that could be made better by
using Guava.
Who knows, people might like it so much that we could do a sprint on trunk and make an
effort to use it wherever it makes sense.

Cheers
Andrea

Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf

Please take note that GeoSolutions will be closed for Christmas holidays from 27/12 to 30/12


Ok, as it turns out I'll get rid of the dependency for the time being
and code the utilities I need, or I'll never be able to achieve my
main goal, which is getting the gwc improvements in.

more comments inline.

On Fri, Dec 23, 2011 at 5:47 AM, Andrea Aime
<andrea.aime@anonymised.com> wrote:

> On Thu, Dec 22, 2011 at 7:46 PM, Chris Holmes <cholmes@anonymised.com> wrote:
>
>> Could you explain 'some others that I fell in love with' What is the
>> functionality that you really like that it has that is lacking in commons?

One piece of functionality worth pursuing IMHO is the fully configurable
memory cache [1].
Take for example our current memory caches in ResourcePool, and
compare with this one:
<https://github.com/groldan/geoserver/blob/guava_resourcepool/src/main/src/main/java/org/geoserver/catalog/ResourcePool.java>
Here's the diff:
<https://github.com/groldan/geoserver/commit/b66df1c84ffd43846c12432255617cbdb3d85816#diff-1>

What it does is do a lot more with a bit less. The patch cuts
25 lines of code out of ResourcePool. That doesn't seem like much, but in
return it:
- bases _all_ memory caches on an interface specifically defined for
caching, instead of cooking our own and using HashMap for some
caches and a WeakHashMap subclass for others
- allows all memory caches to have: a capacity bound, expiration based
on last access time or last write time, the ability to use weak keys
and/or soft value references, and concurrency hints (the table is
internally partitioned to try to permit the indicated number of
concurrent updates without contention)
- encapsulates the cache population logic and entry removal hooks
(like for disposing of a resource being evicted) into a single object, so
related logic remains close
- eliminates the need for the "double-checked locking anti-pattern", so
that every get method on cacheable contents becomes basically:
    public Foo getFoo(Key someKey) {
       // the cache loads and stores the value itself on a miss
       return fooCache.get(someKey);
    }

instead of

    public Foo getFoo(Key someKey) {
       Foo foo = fooCache.get(someKey);
       if (foo == null) {
          synchronized (fooCache) {
             // check again: another thread may have populated it meanwhile
             foo = fooCache.get(someKey);
             if (foo == null) {
                foo = ....
                fooCache.put(someKey, foo);
             }
          }
       }
       return foo;
    }

[1] <http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/cache/CacheBuilder.html>
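
For readers who want to see why the read path collapses to a single call, here is a minimal sketch of a self-populating cache. It is deliberately not Guava's CacheBuilder: it swaps in the JDK's ConcurrentHashMap.computeIfAbsent (Java 8+) as a stand-in, and LoadingCacheSketch and its loadCount() helper are hypothetical names used only for illustration. It shows just the "loader lives with the cache" part and has none of the capacity bound, expiration or eviction hooks listed above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** A tiny self-populating cache: the loader is invoked on a miss, by the cache itself. */
final class LoadingCacheSketch<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<? super K, ? extends V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    LoadingCacheSketch(Function<? super K, ? extends V> loader) {
        this.loader = loader;
    }

    /** No null check and no synchronized block in the caller: the map handles both. */
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet(); // illustration only: count real loads
            return loader.apply(k);
        });
    }

    /** Number of times the loader actually ran. */
    int loadCount() {
        return loads.get();
    }
}
```

With this shape, a getFoo(someKey) method really is a one-liner. Note that a loader returning null leaves no entry behind, so it would run again on the next call.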

Some other functionality I fell in love with:
- functional style programming constructs through Predicate and Function.
- All collection implementations faithfully respect the
java.util collection contracts
- Better collection utilities, including Iterable and Iterator
transforms and filtering. GeoGit uses this extensively for scalability
purposes.

Being a compendium of utilities, its major advantages are maybe not
strictly on the functional side. On the non-functional side of the
fence we can cite:
- Small, concise, Java5 API with full support for generics (commons-*
targets Java 1.4 at best)
- some others, better explained at:
<http://stackoverflow.com/questions/4542550/what-are-the-big-improvements-between-guava-and-apache-equivalent-libraries>

>> If we're bringing it in it'd be really good to have others very aware of
>> what can be done with it and have more people using it, not just you.
>>
>> And would there be possibility to cut out part or all of commons
>> eventually? It seems like it'd be good to have less ways to do the same
>> things in the GS codebase.

Part of it could be. All of it not. Because they're actually different
libraries. If we're talking of commons-collections or commons-io
perhaps most of what we use can be replaced by guava. There's a lot
more about commons-* though, and there's stuff guava provides that
commons-* doesn't.

>> It may be extreme, but it seems like it might not be crazy to have a GSIP
>> for the proposal to change, laying out the advantages and downsides. Indeed
>> I think it could make more sense to do that on these kind of technical
>> decisions, so new developers can come in and see why we do things a certain
>> way.

I understand the reasoning, though I'm skeptical of its applicability.
On one hand, I can't remember any precedent for such a big requirement
to justify a new dependency. On the other hand, any non-trivial
endeavor that affects a big part of the code base is unlikely to be
properly finished without proper resource investment (take the move
from the geotools legacy filter api to the geoapi filter as an example).

>> What embedded java database we use also jumps to mind - new ones are
>> fine, but we should be on a clear technical path to get to one tech option.

One tech option arguably could serve all needs. If we're talking about
deciding on one SQL embedded database that's ok. But again an SQL
database arguably could serve all needs. In the specific case of BDB
JE for gwc, research was done before choosing it, and the current
metastore based on H2 didn't scale well. Its selection seems to have
been based on a microbenchmark, still among the test cases, that
checks it can insert 2000 records "quick". When it comes to large
caches it just bogs down, partially because of the SQL overhead
(record lookup by 5 or 6 fields), and partially because of weak
implementation logic (record locking based on updating a field value,
and other issues). All in all, I decided the SQL layer was unnecessary
overhead when all that's needed is a key-based lookup and no SQL power
features. It'd be good though to note BDB JE's benefits
somewhere for when all you need is a well performing key/value
database.

> Very much agreed. Large new dependencies should be agreed upon by the larger
> community, a GSIP is a nice way to get that done, gives space to make an
> argument for the library.
>
> As for getting rid of commons... I was thinking the same but we probably
> won't be able to.
> Not because I like them, but because we depend on libraries that do in
> turn depend on them.

Yet another reason, and one that could be solved by making those transitive
dependencies hard ones. Yet I restate that Guava is not meant as a
replacement for commons-*, although the most common intersection point
is commons-collections, where it _could_ be a more modern and
efficient alternative (as per what google claims), but research should
be done to actually verify that.

> However it would be nice to have, in the proposal, something that shows some
> common usage of commons in GeoServer/GeoTools and how that can be replaced
> with Guava code instead,

Again, commons is a big word. We have strong dependencies on
commons-beanutils/digester/fileupload/io/httpclient/lang/validator/logging/pool/collections.

Guava fills some gaps wrt io, concurrency, cache, collections, lang,
plus stuff not in the above list and Java5. Does not replace
"commons".

> or some common bits of hand crafted code we already have that could be made
> better by using Guava.

Take the caching example above as an example for this.

Cheers,
Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Fri, Dec 23, 2011 at 1:23 PM, Gabriel Roldan <groldan@anonymised.com> wrote:

> Ok, as it turns out I’ll get rid of the dependency for the time being
> and code the utilities I need, or I’ll never be able of achieving my
> main goal which is getting the gwc improvements in.

Sorry if this came across as discouragement - it wasn’t meant to. Mostly I’d just like more discussion of this kind so everyone knows about new tech and new possibilities. And sorry if the GSIP request seems to add overhead, the goal behind that was just to get that discussion in a more stable place that others could look at in the future without having to read the full list history. The GSIP could be pretty simple, mostly just saying the same stuff you did here.

Some responses inline, but just this level of dialog I think is really good, and please don’t get discouraged - it’s not a resistance to new tech and ideas, just a desire to understand.

> more comments inline.
>
> On Fri, Dec 23, 2011 at 5:47 AM, Andrea Aime
> <andrea.aime@anonymised.com> wrote:
>
>> On Thu, Dec 22, 2011 at 7:46 PM, Chris Holmes <cholmes@anonymised.com> wrote:
>>
>>> Could you explain ‘some others that I fell in love with’ What is the
>>> functionality that you really like that it has that is lacking in commons?
>
> One functionality worth pursuing IMHO is the fully configurable memory
> cache [1].
> Take for example our current memory caches in ResourcePool, and
> [...]
> Being a compendium of utilities, its major advantages are maybe not
> strictly on the functional side. On the non functional side of the
> fence we can cite:
> [...]

Thanks, that’s all super helpful.

>>> If we’re bringing it in it’d be really good to have others very aware of
>>> what can be done with it and have more people using it, not just you.
>>>
>>> And would there be possibility to cut out part or all of commons
>>> eventually? It seems like it’d be good to have less ways to do the same
>>> things in the GS codebase.
>
> Part of it could be. All of it not. Because they’re actually different
> libraries. If we’re talking of commons-collections or commons-io
> perhaps most of what we use can be replaced by guava. There’s a lot
> more about commons-* though, and there’s stuff guava provides that
> commons-* doesn’t.

Cool. But it sounds like commons is split up into different pieces, so we could work towards removing some of them eventually. I don’t think we need to immediately remove stuff to get guava in, but we should have a recommendation to GS devs for what to use when.

>>> It may be extreme, but it seems like it might not be crazy to have a GSIP
>>> for the proposal to change, laying out the advantages and downsides. Indeed
>>> I think it could make more sense to do that on these kind of technical
>>> decisions, so new developers can come in and see why we do things a certain
>>> way.
>
> I understand the reasoning, though I’m skeptical of its applicability.
> By one side, I can’t remember any precedent on such a big requirement
> to justify a new dependency. By the other side, any non trivial
> endeavor that affects big part of the code base is unlikely to be
> properly finished without proper resource investment (take the move
> from geotools legacy filter api to geoapi filter as an example).

Yeah, I’ll admit there’s no precedent. But I think it’d be nice - again, not a challenge, and I’m fairly convinced at this point that it’d be good to add. Just that in general we should start justifying new libraries we bring in to core.

I also don’t see this as requiring a big resource investment - like it’d just be a decision to move in that direction. No immediate changes needed, but future coding is recommended in the newer way.

Like to me it’s the difference between bringing guava in and having just you use it because no one knows what it does vs. socializing the idea so that others start to use it in the future. They’re not forced to convert their code over, but can refer to the GSIP when they’re doing some new coding where it might have some useful functionality.

>>> What embedded java database we use also jumps to mind - new ones are
>>> fine, but we should be on a clear technical path to get to one tech option.
>
> One tech option arguably could serve all needs. If we’re talking about
> deciding on one SQL embedded database that’s ok. But again an SQL
> database arguably could serve all needs. In the specific case of BDB
> JE for gwc, research has been made before choosing it and the current
> metastore based on H2 didn’t scale well. Its selection seems to have
> been based on a microbenchmark that’s still among the test cases that
> checks it can insert 2000 records “quick”. When it comes to large
> caches it just bogs down, partially because of the SQL overhead
> (record look up by 5 or 6 fields), and partially because of weak
> implementation logic (record locking based on updating a field value
> and other issues). All in all, I decided the sql layer was unnecessary
> overhead when all that’s needed is a key based lookup, and no SQL power
> features are needed. It’d be good though to note BDB JE benefits
> somewhere for when all you need is a well performing key/value
> database.

I’d love to break this conversation out to a GSIP. My point is really that we should examine if we actually need an embedded SQL database in GeoServer, or if we could accomplish everything that we want with BDB JE. Like I’d prefer you laying out all its advantages, and having the PSC discuss if we should make it the recommended way for GeoServer devs who need an embedded database. I don’t know the code too well at this point, but I can’t think of anything we do where SQL seems needed - there were some H2/hibernate experiments, but a kvp store seems better there than having to go through SQL. The old HSQL epsg thing seems like it could probably be nosql. And I’ve been pretty down on H2 lately, as I’ve heard that it’s the cause of some bad data integrity issues. And I was the one who used to recommend it a lot.

Very much agreed. Large new dependencies should be agreed upon by the larger
community, a GSIP is a nice way
to get that done, gives space to make an argument for the library.

As for getting rid of commons… I was thinking the same but we probably
won’t be able to.
Not because I like them, but because we depend on libraries that do in
turn depend on them.

yet another reason, that could be solved by making those transitive
dependencies hard ones. Yet I restate guava is not meant as a
replacement for commons-*, although the most common intersection point
is commons-collections, where it could be a more modern and
efficient alternative (as per what google claims), but research should
be done to actually assert that.

Cool, and yeah, that’s just exactly what I’d look for in a gsip, that discussion and research.

Thanks for all the good info Gabriel, it’s just super helpful to understand what a new, useful dependency might bring in. If you’d like I can help you make a GSIP for guava, as I do think it’s GSIP worthy, to get buy in from devs to actually using it, instead of just a random utility that few use.

C

However it would be nice to have, in the proposal, something that shows some
common usage of
commons in GeoServer/GeoTools and how that can be replaced with Guava code
instead,

Again, commons is a big word. We have strong dependencies on
commons-beanutils/digester/fileupload/io/httpclient/lang/validator/logging/pool/collections.

Guava fills some gaps wrt io, concurrency, cache, collections, lang,
plus stuff not in the above list and Java5. Does not replace
“commons”.

or some common bits of hand crafted code we already have that could be made
better by
using Guava.

Take the caching example above as an example for this.

Cheers,
Gabriel

Who knows, people might like it so much that we could make a sprint on
trunk and make an
effort to use it wherever it makes sense.

Cheers
Andrea

Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf

Please take note that GeoSolutions will be closed for Christmas holidays
from 27/12 to 30/12



Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Fri, Dec 23, 2011 at 7:23 PM, Gabriel Roldan <groldan@anonymised.com…> wrote:

Ok, as it turns out I’ll get rid of the dependency for the time being
and code the utilities I need, or I’ll never be able of achieving my
main goal which is getting the gwc improvements in.

No need to jump the gun on this one, I for one was actually inclined to say that
including guava was ok, provided there was a bit of introduction and explanation
for it for everybody to understand its potential.

I already turned that library down two times in the past year and it keeps
on coming back, wanted to use one of its concurrent multimaps once and
the MapMaker ability to create a concurrent soft hash map another time.
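
For context on what hand-coding such a utility involves, below is a rough sketch of a concurrent multimap built only from JDK classes. It is not Guava's API; the class and method names are hypothetical, and it relies on ConcurrentHashMap methods (compute, newKeySet) from newer JDKs than the ones in use when this thread was written.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a thread-safe key -> set-of-values map.
final class ConcurrentMultimapSketch<K, V> {
    private final ConcurrentMap<K, Set<V>> map = new ConcurrentHashMap<>();

    // compute() runs atomically per key, so put and remove cannot interleave badly.
    void put(K key, V value) {
        map.compute(key, (k, set) -> {
            Set<V> values = (set == null) ? ConcurrentHashMap.<V>newKeySet() : set;
            values.add(value);
            return values;
        });
    }

    // Removes one value; drops the key entirely once its set is empty.
    boolean remove(K key, V value) {
        boolean[] removed = {false};
        map.computeIfPresent(key, (k, set) -> {
            removed[0] = set.remove(value);
            return set.isEmpty() ? null : set;
        });
        return removed[0];
    }

    // Returns a snapshot copy so callers never see concurrent modification.
    Set<V> get(K key) {
        Set<V> values = map.get(key);
        return values == null ? Collections.<V>emptySet() : new HashSet<V>(values);
    }

    public static void main(String[] args) {
        ConcurrentMultimapSketch<String, String> m = new ConcurrentMultimapSketch<>();
        m.put("layer", "EPSG:4326");
        m.put("layer", "EPSG:900913");
        System.out.println(m.get("layer").size()); // 2
        m.remove("layer", "EPSG:4326");
        System.out.println(m.get("layer")); // [EPSG:900913]
    }
}
```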

more comments inline.

On Fri, Dec 23, 2011 at 5:47 AM, Andrea Aime
<andrea.aime@anonymised.com> wrote:

On Thu, Dec 22, 2011 at 7:46 PM, Chris Holmes <cholmes@anonymised.com> wrote:

Could you explain ‘some others that I fell in love with’ What is the
functionality that you really like that it has that is lacking in commons?

One functionality worth pursuing IMHO is the fully configurable memory
cache [1].
Take for example our current memory caches in ResourcePool, and
compare with this one:
<https://github.com/groldan/geoserver/blob/guava_resourcepool/src/main/src/main/java/org/geoserver/catalog/ResourcePool.java>
Here’s the diff:
<https://github.com/groldan/geoserver/commit/b66df1c84ffd43846c12432255617cbdb3d85816#diff-1>

This is actually one of the cases that made me look at Guava already in the past.
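
For readers who have not seen the kind of hand-crafted memory cache under discussion, here is a minimal sketch of a size-bounded LRU cache using only the JDK. It is not the actual ResourcePool code and not Guava's API; the class name is hypothetical. A library cache would typically add options such as expiration or soft references, which this sketch lacks.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a hand-rolled, size-bounded LRU cache.
final class BoundedLruCache<K, V> {
    private final Map<K, V> map;

    BoundedLruCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used;
        // removeEldestEntry evicts the least recently used entry once over capacity.
        this.map = Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        });
    }

    V get(K key) {
        return map.get(key);
    }

    void put(K key, V value) {
        map.put(key, value);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        BoundedLruCache<String, String> cache = new BoundedLruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds capacity, evicts "b"
        System.out.println(cache.get("b")); // null
        System.out.println(cache.size());   // 2
    }
}
```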

It may be extreme, but it seems like it might not be crazy to have a GSIP
for the proposal to change, laying out the advantages and downsides. Indeed
I think it could make more sense to do that on these kind of technical
decisions, so new developers can come in and see why we do things a certain
way.

I understand the reasoning, though I’m skeptical of its applicability.
By one side, I can’t remember any precedent on such a big requirement
to justify a new dependency. By the other side, any non trivial
endeavor that affects big part of the code base is unlikely to be
properly finished without proper resource investment (take the move
from geotools legacy filter api to geoapi filter as an example).

To be blunt, you laid your bed for this one by sneaking Berkeley Embedded Java
Edition on our back when you implemented the GWC disk quota mechanism.
On trunk it’s our largest dependency, accounting for 5% of the download size
alone, and it was introduced just like that, without any prior discussion
(the fact that you did research does not mean much since you did not allow
anyone else to express an opinion on it before going down and use it).

Since then the only point of discussion I had with my customers about disk
quota (something that I was really looking forward to have) is how to turn
it off, since it breaks clustering.
A bit of preventive discussion might have prevented both the bad feelings
and having a functionality that is nothing but an annoyance beyond the
single server install.

We all try to work together and all significant changes around here go through
some bits of discussion on the mailing list, if not a GSIP, before the
code is implemented.
An approach as “here is what I have, take or leave” does not help
building agreement on a solution.

Anyways, closing up, I’m ok with having guava around and will also
put on the table a couple of days out of my weekends to have guava better used in our
code base provided you take the time to talk with the community about it
with examples from our code base, or from geogit, of how it would improve things
(either gsip or decent introduction on the mailing list, that’s the same for me).

Cheers
Andrea



On Fri, Dec 23, 2011 at 5:10 PM, Chris Holmes <cholmes@anonymised.com> wrote:

On Fri, Dec 23, 2011 at 1:23 PM, Gabriel Roldan <groldan@anonymised.com> wrote:

Ok, as it turns out I'll get rid of the dependency for the time being
and code the utilities I need, or I'll never be able of achieving my
main goal which is getting the gwc improvements in.

Sorry if this came across as discouragement - it wasn't meant to. Mostly
I'd just like more discussion of this kind so everyone knows about new tech
and new possibilities. And sorry if the GSIP request seems to add overhead,
the goal behind that was just to get that discussion in a more stable place
that others could look at in the future without having to read the full list
history. The GSIP could be pretty simple, mostly just saying the same stuff
you did here.

Hey Chris, no need to apologize.
I meant it when I said I was ok in stepping back from including guava
right away, as I did expect some discussion to arise, then I could
replace the custom code when it gets wider approval.

I did feel like a GSIP for replacing in one shot all uses of commons
where guava could do better would be overkill. After a second read and
relaxing holiday I get it that's not exactly a requirement, so now
feel better about an introductory like GSIP.

Sorry my message came out as rough.

Gabriel

Some responses inline, but just this level of dialog I think is really good,
and please don't get discouraged - it's not a resistance to new tech and
ideas, just a desire to understand.

more comments inline.

On Fri, Dec 23, 2011 at 5:47 AM, Andrea Aime
<andrea.aime@anonymised.com> wrote:
> On Thu, Dec 22, 2011 at 7:46 PM, Chris Holmes <cholmes@anonymised.com>
> wrote:
>>
>> Could you explain 'some others that I fell in love with' What is the
>> functionality that you really like that it has that is lacking in
>> commons?

One functionality worth pursuing IMHO is the fully configurable memory
cache [1].
Take for example our current memory caches in ResourcePool, and

...

Being a compendium of utilities, its major advantages are maybe not
strictly on the functional side. On the non functional side of the
fence we can cite:
- Small, concise, Java5 API with full support for generics (commons-*
targets Java 1.4 at the best)
- some others better explained:

<http://stackoverflow.com/questions/4542550/what-are-the-big-improvements-between-guava-and-apache-equivalent-libraries>

Thanks, that's all super helpful.

>> If we're bringing it in it'd be really good to have others very aware
>> of
>> what can be done with it and have more people using it, not just you.
>>
>> And would there be possibility to cut out part or all of commons
>> eventually? It seems like it'd be good to have less ways to do the
>> same
>> things in the GS codebase.
Part of it could be. All of it not. Because they're actually different
libraries. If we're talking of commons-collections or commons-io
perhaps most of what we use can be replaced by guava. There's a lot
more about commons-* though, and there's stuff guava provides that
commons-* doesn't.

Cool. But it sounds like commons is split up in to different pieces, so we
could work towards removing some of them eventually. I don't think we need
to immediately remove stuff to get guava in, but should have a recommendation
to GS devs for what to use when.

>>
>> It may be extreme, but it seems like it might not be crazy to have a
>> GSIP
>> for the proposal to change, laying out the advantages and downsides.
>> Indeed
>> I think it could make more sense to do that on these kind of technical
>> decisions, so new developers can come in and see why we do things a
>> certain
>> way.
I understand the reasoning, though I'm skeptical of its applicability.
By one side, I can't remember any precedent on such a big requirement
to justify a new dependency. By the other side, any non trivial
endeavor that affects big part of the code base is unlikely to be
properly finished without proper resource investment (take the move
from geotools legacy filter api to geoapi filter as an example).

Yeah, I'll admit there's no precedent. But I think it'd be nice - again,
not a challenge, and I'm fairly convinced at this point that it'd be good to
add. Just that in general we should start justifying new libraries we bring
in to core.

I also don't see this as requiring a big resource investment - like it'd
just be a decision to move in that direction. No immediate changes needed,
but future coding is recommended in the newer way.

Like to me it's the difference between bringing guava in and having just you
use it because no one knows what it does vs. socializing the idea so that
others start to use it in the future. They're not forced to convert their
code over, but can refer to the GSIP when they're doing some new coding
where it might have some useful functionality.

>>What embedded java database we use also jumps to mind - new ones are
>> fine, but we should be on a clear technical path to get to one tech
>> option.

One tech option arguably could serve all needs. If we're talking about
deciding on one embedded SQL database, that's ok. But again, an SQL
database arguably could serve all needs. In the specific case of BDB
JE for gwc, research was done before choosing it, and the current
metastore based on H2 didn't scale well. H2's selection seems to have
been based on a microbenchmark, still among the test cases, that
checks it can insert 2000 records "quick". When it comes to large
caches it just bogs down, partially because of the SQL overhead
(record lookup by 5 or 6 fields), and partially because of weak
implementation logic (record locking based on updating a field value
and other issues). All in all, I decided the SQL layer was unnecessary
overhead when all that's needed is a key-based lookup, and no SQL power
features are needed. It'd be good though to note BDB JE benefits
somewhere for when all you need is a well performing key/value
database.

I'd love to break this conversation out to a GSIP. My point is really that
we should examine if we actually _need_ an embedded SQL database in
GeoServer, or if we could accomplish everything that we want with BDB JE.
Like I'd prefer you laying out all its advantages, and having the PSC
discuss if we should make it the recommended way for GeoServer devs who need
an embedded data base. I don't know the code too well at this point, but I
can't think of anything we do where SQL seems needed - there were some
H2/hibernate experiments, but a kvp store seems better there than having to
go through SQL. The old HSQL epsg thing seems like it could probably be
nosql. And I've been pretty down on H2 lately, as I've heard that it's the
cause of some bad data integrity issues. And I was the one who used to
recommend it a lot.

>
>
> Very much agreed. Large new dependencies should be agreed upon by the
> larger
> community, a GSIP is a nice way
> to get that done, gives space to make an argument for the library.
>
> As for getting rid of commons... I was thinking the same but we probably
> won't be able to.
> Not because I like them, but because we depend on libraries that do in
> turn depend on them.
yet another reason, that could be solved by making those transitive
dependencies hard ones. Yet I restate guava is not meant as a
replacement for commons-*, although the most common intersection point
is commons-collections, where it _could_ be a more modern and
efficient alternative (as per what google claims), but research should
be done to actually assert that.

Cool, and yeah, that's just exactly what I'd look for in a gsip, that
discussion and research.

Thanks for all the good info Gabriel, it's just super helpful to understand
what a new, useful dependency might bring in. If you'd like I can help you
make a GSIP for guava, as I do think it's GSIP worthy, to get buy in from
devs to actually using it, instead of just a random utility that few use.

C

>
> However it would be nice to have, in the proposal, something that shows
> some
> common usage of
> commons in GeoServer/GeoTools and how that can be replaced with Guava
> code
> instead,
Again, commons is a big word. We have strong dependencies on

commons-beanutils/digester/fileupload/io/httpclient/lang/validator/logging/pool/collections.

Guava fills some gaps wrt io, concurrency, cache, collections, lang,
plus stuff not in the above list and Java5. Does not replace
"commons".

> or some common bits of hand crafted code we already have that could be
> made
> better by
> using Guava.

Take the caching example above as an example for this.

Cheers,
Gabriel

> Who knows, people might like it so much that we could make a sprint on
> trunk and make an
> effort to use it wherever it makes sense.
>
> Cheers
> Andrea

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.


On Sat, Dec 24, 2011 at 4:45 AM, Andrea Aime
<andrea.aime@anonymised.com> wrote:

On Fri, Dec 23, 2011 at 7:23 PM, Gabriel Roldan <groldan@anonymised.com> wrote:

...

Andrea: in a word, you're right.
Although "sneaking" sounds like intentional bad manners, I assure you
it was not like that.
At the time I was under the wrong impression that a transitive
dependency of a project we depend on was not that big a deal. Nor was I
aware it would break an important use case, which was unfortunate on
my side.
I know we all try to work together on all significant changes around
here. WRT GSIPs _before_ the code is implemented, that is not always
possible, especially when some R&D is in place, but that's secondary.
That said, sorry about any inconvenience caused. I assure you that since
then I've been trying to engage in more community discussion before any
significant change, gwc related or not. Hence, thanks to this debate,
I'm working on a GSIP to introduce guava.
I'm also taking the opportunity to re-invite you or any other
developer interested in or needing to support gwc to join the gwc
developers community, or more precisely to help build one. It feels
lonely in there.

Cheers,
Gabriel

To be blunt, you laid your bed for this one by sneaking Berkeley Embedded
Java
Edition on our back when you implemented the GWC disk quota mechanism.
On trunk it's our largest dependency, accounting for 5% of the download size
alone, and it was introduced just like that, without any prior discussion
(the fact that you did research does not mean much since you did not allow
anyone else to express an opinion on it before going down and use it).

Since then the only point of discussion I had with my customers about disk
quota (something that I was really looking forward to have) is how to turn
it off, since it breaks clustering.
A bit of preventive discussion might have prevented both the bad feelings
and having a functionality that is nothing but an annoyance beyond the
single server install.

We all try to work together and all significant changes around here go
through
some bits of discussion on the mailing list, if not a GSIP, _before_ the
code is implemented.
An approach as "here is what I have, take or leave" does not help
building agreement on a solution.

Anyways, closing up, I'm ok with having guava around and will also
put on the table a couple of days out of my weekends to have guava better
used in our
code base provided you take the time to talk with the community about it
with examples from our code base, or from geogit, of how it would improve
things
(either gsip or decent introduction on the mailing list, that's the same for
me).

Cheers
Andrea


--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Mon, Dec 26, 2011 at 3:38 AM, Gabriel Roldan <groldan@anonymised.com…> wrote:

Hey Chris, no need to apologize.
I meant it when I said I was ok in stepping back from including guava
right away, as I did expect some discussion to arise, then I could
replace the custom code when it gets wider approval.

I did feel like a GSIP for replacing in one shot all uses of commons
where guava could do better would be overkill. After a second read and
relaxing holiday I get it that’s not exactly a requirement, so now
feel better about an introductory like GSIP.

I guess we’re mixing in too many things.
The way I see it: a GSIP explaining the benefits of Guava and showing
examples of how it can make things better is how you’d build the
“wider approval”.
Even just a long introductory mail, less formal than a GSIP.

I would not mix in the “replace commons with Guava” part, we already
established it’s not really feasible, let’s just have an introduction on
how and why Guava might be good for our code base?

In the meantime I’ll try to shave off some other megabytes off
the trunk dependencies, spring-security brings in aspectj jars which
are big and afaik we don’t use, and h2 just has to go, it should not
be too hard to have superoverlays work against hsql instead of h2.
I’ll also have a look again at Xerces and try harder to remove it
(there is one version in the JDK already)

-rw-r--r-- 1 aaime aaime 1,6M 2010-01-25 15:58 aspectjweaver-1.6.8.jar
-rw-r--r-- 1 aaime aaime 1,2M 2010-01-25 10:01 h2-1.1.119.jar
-rw-r--r-- 1 aaime aaime 1,2M 2010-01-25 08:59 xercesImpl-2.7.1.jar

If we can get rid of those three we make more than enough room for
a new 1.5MB dependency that might ease writing new code (however
showing this would be the job of the GSIP/mail).

Cheers
Andrea
