On Wed, Jun 1, 2011 at 4:34 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:
Starting to move stuff out of core is problematic... not only is it a lot of
work to create all the modules, but it does not seem very nice to simply
start removing functionality without some sort of big warning.
Moving to java 6 and trying to prune out dependencies seems viable. But a
bit of work for sure... lots of testing, etc... And we have not yet decided
as a community when to require java 6. Is 2.2 the branch to do it on?
Yep, I agree, that is something we can do on trunk, and it's a discussion
we need to raise again (java 6 or not).
I wonder if a shorter term solution might be just not to ship the data
directory in the war, like we do now. I just did a test with the 2.1.0 war
and removing it brings us (just barely) under the 50M limit. We could even
create two artifacts, one with data and one without.
Hum... I guess it might be interesting, as people deploying a .war are probably
past the "see what this is and how I use it" stage and are instead trying to
set up a production environment.
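To measure what dropping the data directory buys us, something like the following could be used. It is only a rough sketch: it assumes the sample data sits under a top-level data/ prefix in the war (as entries such as data/coverages/img_sample/usa.png suggest), and the function name strip_prefix is made up for illustration. A real "war without data" artifact would more likely be produced by the build itself rather than by post-processing.

```python
import zipfile

def strip_prefix(src_war, dst_war, prefix="data/"):
    """Copy src_war to dst_war, skipping entries under the given prefix.

    Returns (entries kept, compressed bytes dropped). Both arguments may be
    file names or file-like objects. The "data/" default is an assumption
    about the war layout, not something the build guarantees.
    """
    kept = 0
    dropped = 0
    with zipfile.ZipFile(src_war) as src, \
         zipfile.ZipFile(dst_war, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.startswith(prefix):
                dropped += info.compress_size
                continue
            # Reusing the original ZipInfo keeps name, timestamp and
            # compression method of each surviving entry.
            dst.writestr(info, src.read(info))
            kept += 1
    return kept, dropped
```

For example, strip_prefix("geoserver.war", "geoserver-nodata.war") would write the reduced war (the output file name is hypothetical) and report how many compressed bytes were left out.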
Anyway, let's have a look at the 50 files that occupy the most space in the zip
file, something might pop up:
less geoserver.war | awk ' BEGIN { OFS = "\t" } {print $3, $8}' | sort -n -r | head -50
2565346 WEB-INF/lib/xalan-2.7.0.jar
2007443 WEB-INF/lib/je-4.1.7.jar
1916804 WEB-INF/lib/fop-0.94.jar
1852563 WEB-INF/lib/gt-epsg-hsql-2.7.1.jar
1802107 WEB-INF/lib/jai_core-1.1.3.jar
1694717 WEB-INF/lib/wicket-1.4.12.jar
1332241 WEB-INF/lib/bcprov-jdk14-138.jar
1191980 WEB-INF/lib/gt-main-2.7.1.jar
1159519 WEB-INF/lib/h2-1.1.119.jar
1137188 data/coverages/img_sample/usa.png
1104384 WEB-INF/lib/wicket-extensions-1.4.12.jar
1052331 WEB-INF/lib/itext-2.1.5.jar
1051432 WEB-INF/lib/jai_imageio-1.1.jar
997908 WEB-INF/lib/web-core-2.1.0.jar
984040 WEB-INF/lib/gt-referencing-2.7.1.jar
892627 WEB-INF/lib/xercesImpl-2.6.2.jar
795379 WEB-INF/lib/freemarker-2.3.13.jar
688149 WEB-INF/lib/ecore-2.2.2.jar
681764 WEB-INF/lib/spring-security-core-2.0.6.RELEASE.jar
662824 WEB-INF/lib/xsd-2.2.2.jar
643953 WEB-INF/lib/main-2.1.0.jar
633440 WEB-INF/lib/jts-1.11.jar
610755 WEB-INF/lib/hsqldb-1.8.0.7.jar
593440 WEB-INF/lib/ant-optional-1.5.1.jar
531065 WEB-INF/lib/wms-2.1.0.jar
527860 WEB-INF/lib/batik-svg-dom-1.7.jar
513076 WEB-INF/lib/gt-xsd-gml3-2.7.1.jar
508916 WEB-INF/lib/batik-bridge-1.7.jar
501449 WEB-INF/lib/gt-xml-2.7.1.jar
483675 WEB-INF/lib/commons-collections-3.1.jar
462443 WEB-INF/lib/net.opengis.wcs-2.7.1.jar
446114 WEB-INF/lib/postgresql-8.4-701.jdbc3.jar
445020 WEB-INF/lib/gt-metadata-2.7.1.jar
431650 WEB-INF/lib/spring-beans-2.5.5.jar
408916 WEB-INF/lib/spring-context-2.5.5.jar
399796 WEB-INF/lib/gt-coverage-2.7.1.jar
394498 WEB-INF/lib/wfs-2.1.0.jar
390433 data/coverages/arc_sample/precip30min.asc
376922 WEB-INF/lib/xstream-1.3.1.jar
368413 WEB-INF/lib/batik-awt-util-1.7.jar
362416 WEB-INF/lib/mail-1.4.jar
358399 WEB-INF/lib/gt-render-2.7.1.jar
357871 WEB-INF/lib/spring-webmvc-2.5.5.jar
333783 WEB-INF/lib/log4j-1.2.14.jar
315934 WEB-INF/lib/xmlgraphics-commons-1.2.jar
314020 data/data/sf/sfdem.tif
299857 WEB-INF/lib/web-demo-2.1.0.jar
294626 WEB-INF/lib/cglib-nodep-2.1_3.jar
294173 WEB-INF/lib/gt-wfs-2.7.1.jar
285198 WEB-INF/lib/spring-jdbc-2.5.5.jar
I see a few interesting ones there:
2007443 WEB-INF/lib/je-4.1.7.jar
1916804 WEB-INF/lib/fop-0.94.jar
1137188 data/coverages/img_sample/usa.png
1104384 WEB-INF/lib/wicket-extensions-1.4.12.jar
1052331 WEB-INF/lib/itext-2.1.5.jar
527860 WEB-INF/lib/batik-svg-dom-1.7.jar
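The one-liner above depends on less being set up to list zip contents, and I am assuming the third column it prints is the compressed size, as in an unzip -v style listing. A portable way to get the same ranking is sketched below; the function name largest_entries is just for illustration.

```python
import zipfile

def largest_entries(war, limit=50):
    """Return (compressed_size, name) pairs for the biggest entries.

    Sorted by compressed size, largest first, which is what matters
    for the size of the war itself. `war` may be a path or a
    file-like object.
    """
    with zipfile.ZipFile(war) as zf:
        sizes = [(info.compress_size, info.filename) for info in zf.infolist()]
    return sorted(sizes, reverse=True)[:limit]
```

Calling largest_entries("geoserver.war") and printing each pair separated by a tab should give a listing in the same shape as the one above.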
xalan is the usual large offender. I honestly don't remember
why it's in the mix; it's a direct dependency of the main
module.
I checked, and it seems the geotools xml and xsd modules do not
use it.
I actually removed it from main successfully, as nothing
there uses it, but it comes back as a dependency
of batik in the WMS (I don't know if it's needed to parse
or to generate SVG).
je is the embedded database GWC is using to store the
quota database. Wow, quite the large jar for a seemingly
small task.
h2/hsql is another obvious duplication... I'm wondering, since
there are not many parts that actually use h2 in GS.
I guess it's just the KML superoverlay code?
I wonder how hard it would be to migrate it to hsql.
Updating everything to h2 is another option; epsg-h2
could be used as a replacement for epsg-hsql,
but GWC uses hsql for the metastore, right?
fop is a dependency in the apache-batik set, but I'm
wondering if it's really needed (fop is used to
generate documents based on xml and xslt templates
as far as I can remember).
I have to check if we can read/generate SVG without having it
in the mix.
Generally speaking supporting SVG is costing us
quite a large set of dependencies... SVG as an output
format is probably not popular, but being able to read SVG to
use as map symbolizers is important.
I would have to check if we could save significantly by
pushing the SVG map output format to its own extension
(opinions about this one?).
bcprov-jdk14-138.jar is a dependency of itext... one that I don't
believe we need. Bouncycastle jars are needed to encrypt
data; I guess itext offers encrypted PDF functionality, but
we don't use it, so we should be able to get rid of it.
I have to try.
usa.png is large; since it's just an example of the image+world
format, could we turn it into a jpeg instead?
It would become 118KB, about 1MB saved just there.
web-core is quite large too, probably due to the large number
of js components... speaking of which, I see editarea is still
in there, but not used anywhere I think.
I can get rid of it and save some (does anyone know if it's used
in any part of the UI? It should have been replaced by
codemirror everywhere).
Wicket extensions is quite large too... but we use it in a number
of places, 30+ references in the code and using a few different
components.
PDF generation is also something that we might push into an
extension, saving another MB.
ant-optional is also in the mix, however if I do a mvn dependency:tree
in web/app I don't see the dependency in the list... wondering how
it gets in the war?
Having the mail jar is a bit ridiculous, I know, it's due to WCS 1.1
demanding the usage of mime/multipart encoding for the
GetCoverage output (yeah, don't get me started on that one...)
With some luck we should be able to shave off a couple megabytes
just with the obvious things (removing bouncycastle stuff, ant-optional,
turning the png image into a jpeg).
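As a quick sanity check on "a couple megabytes", the numbers from the listing can simply be added up. This is a back-of-the-envelope sketch: it assumes both jars can be dropped entirely, takes the 118KB jpeg estimate as 118 * 1024 bytes, and ignores whatever the war compression would do to the new jpeg.

```python
# Compressed sizes in bytes, copied from the listing in this mail.
BCPROV = 1332241        # WEB-INF/lib/bcprov-jdk14-138.jar
ANT_OPTIONAL = 593440   # WEB-INF/lib/ant-optional-1.5.1.jar
USA_PNG = 1137188       # data/coverages/img_sample/usa.png
USA_JPEG = 118 * 1024   # the "118KB" estimate, assuming KB means 1024 bytes

def estimated_savings():
    """Bytes saved by dropping both jars and swapping the png for a jpeg."""
    return BCPROV + ANT_OPTIONAL + (USA_PNG - USA_JPEG)

print(f"{estimated_savings()} bytes, {estimated_savings() / 2**20:.2f} MiB")
# prints "2942037 bytes, 2.81 MiB"
```

So roughly 2.8MB for the obvious candidates, before touching fop, je or the SVG and PDF output formats.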
Does anyone else see something of interest?
Feedback welcome.
Cheers
Andrea
--
-------------------------------------------------------
Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 962313
http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf
-------------------------------------------------------