Hi,
in a separate thread Jonathan provided a load test, data and styles for a map he’s
publishing.
The thread started as a comparison between PNG encoders, but, at least as far as I can
tell, there is no significant optimization that can be made at the PNG encoder level.
However, I’ve been toying with the data set and found some interesting results regardless
(grab some popcorn).
The maps are set up as a group of 15 different layers, with some scale dependency.
At scale denominators of 10 and 5 million not all the layers are shown, but one thing is evident: there
are a few layers (woodlands, lakes, and the portion of urban areas shown at this level) that
contain a good number of small but detailed polygons (on the order of 20k+)… this makes
it miss the reference “no more than 1000 elements in the map” mark by a factor of 20, which
is probably what makes it interesting :-p
We have code that simplifies them before drawing them, but they are small enough that
another optimization, geared towards drawing features smaller than a pixel, should have kicked
in, but it did not and that affected drawing times. So I’ve fixed it (fixes are already in, will
be part of the next stable release).
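To make the idea concrete, here is a minimal sketch of a sub-pixel shortcut. It is not the actual GeoServer/GeoTools code: the class and method names are hypothetical, the feature envelope is passed as plain doubles in map units, and `pixelSize` is assumed to be the number of map units covered by one output pixel.

```java
// Hypothetical sketch: decide whether a feature is too small to deserve a full draw.
class SubPixelCheck {

    /**
     * Returns true when the feature's bounding box, expressed in map units,
     * is both narrower and shorter than a single output pixel.
     */
    static boolean fitsInOnePixel(double minX, double minY, double maxX, double maxY,
                                  double pixelSize) {
        double width = maxX - minX;
        double height = maxY - minY;
        return width < pixelSize && height < pixelSize;
    }

    public static void main(String[] args) {
        // Assumed example: a 1272 pixel wide map spanning 1,272,000 map units,
        // so one pixel covers 1000 map units.
        double pixelSize = 1_272_000.0 / 1272;
        // A 300 x 400 unit polygon fits in one pixel, a 5000 unit wide one does not.
        System.out.println(fitsInOnePixel(0, 0, 300, 400, pixelSize));
        System.out.println(fitsInOnePixel(0, 0, 5000, 400, pixelSize));
    }
}
```

When the check passes, a renderer can paint a single pixel rather than rasterizing the full polygon outline.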
In order to get the full benefit the styles must not use partial transparency, and I had
to fix one of the styles accordingly (I believe it was lakes, but I’m not 100% sure).
To have some comparisons, I’m reporting the numbers found in the other thread without any
change. The report contains the JVM and build details, and then average response time and throughput
(for those that did not follow, this is a 15 layers group, made of shapefiles,
and the output is a 1272x1261 pixels full color png, with 10 concurrent clients hitting GeoServer,
the benchmark is done on a core i7 820 CPU). All benchmarks are using the PNGJ encoder.
For those not familiar with benchmarking: the Oracle JDK 7 has a renderer that does not scale up,
so it is worth setting up a number of load balanced GeoServer instances in separate JVMs,
whilst OpenJDK has one that’s quite a bit slower, but that has no scalability issues, so one
normally sets up a single instance instead.
OpenJDK 7: , 3.3r/s
JDK 7: 2867ms, 3.9r/s
JDK 7, three instances load balanced with HA proxy: 1248ms, 7.9r/s
The above shows the benchmark is heavily bottlenecked by the drawing speed of the rendering subsystem
(so much that the JDK 7 renderer, Ductus, can outperform the OpenJDK 7 one, Pisces, even with 10
concurrent requests).
After fixing the “small polygons” optimization bug the results are:
OpenJDK 7: 2507ms, 4r/s
JDK 7: 2459ms, 4r/s
JDK 7, three instances load balanced with HA proxy: 1175ms, 8.4r/s
So, this optimization improves the rendering time of OpenJDK 7 and puts it on par with the
closed source JDK, and also improves the latter.
Then again, Jonathan has this data in Oracle… I did not want to venture there: loading
data there is a pain, the styles need fixing because of the uppercase attributes,
and honestly, why waste an open source developer’s spare time on closed source databases anyways?
So I loaded all the data in PostGIS instead and ran the tests again (tests run with the connection pool
locked at 10 connections):
OpenJDK 7: 2630ms, 3.8r/s
JDK 7: 2422ms, 4.1r/s
JDK 7, three instances load balanced with HA proxy: 1301ms, 7.6r/s
Hum… a bit slower. And then I’ve remembered what Paul Ramsey used to say about Mapnik using
ST_Simplify on the geometries to get a boost (reference, http://blog.cartodb.com/post/20163722809/speeding-up-tiles-rendering).
Which had been tried already in GeoServer without much of a benefit
in previous benchmarks (we already do the simplification on the java side), but… we did not have a map
with so many little polygons. So why not give it a kick? All one needs to do is to add the following in the postgis dialect:
    @Override
    public void encodeGeometryColumnSimplified(GeometryDescriptor gatt, String prefix, int srid,
            StringBuffer sql, Double distance) {
        // geography columns are simplified as they are, geometry ones are forced to 2D first
        boolean geography = "geography".equals(gatt.getUserData().get(
                JDBCDataStore.JDBC_NATIVE_TYPENAME));
        if (geography) {
            sql.append("encode(ST_AsBinary(ST_Simplify(");
            encodeColumnName(prefix, gatt.getLocalName(), sql);
            sql.append(", " + distance + ")),'base64')");
        } else {
            sql.append("encode(ST_AsBinary(ST_Simplify(ST_Force_2D(");
            encodeColumnName(prefix, gatt.getLocalName(), sql);
            sql.append("), " + distance + ")),'base64')");
        }
    }

    // advertise that the dialect can simplify geometries on the database side
    protected void addSupportedHints(Set<Hints.Key> hints) {
        hints.add(Hints.GEOMETRY_SIMPLIFICATION);
    }
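To see what the override above ends up sending to the database, here is a stand-alone sketch that builds the same SQL fragment without any GeoTools dependency. The class and method names are hypothetical, and quoting the column name with double quotes is an assumption about what `encodeColumnName` produces.

```java
// Stand-alone sketch: mirrors the string building of the dialect override, no GeoTools needed.
class SimplifySqlSketch {

    /** Builds the SELECT expression for a geometry column simplified on the database side. */
    static String simplifiedColumn(String column, double distance, boolean geography) {
        // Assumption: column names are wrapped in double quotes, as is usual in PostgreSQL.
        String quoted = "\"" + column + "\"";
        // Geometry columns are forced to 2D first, geography columns are passed as they are.
        String input = geography ? quoted : "ST_Force_2D(" + quoted + ")";
        return "encode(ST_AsBinary(ST_Simplify(" + input + ", " + distance + ")),'base64')";
    }

    public static void main(String[] args) {
        // Prints the fragment for a geometry column named the_geom and a 0.5 unit tolerance.
        System.out.println(simplifiedColumn("the_geom", 0.5, false));
    }
}
```

The distance is the simplification tolerance in the units of the data, so it is expected to grow as the map is zoomed out.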
And here are the results:
OpenJDK 7: 2014 ms, 4.9 r/s
JDK 7: 1694ms, 5.8r/s
JDK 7, three instances load balanced with HA proxy: 1046 ms, 9.4 r/s
Holy cow, on the single JVM setup that’s roughly a 50% throughput gain over the very first run (and the
gain in the 3 JVMs case is not to be thrown away either)…
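A quick way to check that figure, assuming the comparison is against the very first run (shapefiles, before any fix) rather than against the previous PostGIS run. The class name is only illustrative; the throughput values are copied from the results listed in this mail.

```java
// Illustrative helper: relative throughput gain between two benchmark runs.
class ThroughputGain {

    /** Gain in percent when going from `before` to `after` requests per second. */
    static double gainPercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // First run (shapefiles, unpatched) versus the ST_Simplify run on PostGIS.
        System.out.printf("OpenJDK 7: %.1f%%%n", gainPercent(3.3, 4.9));
        System.out.printf("JDK 7: %.1f%%%n", gainPercent(3.9, 5.8));
        System.out.printf("JDK 7, three instances: %.1f%%%n", gainPercent(7.9, 9.4));
    }
}
```

Measured against the plain PostGIS run instead (3.8 and 4.1 r/s) the single JVM gains are smaller, so the choice of baseline matters when quoting the number.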
Now, this is PostGIS specific (there is no
plain simplification in Oracle, only the topology preserving one is available, which
is supposedly more expensive than not doing it at all, at least it is in PostGIS),
and we need to verify what the overhead is when the geometries do not need simplification.
That said, how do you see this being enabled?
- always on, that could be a good one if the overhead when simplification is not really needed
  shows no regression
- have it enabled by a store parameter
- have it enabled with an SLD vendor parameter (to be used only when zoomed out)
Now… another thing that I suspected made MapServer and Mapnik competitive is that
they load the data from the database in a single kick, instead of paging through them like we do.
The approach is a double edged sword:
- on the bright side, a single communication with the db, and no need for the dbms to allocate
  and manage a server side cursor
- on the dark side, no freaking way to control how much data you’re loading into memory, OOM
  risk is there (I guess with a cgi/fastcgi approach you just don’t care, if one instance goes boom
  it just gets replaced automatically, worst thing that can happen you get into a swap storm)
Code wise the change is small: in the PostGIS dialect we just disable the use of transactions while
reading, which breaks PostgreSQL’s ability to use server side cursors, forcing it to return everything in
a single kick instead:
    @Override
    public boolean isAutoCommitQuery() {
        return true;
    }
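For context, this is the behaviour being exploited, sketched with plain JDBC rather than with the GeoTools data store code. As far as I know the PostgreSQL JDBC driver only pages through a cursor when the connection is not in autocommit mode and the statement has a positive fetch size (among other conditions, such as a forward-only result set); otherwise the whole result set is loaded in memory. The class and method names below are hypothetical, and `usesServerSideCursor` is a deliberately simplified model of that rule.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch, plain JDBC only: contrasts paged and "single kick" reads.
class FetchModeSketch {

    /**
     * Simplified model of the driver rule: rows are paged through a cursor only
     * inside a transaction (autocommit off) and with a positive fetch size.
     */
    static boolean usesServerSideCursor(boolean autoCommit, int fetchSize) {
        return !autoCommit && fetchSize > 0;
    }

    /** Runs the query with the given settings and counts the rows it returns. */
    static int countRows(Connection cx, String sql, boolean autoCommit, int fetchSize)
            throws SQLException {
        cx.setAutoCommit(autoCommit);
        int rows = 0;
        try (PreparedStatement ps = cx.prepareStatement(sql)) {
            ps.setFetchSize(fetchSize);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows++;
                }
            }
        }
        if (!autoCommit) {
            cx.commit(); // close the read transaction that was implicitly opened
        }
        return rows;
    }

    public static void main(String[] args) {
        // Paged read (transaction + fetch size) versus everything in one go (autocommit).
        System.out.println(usesServerSideCursor(false, 1000));
        System.out.println(usesServerSideCursor(true, 1000));
    }
}
```

With autocommit on, the fetch size stops mattering, which is the effect the dialect change relies on, at the price of holding the full result set in memory.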
Let’s see the results:
OpenJDK 7: 1726 ms, 5.8 r/s
JDK 7: 1688ms, 5.9r/s
JDK 7, three instances load balanced with HA proxy: 998 ms, 9.9 r/s
The setups that benefit the most are the CPU bottlenecked ones, which means, first OpenJDK,
and then the three instances of JDK 7, which are both using close to 100% CPU, whilst
a stand alone JDK 7 instance uses something like 50% CPU (the JVM wide lock in the Ductus renderer
is killing scalability).
The issue is, this approach is not really usable “always on”. Options I see (and suggestions
welcomed):
- test again and see if just increasing the fetch size helps (it’s now set to 1000 by default)
- add a hint that only the WMS renderer sets, that will enable this mode… this assumes
  the styles are always set to provide reasonable scale dependencies… and we should add
  some config to disable the usage of styles that are not associated to the layer
- other suggestions?
Now, for a final test: we know that the OpenJDK renderer is slow, but Laurent Bourges
has been providing Oracle some patches to make it faster, which have not been accepted so far.
However… I’ve made some tests and now have a jar that one can drop in a regular JDK 7
install, with just some of Laurent’s improvements, and enable by setting some JVM
parameters. How does this one fare? These are the results for a single OpenJDK 7
JVM, with all of the other GeoServer patches mentioned above also included:
OpenJDK 7 + optimized Pisces renderer: 1091 ms, 9.0 r/s
Not too bad uh?
The plan is to keep working on it a bit more before going public and releasing the jar for
everybody to play with.
Cheers
Andrea
–
== GeoSolutions will be closed for seasonal holidays from 23/12/2013 to 06/01/2014 ==
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.