[Geoserver-devel] Vector tiles pipeline: why simplifying before clipping?

Hi,
following a report of slow vector tile generation on the list, I’m looking at the vector tiles pipeline builder
and found that topology-preserving simplification is done before clipping.
The use case reported by the user has a single large polygon detailing the shape of Australia, islands included,
that contains 1.5 million vertices.

I’ve tried switching simplification and clipping, doing the clipping first, and the result seems to be much faster,
roughly 10 times on a simple “apachebench” benchmark.

Is there any correctness reason why simplification is done before clipping?
I’ve tried dumping the same vector tiles twice, with the two operations flipped, and the geometries look pretty
much the same. Only in one geometry do I get the last point repeated twice at the end of the ordinate list (this comes
out of the current code).
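
To illustrate why the order matters for speed, here is a minimal, self-contained Python sketch. It is not GeoServer or JTS code: `clip_ring_to_box` (a Sutherland-Hodgman box clip) and `douglas_peucker` (a plain, non topology-preserving simplifier) are hypothetical helpers, and the circle and tile extent are made-up test data. It only shows that clipping first shrinks the vertex list the simplifier has to walk; it does not reproduce the benchmark numbers above.

```python
import math

def clip_ring_to_box(ring, xmin, ymin, xmax, ymax):
    """Sutherland-Hodgman clip of a ring (closing point not repeated) to a box."""
    def clip(points, inside, cross):
        out = []
        for i, cur in enumerate(points):
            prev = points[i - 1]  # index -1 wraps around, closing the ring
            if inside(cur) != inside(prev):
                out.append(cross(prev, cur))  # edge crosses the clip boundary
            if inside(cur):
                out.append(cur)
        return out

    def at_x(x):
        return lambda p, q: (x, p[1] + (q[1] - p[1]) * (x - p[0]) / (q[0] - p[0]))

    def at_y(y):
        return lambda p, q: (p[0] + (q[0] - p[0]) * (y - p[1]) / (q[1] - p[1]), y)

    for inside, cross in [
        (lambda p: p[0] >= xmin, at_x(xmin)),
        (lambda p: p[0] <= xmax, at_x(xmax)),
        (lambda p: p[1] >= ymin, at_y(ymin)),
        (lambda p: p[1] <= ymax, at_y(ymax)),
    ]:
        ring = clip(ring, inside, cross)
    return ring

def douglas_peucker(points, tolerance):
    """Plain Douglas-Peucker on a polyline, iterative to avoid deep recursion."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        (x1, y1), (x2, y2) = points[first], points[last]
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        best_dist, best_idx = 0.0, -1
        for i in range(first + 1, last):
            px, py = points[i]
            if length == 0:  # closed ring: first and last point coincide
                d = math.hypot(px - x1, py - y1)
            else:
                d = abs(dy * (px - x1) - dx * (py - y1)) / length
            if d > best_dist:
                best_dist, best_idx = d, i
        if best_dist > tolerance:
            keep[best_idx] = True
            stack.append((first, best_idx))
            stack.append((best_idx, last))
    return [p for p, k in zip(points, keep) if k]

# Made-up data: a dense circle standing in for a large coastline polygon,
# and one tile extent that covers only a small part of it.
n = 20000
ring = [(100 * math.cos(2 * math.pi * k / n), 100 * math.sin(2 * math.pi * k / n))
        for k in range(n)]
tile = (60.0, 60.0, 110.0, 110.0)

clipped = clip_ring_to_box(ring, *tile)
simplified = douglas_peucker(clipped + clipped[:1], tolerance=0.5)

print("vertices fed to the simplifier without clipping:", len(ring))
print("vertices fed to the simplifier after clipping:  ", len(clipped))
print("vertices left after simplification:             ", len(simplified))
```

With simplification first, every tile request pays for the full vertex list; with clipping first, the simplifier only sees what falls inside the tile.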

Cheers
Andrea

···

== GeoServer Professional Services from the experts! Visit http://goo.gl/it488V for more information. == Ing. Andrea Aime @geowolf Technical Lead GeoSolutions S.A.S. Via di Montramito 3/A 55054 Massarosa (LU) phone: +39 0584 962313 fax: +39 0584 1660272 mob: +39 339 8844549 http://www.geo-solutions.it http://twitter.com/geosolutions_it ------------------------------------------------------- Con riferimento alla normativa sul trattamento dei dati personali (Reg. UE 2016/679 - Regolamento generale sulla protezione dei dati “GDPR”), si precisa che ogni circostanza inerente alla presente email (il suo contenuto, gli eventuali allegati, etc.) è un dato la cui conoscenza è riservata al/i solo/i destinatario/i indicati dallo scrivente. Se il messaggio Le è giunto per errore, è tenuta/o a cancellarlo, ogni altra operazione è illecita. Le sarei comunque grato se potesse darmene notizia. This email is intended only for the person or entity to which it is addressed and may contain information that is privileged, confidential or otherwise protected from disclosure. We remind that - as provided by European Regulation 2016/679 “GDPR” - copying, dissemination or use of this e-mail or the information herein by anyone other than the intended recipient is prohibited. If you have received this email by mistake, please notify us immediately by telephone or e-mail.

I can’t think of any reason to simplify and then clip as opposed to the reverse. In fact, I would expect cases where you’d get a slightly “better” answer by clipping first and then simplifying.

Ian

···

Ian Turton

Hi Andrea,

Generally, this seems like the much safer approach. I could imagine lines moving a little bit if you do things in the other order.

That said, is there any room to leverage the pre-generalized DataStore tricks along the way here? It seems like a nice technique for dealing with situations where there are millions of vertices.

Cheers,

Jim

···

_______________________________________________
Geoserver-devel mailing list
[Geoserver-devel@lists.sourceforge.net](mailto:Geoserver-devel@lists.sourceforge.net)
[https://lists.sourceforge.net/lists/listinfo/geoserver-devel](https://lists.sourceforge.net/lists/listinfo/geoserver-devel)

Hello,

I don’t think there’s any good reason for simplifying before clipping. It was probably just never tested or thought through. I do remember, though, that without topology-preserving simplification we ran into a number of issues with the resulting tiles, to my disappointment, knowing it’s so much slower than Douglas-Peucker.

Cheers,
Gabriel

···

Gabriel Roldán
Software Developer | Boundless
groldan@anonymised.com
@boundlessgeo

Yes indeed, unfortunately the vector tiles spec wants valid polygons, so no “fast lane” for us.
I had a quick look at tippecanoe’s code: it seems to use a plain Douglas-Peucker (DP), and I believe they then call
this library to clean up the result in post-processing: https://github.com/mapbox/wagyu
I guess that could be faster than the JTS topology-preserving DP. The spec does not specifically ask for topologically valid geometries, but it states “Polygon geometries MUST NOT have any interior rings that intersect and interior rings MUST be enclosed by the exterior ring.”
Maybe doing DP and then fixing the output would be faster, but I’m not sure JTS has a “make valid” operator (it did not, but who knows now)…
I cannot see one, but found this: https://locationtech.github.io/jts/javadoc/org/locationtech/jts/simplify/VWSimplifier.html
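
As an illustration of the kind of problem plain DP can cause, here is a small, self-contained Python sketch. It is not JTS or tippecanoe code: `douglas_peucker`, `point_in_ring` and `hole_enclosed` are hypothetical helpers, and the polygon is made up so that the hole sits under a small bump of the exterior ring. The enclosure test only looks at hole vertices, so it is a necessary check rather than a full validity test.

```python
import math

def douglas_peucker(points, tolerance):
    """Plain Douglas-Peucker on a polyline; recursion is fine for tiny inputs."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)

    def distance(p):
        if length == 0:  # closed ring: first and last point coincide
            return math.hypot(p[0] - x1, p[1] - y1)
        return abs(dy * (p[0] - x1) - dx * (p[1] - y1)) / length

    idx = max(range(1, len(points) - 1), key=lambda i: distance(points[i]))
    if distance(points[idx]) <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[: idx + 1], tolerance)
    right = douglas_peucker(points[idx:], tolerance)
    return left[:-1] + right

def point_in_ring(point, ring):
    """Ray-casting point-in-polygon test; ring is closed (first == last)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (x2 - x1) * (y - y1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def hole_enclosed(hole, exterior):
    """Vertex-only check: every hole vertex lies inside the exterior ring."""
    return all(point_in_ring(p, exterior) for p in hole)

# Made-up polygon: a square with a small bump at (5, 11) on its top edge,
# and a tiny triangular hole that lives inside that bump.
exterior = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (5.0, 11.0), (0.0, 10.0), (0.0, 0.0)]
hole = [(4.8, 10.3), (5.2, 10.3), (5.0, 10.6), (4.8, 10.3)]

# A tolerance larger than the bump height removes the bump vertex.
simplified = douglas_peucker(exterior, tolerance=1.5)

print("hole enclosed before simplification:", hole_enclosed(hole, exterior))
print("hole enclosed after simplification: ", hole_enclosed(hole, simplified))
```

Each ring is simplified on its own, so nothing stops the exterior ring from moving past a hole. That is the situation the quoted spec sentence forbids, and the reason a plain DP pass would need a clean-up step afterwards.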

I’ll prepare a pull request to switch the two ops in the meantime.

Cheers
Andrea

···


As of JTS 1.16, there isn’t a single makeValid call. :(

When I’ve heard it discussed, it sounds like makeValid functions are usually a bunch of tricks applied one after another to try and fix up a topology.

shrugs

Jim

···

Pull request available here:

https://github.com/geoserver/geoserver/pull/3246

Cheers
Andrea
