[Geoserver-devel] Efficient filtering of small polygons, and avoiding labeling issues with large ones

Hi,
working on the OSM-bright clone I’ve stumbled upon an interesting approach that both
avoids loading polygons that are too small and eliminates the issue of repeated
labels on very large polygons.

Here is the OSM-bright approach:

  • Precompute the area of the polygons (called way_area in the database)

  • Have two variables, pixel_width and pixel_height, providing the width and height of a pixel in meters (in OSM the data is both stored and rendered in 3857)

  • Use these to avoid loading polygons whose area is too small to contribute anything significant, e.g.:

      • “… AND way_area > 100*!pixel_width!::real*!pixel_height!::real” (used to load polygons for labelling purposes, only if they are at least 100 square pixels)

      • “… AND way_area > 0.01*!pixel_width!::real*!pixel_height!::real” (used to avoid loading polygons that are just too small, less than 0.01 square pixels)

  • Use these to avoid painting labels if the polygon they refer to becomes too big on screen (e.g., zooming in on a country, at some point you don’t want to add the country name anymore), combining:

      • “SELECT … way_area/NULLIF(!pixel_width!::real*!pixel_height!::real,0) AS way_pixels” in the query

      • “[zoom >= 3][way_pixels > 1000][way_pixels < 360000] {” in the style (in other words, paint the label if the polygon covers more than 1000 pixels, e.g. 100x10, and fewer than 360000, i.e. 600x600)

The approach using way_pixels is neat in that a pure scale dependency does not account for the relative size of countries, e.g., while one wants to hide “Germany” soon enough, at that same zoom level “Liechtenstein” (pick any city state here) still needs to be visible.
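To make the Germany/Liechtenstein case concrete, here is a minimal Python sketch of the way_pixels test. It assumes 256-pixel tiles in EPSG:3857 and uses very rough ground areas for the two countries (ignoring the Mercator stretch); the function names are made up for illustration and are not part of OSM-bright.

```python
import math

# Width of the EPSG:3857 world in meters (assumes the usual 6378137 m sphere radius).
WEB_MERCATOR_EXTENT_M = 2 * math.pi * 6378137.0
TILE_SIZE_PX = 256  # assumed tile size


def pixel_size_m(zoom):
    """Side of one square pixel, in EPSG:3857 meters, at the given zoom level."""
    return WEB_MERCATOR_EXTENT_M / (TILE_SIZE_PX * 2 ** zoom)


def way_pixels(way_area, pixel_width, pixel_height):
    """Mirror of way_area / NULLIF(pixel_width * pixel_height, 0)."""
    pixel_area = pixel_width * pixel_height
    if pixel_area == 0:
        return None  # same role as NULLIF: no division by zero
    return way_area / pixel_area


def label_visible(way_area, zoom, min_pixels=1000, max_pixels=360000):
    """Mirror of [zoom >= 3][way_pixels > 1000][way_pixels < 360000]."""
    size = pixel_size_m(zoom)
    pixels = way_pixels(way_area, size, size)
    return pixels is not None and zoom >= 3 and min_pixels < pixels < max_pixels


# Very rough areas in square meters, for illustration only.
GERMANY_AREA = 3.57e11
LIECHTENSTEIN_AREA = 1.6e8

for zoom in (5, 9):
    print(zoom, label_visible(GERMANY_AREA, zoom), label_visible(LIECHTENSTEIN_AREA, zoom))
```

With these rough numbers Germany gets its label at zoom 5 but not at zoom 9, while Liechtenstein is the other way around, which a single scale range shared by all countries could not express.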

So, I want to have something similar. A pure rendering-based approach would be to add a “max_polygon_area” vendor option, expressed in pixels (“goodness_of_fit” already takes care of the polygons that are too small).
However, this still loads everything and only later decides whether to draw or drop a given label. With the amount of data in OSM, it’s smarter to push this information down to the database.
That, in turn, requires assuming there is a pre-calculated area column in the db with a given name (computing the area on the fly is not super-fast, especially for complex polygons).

So instead, I’m thinking of adding a filter function that would take a target CRS and would return the area of a pixel in that CRS, which can then be used
in a filter, for example:

[(way_area / pixel_area('EPSG:3857')) between 1000 AND 360000] { label: … }

The data access code will recognize that pixel_area does not refer to any attribute, and will replace it with a static value
before running the query (so that the whole filter is evaluated in the dbms).
Implementation-wise, the pixel_area function will live in gs-wms and will leverage the existing wms_* env variables to do
a rough job (no need to make it precise, but it needs to be robust), like bbox area over pixel count, followed by a unit conversion
if necessary (using tables to get a rough degree-to-meters conversion).
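As a rough sketch of the idea (in Python rather than the eventual Java, and not the actual implementation): estimate the pixel area from the request bounding box and image size, presumably what the wms_* env variables provide, then substitute it into the filter as a constant. The function names and signatures here are hypothetical, and the degree conversion uses a cosine of the mid latitude instead of the lookup tables mentioned above.

```python
import math

# Length of one degree along the equator, in meters (6378137 m sphere radius).
METERS_PER_DEGREE = 6378137.0 * math.pi / 180.0


def pixel_area(bbox, width_px, height_px, bbox_in_degrees=False):
    """Rough area of one pixel: bbox extent divided by the image size.

    bbox is (minx, miny, maxx, maxy). When bbox_in_degrees is True the result
    is converted to approximate square ground meters (assumption: cosine of
    the mid latitude; this ignores the Mercator stretch of EPSG:3857 units).
    """
    minx, miny, maxx, maxy = bbox
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image width and height must be positive")
    pixel_w = (maxx - minx) / width_px
    pixel_h = (maxy - miny) / height_px
    if bbox_in_degrees:
        mid_lat = math.radians((miny + maxy) / 2.0)
        pixel_w *= METERS_PER_DEGREE * math.cos(mid_lat)
        pixel_h *= METERS_PER_DEGREE
    return abs(pixel_w * pixel_h)


def fold_pixel_area(area_column, min_pixels, max_pixels, px_area):
    """Replace the pixel_area(...) call with its value so the database only
    sees attributes and literals (hypothetical helper, SQL-like output)."""
    if px_area <= 0:
        raise ValueError("pixel area must be positive")
    return f"({area_column} / {px_area!r}) BETWEEN {min_pixels} AND {max_pixels}"


# A 1000 x 500 m box rendered at 100 x 50 pixels: each pixel covers 10 x 10 m.
area = pixel_area((0.0, 0.0, 1000.0, 500.0), 100, 50)
print(fold_pixel_area("way_area", 1000, 360000, area))
```

Since the value is fixed for the whole request, the filter that reaches the database contains only the way_area column and literals, so it can be encoded as a plain WHERE clause.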

What do you think?

Regards,

Andrea Aime

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

Certainly sounds like an interesting idea. I suspect that the devil is in the details, since we have to manage many projections and data sets, but your approach seems like it should work.

Ian

On 6 Aug 2017 1:06 p.m., “Andrea Aime” <andrea.aime@anonymised.com.> wrote:

Geoserver-devel mailing list
Geoserver-devel@anonymised.com.366…sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel