[Geoserver-users] Geoserver/Geofence issue - very slow response time

Hello all,

This is my first time using the mailing list. I have always been a fan of all the knowledge shared through this powerful tool. Also, I would like to thank you in advance for any time you may invest in this particular case.

I’m having an issue that, after some good time invested in investigation, seems related to the Geofence plugin.

We use a Geonode installation to serve datasets and maps.

While using the maps, some users occasionally observe a kind of system slowdown, until it stops responding at all. On some occasions, this might affect all users, however, most frequently only some users are affected.

After a vast analysis of any possible bottleneck regarding our infrastructure, we excluded that hypothesis. I will provide some details regarding the infrastructure later in this post.

At some point we’ve enabled the “audit.enable” option in the “monitor.properties” (Monitor plugin) and also elevated the logs to VERBOSE.

What we found out was that, for some reason, the Geofence plugin was taking a huge amount of time to get the response from “services.RuleReaderServiceImpl”.

Here is an extract from the geoserver.log:

2024-05-13 12:54:33.000000 INFO services.RuleReaderServiceImpl Requesting access for RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+]
2024-05-13 13:10:23.000000 DEBUG util.FilterUtils ADDED Rule[id:20358 pri:14322 srv:WMS ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 13:10:23.000000 DEBUG util.FilterUtils ADDED Rule[id:20365 pri:14329 ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 13:10:23.000000 DEBUG services.RuleReaderServiceImpl Filter RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+] is matching the following Rules:
2024-05-13 13:10:23.000000 DEBUG services.RuleReaderServiceImpl Role:ROLE_ANONYMOUS
2024-05-13 13:10:23.000000 DEBUG services.RuleReaderServiceImpl Role:ROLE_ANONYMOUS —> Rule[id:20358 pri:14322 srv:WMS ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 13:10:23.000000 DEBUG services.RuleReaderServiceImpl Role:ROLE_ANONYMOUS —> Rule[id:20365 pri:14329 ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 13:10:23.000000 DEBUG services.RuleReaderServiceImpl Filter RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+] on role ROLE_ANONYMOUS has access AccessInfoInternal[grant:ALLOW]
2024-05-13 13:10:23.000000 INFO services.RuleReaderServiceImpl Returning AccessInfo[grant:ALLOW admin:false] for RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+]

The Geofence took more than 15 minutes to process the rule validation (from 12h54 to 13h10).

However, the same request, just a second before ran just fine:

2024-05-13 12:53:15.000000 INFO services.RuleReaderServiceImpl Requesting access for RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+]
2024-05-13 12:53:15.000000 DEBUG util.FilterUtils ADDED Rule[id:20358 pri:14322 srv:WMS ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 12:53:15.000000 DEBUG util.FilterUtils ADDED Rule[id:20365 pri:14329 ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 12:53:15.000000 DEBUG services.RuleReaderServiceImpl Filter RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+] is matching the following Rules:
2024-05-13 12:53:15.000000 DEBUG services.RuleReaderServiceImpl Role:ROLE_ANONYMOUS
2024-05-13 12:53:15.000000 DEBUG services.RuleReaderServiceImpl Role:ROLE_ANONYMOUS —> Rule[id:20358 pri:14322 srv:WMS ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 12:53:15.000000 DEBUG services.RuleReaderServiceImpl Role:ROLE_ANONYMOUS —> Rule[id:20365 pri:14329 ws:geonode l:trecho_ltm_validado acc:ALLOW]
2024-05-13 12:53:15.000000 DEBUG services.RuleReaderServiceImpl Filter RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+] on role ROLE_ANONYMOUS has access AccessInfoInternal[grant:ALLOW]
2024-05-13 12:53:15.000000 INFO services.RuleReaderServiceImpl Returning AccessInfo[grant:ALLOW admin:false] for RuleFilter[user:“anonymous”+ role:ANY inst:name+:default-gs ip:“177.174.226.207”+ serv:“WMS”+ req:“GETMAP”+ sub:ANY ws:“geonode”+ layer:“trecho_ltm_validado”+]

Just to be clear, this affects not only one request per occurrence, but several requests endure the same issue as demonstrated above.

We checked that everything that happened after the Geofence clears the authorization process, runs always very smoothly. The Control Flow plugin fires up and blocks some requests but it is only because they all got retained in the Geofence plugin in the first place. So, once they are all freed up from Geofence, the Control Flow plugin does its job.

We can visualize how this is affecting user experience by inspecting the average response time graphic below (extracted from the Apache server in front of the Geonode installation). So, we might detect the date and time the application registered any issue, and then we proceed with the log investigation (geoserver.log and geoserver_audit.log). Each time, the pattern below was identified, the Geofence plugin was slowing down some requests.

image.png

This is happening 2 to 3 times per day, every day, with rare exceptions of no occurrence over weekends.

There aren’t that many requests processed per second, as shown below:

image.png

Here are some details regarding the installation and the infrastructure:

  • GeoServer Version: 2.23.0
  • JVM Version Oracle Corporation: 11.0.16 (OpenJDK 64-Bit Server VM)
  • Operating system (host machine): Red Hat Enterprise Linux release 8.9 (Ootpa)

Also, important to note that it’s a Geonode “docker compose” installation, following Geonode best practices.

The Geoserver has 82 Layers, 0 Layer groups, 11 Stores and 1 Workspace. The Geofence plugin has 3379 Data Rules.

The Geofence is running against a Postgres database, as recommended for a production environment. The image in the link below provides some of the current settings:

https://drive.google.com/file/d/1cstSfZUMXHw113XlUFc2L-WwS0RW5l20/view?usp=sharing

The host machine has 8 CPU, 32 Gb of ram and the JVM is set up to 16Gb of ram (set up as -Xms16G and -Xmx16G and +UseG1GC).

The link below provide some logs (geoserver.log and geoserver_audit.log) for two of the occurrences lastly analysed:

https://drive.google.com/drive/folders/1TRkysrQmAKoB6ZeERYh5YqDjdFZ-fx9G?usp=sharing

Thank you again, and any support is more than welcome.

Cesar Rafael Lopes