[GeoNetwork-devel] Weird deadlock in the Geonetwork database

Hi List,

I just wanted to post about a weird GeoNetwork problem I encountered the other day, which I think was caused by a very unusual circumstance. Hopefully, by posting it here, anyone else who encounters the problem will find it!

I have a GeoNetwork 3.10 deployment on AWS ECS with a PostgreSQL database on AWS RDS, which has been running very well for months with no problems. Last week the customer reported that they couldn’t save edits to metadata records. I investigated and could see the same problem, along with generally slow performance. The GeoNetwork log files complained about Java heap space, even though the heap size was set to a sensible amount for the server. I redeployed with more memory, but the problem didn’t go away. We restarted the server, the load balancer and the RDS instance, but things didn’t improve. We considered whether someone had tried to hack the server, but couldn’t see any sign of that.

Since the problem seemed related to saving changes back to the database, I looked into the RDS logs. There were no obvious errors, but there were statements about ‘error while locking tuple(1,77) in relation “settings”’. Once I figured out what that meant, I looked in the settings table and saw that the values for the proxy username and password were wrong (they shouldn’t be filled in at all). Not only were they wrong, but they were different to what I could see when I looked in the admin console user interface in the browser. I updated the database table to set the values back to blank, and the performance of the server recovered.
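For anyone who hits the same log message: as far as I understand it, the pair in `tuple(1,77)` is PostgreSQL's physical row identifier (block number, item number), which is exposed as the `ctid` system column. The sketch below is only an illustration of how that log line could be turned into a lookup query. The function name `ctid_query` is made up for this example, and a row's `ctid` can change after an update or a `VACUUM FULL`, so the lookup is only useful while the row is still stuck.

```python
import re

# Matches the log fragment quoted above. It tolerates an optional space before
# the parenthesis and either straight or curly double quotes around the name.
LOCK_LINE = re.compile(
    r'locking tuple\s*\((\d+),\s*(\d+)\) in relation\s+["“]([A-Za-z_][A-Za-z0-9_]*)["”]'
)


def ctid_query(log_line):
    """Return a SELECT that fetches the row named in a 'locking tuple' log line.

    Hypothetical helper for illustration. Returns None if the line does not
    match. The relation name is restricted to plain identifier characters by
    the regex, so it is safe to interpolate into the query text.
    """
    match = LOCK_LINE.search(log_line)
    if match is None:
        return None
    block, offset, relation = match.groups()
    return f"SELECT ctid, * FROM {relation} WHERE ctid = '({block},{offset})';"


if __name__ == "__main__":
    print(ctid_query('error while locking tuple(1,77) in relation "settings"'))
```

Running the resulting query in `psql` against the affected database should show which settings row the blocked sessions were waiting on.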

I have occasionally noticed the proxy username and password being auto-filled with the admin user’s credentials before now. The only conclusion I can come to is that two admin users had the credentials auto-filled by their browsers without realising, and then both tried to save a genuine change at exactly the same time. That caused a deadlock in the database, which was only cleared when I manually updated the table.

If my theory about the cause is right, should we be using autocomplete=off in the admin settings form to try to avoid this happening in future?

All the best

Jo


Jo Cook
t:+44 7930 524 155/twitter:@archaeogeek
Please note that currently I do not work on Friday afternoons. For urgent responses at that time, please visit support.astuntechnology.com or phone our office on 01372 744009

Hi Jo

I think any password field should have autocomplete=off.
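As a rough way to check a form for this, the sketch below scans a piece of HTML for credential-like inputs that do not disable autofill. It is only an illustration: the class and function names and the `proxyUsername` / `proxyPassword` field names are hypothetical and not taken from the GeoNetwork templates. It also accepts `autocomplete="new-password"`, since browsers often ignore `off` on password fields.

```python
from html.parser import HTMLParser


class AutocompleteAudit(HTMLParser):
    """Collect <input> fields that look like credentials but allow autofill."""

    def __init__(self, credential_hints=("username", "password")):
        super().__init__()
        self.credential_hints = credential_hints  # substrings that mark a credential field
        self.unprotected = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        # Attribute values can be None (valueless attributes), hence the "or ''".
        name = (attributes.get("name") or attributes.get("id") or "").lower()
        input_type = (attributes.get("type") or "").lower()
        autocomplete = (attributes.get("autocomplete") or "").lower()
        is_credential = input_type == "password" or any(
            hint in name for hint in self.credential_hints
        )
        if is_credential and autocomplete not in ("off", "new-password"):
            self.unprotected.append(name or "<unnamed>")


def find_unprotected_inputs(html):
    """Return the lowercased names of credential-like inputs that allow autofill."""
    audit = AutocompleteAudit()
    audit.feed(html)
    audit.close()
    return audit.unprotected


if __name__ == "__main__":
    # Hypothetical form fragment, not the real admin settings markup.
    form = (
        '<form><input type="text" name="proxyUsername">'
        '<input type="password" name="proxyPassword" autocomplete="new-password">'
        '<input type="text" name="host"></form>'
    )
    print(find_unprotected_inputs(form))
```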

Regards,
Jose García


Vriendelijke groeten / Kind regards,

Jose García


Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664
