[GeoNetwork-devel] Memory leak on geonetwork master and maybe 2.8

Hi,

I am aware that this email comes at a bad time :-(. But I only discovered this issue the day before yesterday (Wednesday) and only fixed it yesterday evening (Thursday).

Here is the story:

I recently (Tuesday) added a long-term performance test on the Geocat.ch integration server (which is based on the latest geonetwork-core code). The test consists of a JMeter suite that runs every hour and hammers the system with several hundred requests in only a couple of minutes. After about 10 hours the system would crash. The cause was a memory leak in the handling of Lucene searchers.

It turned out that the code that purges old searchers was never being called. As a result, every searcher ever created was kept in memory, along with the full Lucene index for that version. Since the index is updated every time a metadata record is accessed, it changes frequently, and so new searchers are created frequently as well. I was finding hundreds or thousands of searchers held in memory.
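To illustrate the failure mode described above, here is a minimal sketch in plain Java. It is not GeoNetwork's code: `VersionedSearcher`, `SearcherTracker`, `onIndexUpdated` and `purgeOld` are hypothetical names, and a real implementation would also have to avoid closing a searcher that is still serving a request. The point is only that one searcher is opened per index version, so the set of open searchers stays bounded only if the purge step actually runs.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a searcher bound to one index version. */
final class VersionedSearcher implements AutoCloseable {
    final long version;
    private boolean closed;

    VersionedSearcher(long version) { this.version = version; }

    boolean isClosed() { return closed; }

    @Override
    public void close() { closed = true; } // a real searcher would free its index here
}

/** Keeps one searcher per index version and purges all but the newest few. */
final class SearcherTracker {
    private final Deque<VersionedSearcher> open = new ArrayDeque<>();
    private final int maxKept;
    private long nextVersion = 0;

    SearcherTracker(int maxKept) { this.maxKept = maxKept; }

    /** Called whenever the index changes: opens a searcher for the new version. */
    VersionedSearcher onIndexUpdated() {
        VersionedSearcher s = new VersionedSearcher(nextVersion++);
        open.addLast(s);
        purgeOld(); // if this call is never made, 'open' grows without bound
        return s;
    }

    /** Closes and drops the oldest searchers until at most maxKept remain. */
    private void purgeOld() {
        while (open.size() > maxKept) {
            open.removeFirst().close();
        }
    }

    int openCount() { return open.size(); }
}
```

With the `purgeOld()` call in place, a thousand index updates leave only `maxKept` searchers open; without it, all thousand would stay reachable, which matches the symptom of hundreds or thousands of searchers held in memory.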

I have made a series of commits to master that fix this issue, but I have not tested the 2.8 branch.

My Question:

What should I do about the 2.8 release in light of this issue? I think the 2.8 branch is currently used by people, so under normal/light loads this might not be a big issue. But in my experience, under medium/high loads it is a critical issue.

Jesse

Hi Jesse

It seems the 2.8 release didn’t happen yesterday after all.

From your comments, this seems to be a critical issue that prevents GeoNetwork from being used properly in production. I would therefore propose committing the fix to the 2.8.x branch and doing some testing on it. Probably the best option is to delay the 2.8.0 release until next week and inform the users list about this decision.

But I would also like to hear the opinions of others.

Regards,
Jose García

On Fri, Feb 15, 2013 at 8:13 AM, Jesse Eichar <jesse.eichar@anonymised.com> wrote:

[...]

GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork


GeoCat Bridge for ArcGIS allows instant publishing of data and metadata on GeoServer and GeoNetwork. Visit http://geocat.net for details.


Jose García
GeoCat bv
Veenderweg 13
6721 WD Bennekom
The Netherlands
http://GeoCat.net

Tricky - I suppose it only affects those with a high number of requests per day, and it can be worked around by running GeoNetwork in its own Tomcat/Jetty instance and restarting it regularly (something which is not uncommon for many web apps)? Is the 2.8.0 release actually done and ready to roll? If so, then the bug report has come after the 2.8.0 release anyway. If 2.8.0 is ready, then I think we just go with it and work on getting 2.8.1 out ASAP.

Cheers,
Simon

Dear all,

Maybe off-topic, but just in case it’s not:
Jesse mentioned that “the code that purges old searchers was never being called”. Could there be a link to the “Too many open files error” thread we had in Oct/Nov 2012?

At the end of November, I thought the patches you committed to both 2.8 and 2.6.5 had solved the issue on our 2.6.5 version (see: http://osgeo-org.1560.n6.nabble.com/GN-2-6-4-Java-1-7-Too-many-open-files-error-tt4982753.html#a5019290).

But GN has progressively moved back to its old behaviour, and we again need to restart it on a weekly basis to avoid the “Too many open files error”.

Cheers,
Sylvain

The “too many open files” issue was fixed (we think) by using searcher managers. However, making that strategy work with the multilingual indexing was non-trivial, and it was this implementation that had the bug I mentioned. In this new case it is a memory leak, not a resource leak.
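For readers unfamiliar with the searcher-manager idea, the following is a rough plain-Java sketch of the acquire/release pattern it relies on. It is an illustration under assumptions, not GeoNetwork's multilingual implementation and not Lucene's code: `RefCountedSearcher` and `SimpleSearcherManager` are hypothetical classes, although Lucene's own `SearcherManager` exposes `acquire()` and `release()` in the same spirit. Each searcher is reference counted, so a refresh can swap in a new searcher while the old one is closed only after its last user releases it.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical reference-counted searcher for one index version. */
final class RefCountedSearcher {
    final long version;
    // Starts at 1: the reference held by the manager itself.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCountedSearcher(long version) { this.version = version; }

    /** Takes a reference unless the searcher is already closed. */
    boolean tryIncRef() {
        int n;
        do {
            n = refs.get();
            if (n == 0) return false; // closed; the caller should retry on the current one
        } while (!refs.compareAndSet(n, n + 1));
        return true;
    }

    /** Drops a reference; at zero the underlying resources would be freed. */
    void decRef() { refs.decrementAndGet(); }

    boolean isClosed() { return refs.get() == 0; }
}

/** Hands out the current searcher and retires old ones on refresh. */
final class SimpleSearcherManager {
    private final AtomicReference<RefCountedSearcher> current =
            new AtomicReference<>(new RefCountedSearcher(0));

    RefCountedSearcher acquire() {
        while (true) {
            RefCountedSearcher s = current.get();
            if (s.tryIncRef()) return s;
        }
    }

    void release(RefCountedSearcher s) { s.decRef(); }

    /** Swaps in a searcher for the next version and drops the manager's hold on the old one. */
    synchronized void refresh() {
        RefCountedSearcher old = current.get();
        current.set(new RefCountedSearcher(old.version + 1));
        old.decRef();
    }
}
```

Callers would pair every `acquire()` with a `release()` in a `finally` block. If the manager's own reference to an old searcher is never dropped, that searcher stays alive indefinitely, which is the kind of bug described above.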

Jesse

On Fri, Feb 15, 2013 at 10:20 AM, Sylvain GRELLET <s.grellet@anonymised.com> wrote:

[...]

OK.
I got it wrong, sorry.
Indeed, having many open file handles won’t consume that much memory.

Sylvain

Hi Jesse,

Do you know if this issue affects memory usage when harvesting?

Many thanks,

Brian.

On 15/02/13 07:33, Jose Garcia wrote:

[...]