[GeoNetwork-devel] CSW SearchController and CatalogSearcher

Hi,

I set up a JMeter test suite to stress test GeoNetwork a little and I have found some severe issues with the CSW support. I will try to explain the issue:

User 1 makes a CSW request
CatalogSearcher (using LuceneSearcher) begins processing the request

User 2 makes a different CSW request
CatalogSearcher closes the index reader for the last request and opens a new reader

LuceneSearcher explodes because its index reader has been closed.

I think it is pretty clear what the problem is here, but the solution is not as simple, since we cannot leave requests open.
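To make the sequence above concrete, here is a minimal sketch of the failure mode. It does not use Lucene or the GeoNetwork classes: FakeReader and SharedCatalogSearcher are hypothetical stand-ins, and the two requests are interleaved by hand rather than with real threads so the result is reproducible.

```java
// Hypothetical stand-in for a Lucene index reader: it only tracks whether it was closed.
final class FakeReader {
    private boolean closed = false;

    void close() {
        closed = true;
    }

    int search(String query) {
        if (closed) {
            throw new IllegalStateException("reader is closed");
        }
        return query.length(); // dummy result, just to have something to return
    }
}

// Hypothetical stand-in for the shared searcher: one cached reader for all requests.
final class SharedCatalogSearcher {
    private FakeReader current;

    // Every new request closes the previous reader, even if it is still in use.
    FakeReader begin() {
        if (current != null) {
            current.close();
        }
        current = new FakeReader();
        return current;
    }
}

final class SharedReaderRaceDemo {
    // Returns true if request 1 fails once request 2 has started.
    static boolean requestOneFails() {
        SharedCatalogSearcher searcher = new SharedCatalogSearcher();
        FakeReader r1 = searcher.begin(); // user 1 starts
        FakeReader r2 = searcher.begin(); // user 2 starts, which closes r1
        r2.search("request 2");
        try {
            r1.search("request 1");       // user 1 is still working with r1
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("request 1 failed: " + requestOneFails());
    }
}
```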

One possible solution I have considered is:

Solution 1:

User 1 makes a CSW request
CatalogSearcher (using LuceneSearcher) begins processing request 1

User 2 makes a different CSW request
CatalogSearcher places the index reader in a “toclose” set, opens a new reader and begins request 2 with a LuceneSearcher

CatalogSearcher finishes request 1
CatalogSearcher checks if the index reader is in the “toclose” set. It is in the set, so it closes the reader

CatalogSearcher finishes request 2
CatalogSearcher checks if the index reader is in the “toclose” set. It is not in the set, so it leaves the reader open.
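The steps of Solution 1 could be sketched roughly as follows. This is only an illustration of the idea, not the code in GeoNetwork: TrackedReader and ToCloseCatalogSearcher are made-up names, and the reader is a stub that only records whether it was closed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stub reader that records whether close() was called.
final class TrackedReader {
    private boolean closed = false;

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

// Hypothetical searcher that defers closing a superseded reader until its request ends.
final class ToCloseCatalogSearcher {
    private final Set<TrackedReader> toClose = new HashSet<>();
    private TrackedReader current;

    // A new request supersedes the cached reader instead of closing it right away.
    synchronized TrackedReader begin() {
        if (current != null) {
            toClose.add(current);
        }
        current = new TrackedReader();
        return current;
    }

    // When a request ends, close its reader only if it has been superseded.
    synchronized void finish(TrackedReader reader) {
        if (toClose.remove(reader)) {
            reader.close();
        }
    }
}
```

In this sketch a reader whose request has already finished when the next request arrives is still added to the set, and nothing calls finish for it again, so it stays open.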

Any thoughts or considerations on this issue?

Jesse

My solution won’t work because there is a case where the reader will not be closed.

Personally I would like to remove the caching completely and have it stateless but there is some API that uses it. getAllUuids (used by select all) uses the cached readers for performing the select.

Probably a better solution would be to store the catalog searcher (or the searchController) on the user session.

My issue with that is the memory it uses.

A modification to this would be to have catalog searcher only cache the query as a field and open the searcher as needed. I think this should be sufficiently fast for most purposes.
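As a rough sketch of that modification: the searcher keeps only the last query and opens a short-lived reader for every call, including the select-all case. Everything here is assumed for illustration (the class names, the map standing in for the index, the substring "query", and the getAllUuids signature); only the idea of caching the query rather than the reader comes from the paragraph above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical short-lived reader over an in-memory "index" of uuid -> text.
final class ShortLivedReader implements AutoCloseable {
    private final Map<String, String> index;
    private boolean closed = false;

    ShortLivedReader(Map<String, String> index) {
        this.index = index;
    }

    List<String> uuidsMatching(String term) {
        if (closed) {
            throw new IllegalStateException("reader is closed");
        }
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : index.entrySet()) {
            if (entry.getValue().contains(term)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical searcher whose only state between calls is the last query.
final class QueryCachingSearcher {
    private final Map<String, String> index;
    private String lastQuery;

    QueryCachingSearcher(Map<String, String> index) {
        this.index = index;
    }

    // Runs the query with a fresh reader and returns at most maxHits uuids.
    List<String> search(String term, int maxHits) {
        lastQuery = term;
        try (ShortLivedReader reader = new ShortLivedReader(index)) {
            List<String> hits = reader.uuidsMatching(term);
            return new ArrayList<>(hits.subList(0, Math.min(maxHits, hits.size())));
        }
    }

    // Re-runs the cached query with a fresh reader instead of reusing a cached one.
    List<String> getAllUuids() {
        if (lastQuery == null) {
            return new ArrayList<>();
        }
        try (ShortLivedReader reader = new ShortLivedReader(index)) {
            return reader.uuidsMatching(lastQuery);
        }
    }
}
```

The trade-off is that the query runs again for select all, in exchange for never holding a reader between requests.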

Jesse


I have some fixes. Look at:

https://github.com/jesseeichar/core-geonetwork/compare/master…bug;csw_getrecords_parallelism

to see the changes I made.

With these changes I have been able to run 10 users simultaneously making csw requests, both with bbox and without.

The Trac ticket is:

http://trac.osgeo.org/geonetwork/ticket/1073

Jesse


Hi Jesse,

It's the same (or a similar) problem as with how we handle search in the user interface, isn't it? You have to keep the IndexReader open in the session because the user/CSW session could return and continue the search for the next page/set of records. It's a function of the lifecycle management we use for the IndexReader, and it often results in an IndexReader being left open indefinitely, which then causes other problems: open-file resource depletion (caused by deleted files being left open when Lucene uses an IndexWriter on the index, e.g. when reindexing a record, which I've always thought of as a bug in Lucene) and maybe memory usage.

One way around this is to use the lifecycle management for the IndexReader introduced in Lucene 3.6 whereby you get a token that you can store in the session and you don't need to keep the IndexReader open/in session. There is also the capability to run a background task that prunes out moribund search sessions.
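The token idea can be sketched without Lucene as below. I believe the Lucene class meant here is SearcherLifetimeManager, but this sketch does not call it: SearcherLifetimeSketch and its method signatures are assumptions made for illustration, the "searcher" is just a generic object, and closing of pruned searchers and reference counting are left out.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical token registry: the session stores a long, not the searcher itself.
final class SearcherLifetimeSketch<S> {

    // A recorded searcher together with the time it was recorded.
    private static final class Recorded<T> {
        final T searcher;
        final long recordedAtMillis;

        Recorded(T searcher, long recordedAtMillis) {
            this.searcher = searcher;
            this.recordedAtMillis = recordedAtMillis;
        }
    }

    private final Map<Long, Recorded<S>> live = new ConcurrentHashMap<>();
    private final AtomicLong nextToken = new AtomicLong(1);

    // Registers a searcher and returns the token to keep in the user session.
    long record(S searcher, long nowMillis) {
        long token = nextToken.getAndIncrement();
        live.put(token, new Recorded<>(searcher, nowMillis));
        return token;
    }

    // Returns the searcher for a follow-up request, or null if it has been pruned.
    S acquire(long token) {
        Recorded<S> entry = live.get(token);
        return entry == null ? null : entry.searcher;
    }

    // Drops entries older than maxAgeMillis and returns how many were removed.
    // A background task could call this periodically.
    int prune(long nowMillis, long maxAgeMillis) {
        int removed = 0;
        Iterator<Recorded<S>> it = live.values().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().recordedAtMillis > maxAgeMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

When acquire returns null the caller would have to run the search again from scratch, which is the price of pruning moribund sessions.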

Cheers,
Simon

Do you have a patch for the required changes for lucene 3.6 somewhere?

I have just used JMeter to test the system and my proposed solution doesn’t seem to have much of a negative effect from a performance POV. So I would like to commit this solution for now. For the Lucene 3.6 fix, it will be trivial to migrate my solution to the token-based solution that you mention.

Shall I commit?

Jesse


Hi Jesse,

I think you should commit yours now, as it's a critical fix for 2.8.x. The Lucene 3.6 change is better suited for trunk. I do have a patch, but it is for an old version of GeoNetwork (2.4/2.6.x) and predates the Lucene multi-language indexing changes. I'll adapt it and we can use it for trunk.

Speedy work on the fix - good one!

Cheers and thanks,
Simon


I have put it on the 2.8.x and master branches. I think it would be good on 2.6.x as well. I will quickly evaluate the amount of work needed to do that.

Jesse

On Tue, Sep 25, 2012 at 5:02 PM, Jeroen Ticheler <jeroen.ticheler@anonymised.com> wrote:

Hi Jesse,
Good work! Is this something we should also port and commit to 2.6.x and make a bug fix release for that as well!? 2.6.5 is something that we should have released anyway in my opinion.
Cheers,
Jeroen





GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork

It turned out not to be too hard to backport. I just pushed the changes to 2.6.x as well.

Jesse


Wouldn’t it be an idea to start using SearcherManager in 2.9?

http://blog.mikemccandless.com/2011/09/lucenes-searchermanager-simplifies.html
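For readers who have not seen the pattern, the core of it is acquire/release with reference counting, so a reader is only closed once the last request using it has released it. The sketch below is a stdlib-only illustration of that pattern under assumed names (CountedReader, SearcherManagerSketch, refresh); it is not the Lucene SearcherManager class and does not reproduce its API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reader that closes itself when its reference count drops to zero.
final class CountedReader {
    // Starts at 1: the manager's own reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    // Takes a reference unless the reader has already been fully released.
    boolean tryIncRef() {
        while (true) {
            int count = refCount.get();
            if (count <= 0) {
                return false;
            }
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
    }

    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            closed = true;
        }
    }

    boolean isClosed() {
        return closed;
    }
}

// Hypothetical manager: requests acquire the current reader and release it when done.
final class SearcherManagerSketch {
    private volatile CountedReader current = new CountedReader();

    CountedReader acquire() {
        while (true) {
            CountedReader reader = current;
            if (reader.tryIncRef()) {
                return reader;
            }
            // The reader was swapped out and fully released in between; retry.
        }
    }

    void release(CountedReader reader) {
        reader.decRef();
    }

    // Swaps in a new reader; the old one closes once its last user releases it.
    synchronized void refresh() {
        CountedReader old = current;
        current = new CountedReader();
        old.decRef(); // drop the manager's reference to the old reader
    }
}
```

With this shape, a second request triggering a refresh no longer pulls the reader out from under the first request.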

Kind regards
Heikki Doeleman
