[Geoserver-users] More Geoserver 2.8 debugging - please help! Someone!

Hi list

Ok, let me break this down. I have an Oracle Spatial database, which currently is home to about 6000 spatial tables. I’m currently running a production 2.6.2 Geoserver, which I’m still unable to upgrade to the latest 2.8.1 version.

The spatial tables are created in sets, where a specific user is then given access to the set through WFS. Until now, I have chosen to create a separate data store each time for these sets for easier administration.

But now, due to my previous failed attempts to get the server to start under 2.8.x with my current data directory, I’ve decided to try changing strategy: in my new data directory, I now only have 1 single data store, ie. one single connection to the database. This data store is at this moment home to almost 6000 layers, I tediously copied over from the previous configuration.

Now, I can actually get the server to boot - but it takes forever! Take a look at this log: in my latest run, the server took 1 hour and 10 minutes to start!

INFO | jvm 1 | 2015/12/09 16:14:31 | 09 dec 16:14:31 INFO [gwc.config] - Initializing GeoServer specific GWC configuration from gwc-gs.xml
INFO | jvm 1 | 2015/12/09 17:24:17 | 09 dec 17:24:17 INFO [ows.OWSHandlerMapping] - Mapped URL path [/gwc/service/**] onto handler ‘dispatcher’

Between those 2 lines, I have tried analyzing what’s actually going on. In the waiting time, I’ve run the following script against the database to analyze deeper what’s going on:

SELECT sid
, serial#
, username
, ( SELECT max( sql_text ) FROM v$sql sq WHERE sq.sql_id = se.sql_id ) AS sql_text
, ( SELECT object_name FROM dba_procedures WHERE object_id = plsql_entry_object_id AND subprogram_id = 0) AS plsql_entry_object
, ( SELECT procedure_name FROM dba_procedures WHERE object_id = plsql_entry_object_id AND subprogram_id = plsql_entry_subprogram_id) AS plsql_entry_subprogram
, ( SELECT object_name FROM dba_procedures WHERE object_id = plsql_object_id AND subprogram_id = 0) AS plsql_entry_object
, ( SELECT procedure_name FROM dba_procedures WHERE object_id = plsql_object_id AND subprogram_id = PLSQL_SUBPROGRAM_ID) AS plsql_entry_subprogram
, se.*
FROM v$session se
WHERE username = ‘(geoserver oracle user)’
– AND se.status = ‘ACTIVE’
– AND sid = 397
– AND plsql_entry_object_id IS NOT NULL
ORDER BY se.sid;

This query returns the queries currently being run against the database. An example of a query run in this period is here:

SELECT META.SDO_LAYER_GTYPE FROM ALL_INDEXES INFO INNER JOIN MDSYS.ALL_SDO_INDEX_METADATA META ON INFO.INDEX_NAME = META.SDO_INDEX_NAME WHERE INFO.TABLE_NAME = :1 AND REPLACE(meta.sdo_column_name, ‘"’) = :2

Can somebody please(!!) explain what’s going on here? Have anyone ever tested Geoserver against an Oracle 12c database? It seems to me, that it’s trying to validate all database tables connected to the Oracle layers. That’s fine by me - but not if it now makes the server take forever to start. Honestly, even though it’s a large amount of layers, it really shouldn’t take as much as 70 minutes to validate 6000 Oracle tables. It seems like you’re running some insanely heavy queries against the database at server startup.

Can this eventually be disabled by disabling the option “validate connections” on the data store? What exactly does this option do?

I really hope to hear from someone knowledgable about this.

Thanks in advance,

Jacob Madsen

You ask “Have anyone ever tested Geoserver against an Oracle 12c database?” and the answer is probably no! - We routinely produce Release Candidates for new releases of GeoServer and ask people to test them (especially if they run exotic databases like Oracle that the dev team doesn’t necessarily have the time, resources or inclination to test). So unless you stepped up and tested what sounds like a (to you) mission critical element then I doubt anyone else did.

So the short answer is save yourself the hassle of using an expensive DB that is known to not work well with spatial data and transfer to using PostGIS, feel free to think about contributing some of the license fee you save to GeoServer development.

···

The longer more complex answer is work with your debugger and Oracle DBAs to optimise the GeoTools Oracle datastore and help us fix these issues and in future test the RC releases so that you find out about an issue before we are in a release/maintenance period (and christmas) so that your fixes will take a month or two to filter out.

Ian

On 9 December 2015 at 16:49, Jacob Overgaard Madsen <u2jacob@anonymised.com> wrote:

Hi list

Ok, let me break this down. I have an Oracle Spatial database, which currently is home to about 6000 spatial tables. I’m currently running a production 2.6.2 Geoserver, which I’m still unable to upgrade to the latest 2.8.1 version.

The spatial tables are created in sets, where a specific user is then given access to the set through WFS. Until now, I have chosen to create a separate data store each time for these sets for easier administration.

But now, due to my previous failed attempts to get the server to start under 2.8.x with my current data directory, I’ve decided to try changing strategy: in my new data directory, I now only have 1 single data store, ie. one single connection to the database. This data store is at this moment home to almost 6000 layers, I tediously copied over from the previous configuration.

Now, I can actually get the server to boot - but it takes forever! Take a look at this log: in my latest run, the server took 1 hour and 10 minutes to start!

INFO | jvm 1 | 2015/12/09 16:14:31 | 09 dec 16:14:31 INFO [gwc.config] - Initializing GeoServer specific GWC configuration from gwc-gs.xml
INFO | jvm 1 | 2015/12/09 17:24:17 | 09 dec 17:24:17 INFO [ows.OWSHandlerMapping] - Mapped URL path [/gwc/service/**] onto handler ‘dispatcher’

Between those 2 lines, I have tried analyzing what’s actually going on. In the waiting time, I’ve run the following script against the database to analyze deeper what’s going on:

SELECT sid
, serial#
, username
, ( SELECT max( sql_text ) FROM v$sql sq WHERE sq.sql_id = se.sql_id ) AS sql_text
, ( SELECT object_name FROM dba_procedures WHERE object_id = plsql_entry_object_id AND subprogram_id = 0) AS plsql_entry_object
, ( SELECT procedure_name FROM dba_procedures WHERE object_id = plsql_entry_object_id AND subprogram_id = plsql_entry_subprogram_id) AS plsql_entry_subprogram
, ( SELECT object_name FROM dba_procedures WHERE object_id = plsql_object_id AND subprogram_id = 0) AS plsql_entry_object
, ( SELECT procedure_name FROM dba_procedures WHERE object_id = plsql_object_id AND subprogram_id = PLSQL_SUBPROGRAM_ID) AS plsql_entry_subprogram
, se.*
FROM v$session se
WHERE username = ‘(geoserver oracle user)’
– AND se.status = ‘ACTIVE’
– AND sid = 397
– AND plsql_entry_object_id IS NOT NULL
ORDER BY se.sid;

This query returns the queries currently being run against the database. An example of a query run in this period is here:

SELECT META.SDO_LAYER_GTYPE FROM ALL_INDEXES INFO INNER JOIN MDSYS.ALL_SDO_INDEX_METADATA META ON INFO.INDEX_NAME = META.SDO_INDEX_NAME WHERE INFO.TABLE_NAME = :1 AND REPLACE(meta.sdo_column_name, ‘"’) = :2

Can somebody please(!!) explain what’s going on here? Have anyone ever tested Geoserver against an Oracle 12c database? It seems to me, that it’s trying to validate all database tables connected to the Oracle layers. That’s fine by me - but not if it now makes the server take forever to start. Honestly, even though it’s a large amount of layers, it really shouldn’t take as much as 70 minutes to validate 6000 Oracle tables. It seems like you’re running some insanely heavy queries against the database at server startup.

Can this eventually be disabled by disabling the option “validate connections” on the data store? What exactly does this option do?

I really hope to hear from someone knowledgable about this.

Thanks in advance,

Jacob Madsen



Geoserver-users mailing list
Geoserver-users@anonymised.comsts.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

Ian Turton

On Wed, Dec 9, 2015 at 6:45 PM, Ian Turton <ijturton@anonymised.com> wrote:

You ask "Have anyone ever tested Geoserver against an Oracle 12c
database?" and the answer is probably no! - We routinely produce Release
Candidates for new releases of GeoServer and ask people to test them
(especially if they run exotic databases like Oracle that the dev team
doesn't necessarily have the time, resources or inclination to test). So
unless you stepped up and tested what sounds like a (to you) mission
critical element then I doubt anyone else did.

Hi Ian,
I think someone is using it fine, but also believe they are specifying a
target schema in the store
configuration, it's when one doesn't that the Oracle JDBC driver starts
making those absurdly slow queries (it's not
GeoServer that does them, it's access to the JDBCMetadata class plus some
really poor implementation
in the Oracle JDBC driver).

Cheers
Andrea

--

GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

*Geosolutions' Winter Holidays from 24/12 to 6/1*

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

*AVVERTENZE AI SENSI DEL D.Lgs. 196/2003*

Le informazioni contenute in questo messaggio di posta elettronica e/o
nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il
loro utilizzo è consentito esclusivamente al destinatario del messaggio,
per le finalità indicate nel messaggio stesso. Qualora riceviate questo
messaggio senza esserne il destinatario, Vi preghiamo cortesemente di
darcene notizia via e-mail e di procedere alla distruzione del messaggio
stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso,
divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od
utilizzarlo per finalità diverse, costituisce comportamento contrario ai
principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for
the attention and use of the named addressee(s) and may be confidential or
proprietary in nature or covered by the provisions of privacy act
(Legislative Decree June, 30 2003, no.196 - Italy's New Data Protection
Code).Any use not in accord with its purpose, any disclosure, reproduction,
copying, distribution, or either dissemination, either whole or partial, is
strictly forbidden except previous formal approval of the named
addressee(s). If you are not the intended recipient, please contact
immediately the sender by telephone, fax or e-mail and delete the
information in this message that has been received in error. The sender
does not give any warranty or accept liability as the content, accuracy or
completeness of sent messages and accepts no responsibility for changes
made after they were sent or for other risks which arise as a result of
e-mail transmission, viruses, etc.

-------------------------------------------------------

Hello Andrea

I believe you are talking about this issue: https://osgeo-org.atlassian.net/browse/GEOT-4188, right?

Here is what I have found out by spending yet another day doing some deeper analysis of what’s going on here.

May I ask, if this isn’t a bug? Shouldn’t Geoserver retrieve all column information from the GEOMETRY_COLUMNS table and not from the data dictionary? It seems like column information is fetched from BOTH getColumns() and my GEOMETRY_COLUMNS table - which in fact is a materialized views for performance’s sake.

  • Another thing I’ve found: by using materialized views as wrappers to the data dictionary views accessed by Geoserver or the Oracle JDBC driver (ALL_TAB_COLUMNS, ALL_SYNONYMS, ALL_OBJECTS), I’ve been able to bring down the startup time from 70 minutes to 20 minutes. It’s a big improvement, but still very long time for a startup.

All I’d really like to know this time is, whether or not I’ve misunderstood the usage of the GEOMETRY_COLUMNS value in datastore.xml? As far as I understand the documentation, Geoserver should ONLY look in this table and not in the data dictionary, when this is defined for a data store. Am I wrong here? Isn’t this option there for performance reasons?

Thanks in advance,

Jacob Madsen

···

2015-12-09 20:32 GMT+01:00 Andrea Aime <andrea.aime@anonymised.com>:

On Wed, Dec 9, 2015 at 6:45 PM, Ian Turton <ijturton@anonymised.com> wrote:

You ask “Have anyone ever tested Geoserver against an Oracle 12c database?” and the answer is probably no! - We routinely produce Release Candidates for new releases of GeoServer and ask people to test them (especially if they run exotic databases like Oracle that the dev team doesn’t necessarily have the time, resources or inclination to test). So unless you stepped up and tested what sounds like a (to you) mission critical element then I doubt anyone else did.

Hi Ian,
I think someone is using it fine, but also believe they are specifying a target schema in the store
configuration, it’s when one doesn’t that the Oracle JDBC driver starts making those absurdly slow queries (it’s not
GeoServer that does them, it’s access to the JDBCMetadata class plus some really poor implementation
in the Oracle JDBC driver).

Cheers
Andrea

==
GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

Geosolutions’ Winter Holidays from 24/12 to 6/1

Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

AVVERTENZE AI SENSI DEL D.Lgs. 196/2003

Le informazioni contenute in questo messaggio di posta elettronica e/o nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il loro utilizzo è consentito esclusivamente al destinatario del messaggio, per le finalità indicate nel messaggio stesso. Qualora riceviate questo messaggio senza esserne il destinatario, Vi preghiamo cortesemente di darcene notizia via e-mail e di procedere alla distruzione del messaggio stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso, divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità diverse, costituisce comportamento contrario ai principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for the attention and use of the named addressee(s) and may be confidential or proprietary in nature or covered by the provisions of privacy act (Legislative Decree June, 30 2003, no.196 - Italy’s New Data Protection Code).Any use not in accord with its purpose, any disclosure, reproduction, copying, distribution, or either dissemination, either whole or partial, is strictly forbidden except previous formal approval of the named addressee(s). If you are not the intended recipient, please contact immediately the sender by telephone, fax or e-mail and delete the information in this message that has been received in error. The sender does not give any warranty or accept liability as the content, accuracy or completeness of sent messages and accepts no responsibility for changes made after they were sent or for other risks which arise as a result of e-mail transmission, viruses, etc.