[Geoserver-users] How to handle huge postgis database in geoserver..

We are working on a solution that will possibly be too big for a single PostGIS/PostgreSQL database, because of both its size and the number of simultaneous connections.

We are thinking of various solutions, like splitting the database into multiple databases or using multi-cluster PostgreSQL. Please tell me how to handle this cleanly with GeoServer.

a) One option is to split the database into multiple smaller databases based on geographic region (e.g. a continent or a country). But this brings a lot of changes to the client, which has to handle the region boundaries and the features crossing them.

b) Is there a way to define a single featureType derived from multiple PostGIS sources inside GeoServer? Currently a single feature type is tied to a single data source. Based on lat/lon, can I select the data source from different databases for a single feature type? If that works, I don't have to change my client, since all of them could be treated as a single WFS or WMS layer on the client side.

c) Another option we are thinking of is using some kind of replicated database cluster. Does anybody know what the common solutions for this are?

Thanks in advance..
Louvy Joseph

Hi Louvy,

Unfortunately there is no way to define a single feature type which is
made up of multiple data sources in the backend. You would need some
sort of proxying in the backend to make this transparent to GeoServer.
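
To make the proxying idea a bit more concrete, here is a minimal Python
sketch of just the routing decision such a layer would have to make:
given a request bounding box, work out which regional databases it
touches. The region names and extents are made up for illustration, and
nothing here is a GeoServer or PostGIS API.

```python
from typing import Dict, List, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat).
BBox = Tuple[float, float, float, float]

# Hypothetical shard layout; names and extents are assumptions.
REGIONS: Dict[str, BBox] = {
    "americas": (-180.0, -90.0, -30.0, 90.0),
    "europe_africa": (-30.0, -90.0, 60.0, 90.0),
    "asia_pacific": (60.0, -90.0, 180.0, 90.0),
}


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if the two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def shards_for_bbox(query: BBox, regions: Dict[str, BBox] = REGIONS) -> List[str]:
    """Return the name of every regional database a bbox query must hit."""
    return [name for name, extent in regions.items() if intersects(query, extent)]
```

A bbox that straddles a region boundary comes back with more than one
shard, which is the case where results have to be merged before they
reach the client.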

However, there has been some interesting recent work on building an
"aggregating WFS client". I have not actually tried it out, but this is
my understanding of how it works:

You define multiple GeoServer instances with the same feature type. The
different instances store different data. The aggregating client
combines all the different GeoServer instances, making them appear as a
single instance. So theoretically you could set up multiple instances,
one for each database instance, set up the master instance in front of
them, and have clients hit it.
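
As a rough illustration of the aggregation step only, the Python sketch
below merges GeoJSON FeatureCollections returned by several instances
into one. It is not the code of the contributed datastore. The `fetch`
callable is a hypothetical stand-in for whatever issues the WFS request,
and dropping repeated feature ids is an assumption about how features
stored in more than one instance might be handled.

```python
from typing import Any, Callable, Dict, Iterable, List

FeatureCollection = Dict[str, Any]


def merge_feature_collections(
    collections: Iterable[FeatureCollection],
) -> FeatureCollection:
    """Merge GeoJSON FeatureCollections, keeping the first feature seen per id."""
    seen = set()
    merged: List[Dict[str, Any]] = []
    for collection in collections:
        for feature in collection.get("features", []):
            fid = feature.get("id")
            if fid is not None:
                if fid in seen:
                    continue  # same feature already returned by another instance
                seen.add(fid)
            merged.append(feature)
    return {"type": "FeatureCollection", "features": merged}


def aggregate(
    instances: Iterable[str],
    fetch: Callable[[str], FeatureCollection],
) -> FeatureCollection:
    """Query every instance through `fetch` and merge the results."""
    return merge_feature_collections(fetch(url) for url in instances)
```

Features without an id are always kept, since there is nothing to
compare them on.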

Unfortunately this is not shipped with GeoServer out of the box. It
currently lives as a GeoTools datastore. It was contributed a while back,
but I am not sure if it has gone anywhere. If we could get it in as a
GeoTools module, then theoretically you could just plug it into GeoServer.

Also, a recent post to the user list was about clustering. The user
listed the following as candidate technologies. You may wish to check
them out.

Sequoia - http://sequoia.continuent.org
PgCluster / PgCluster II - http://pgfoundry.org/projects/pgcluster
Postgres-R - http://www.postgres-r.org
Bucardo - http://bucardo.org

Hope that helps.

-Justin

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

--
Justin Deoliveira
The Open Planning Project
http://topp.openplans.org

If this is read-only, no problem; there are various ways to replicate all
the middle tiers.

But if any significant update/insert activity is going on, there will
probably never be a single 'silver bullet' for high-volume DB activity,
no matter which database or middle-tier technologies you are talking
about. For example, a single trigger or check constraint can instantly
create a system-wide hotspot.

If you need to do anything more than just browse read-only data under high
volume, I would recommend a reality check about application-level
partitioning...
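
A minimal sketch of what application-level partitioning can look like,
in Python: the application picks a partition key itself, sends writes to
the one primary that owns that partition, and spreads reads over that
partition's replicas. The class names, partition keys and host names are
all hypothetical, and no particular replication product is assumed.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Partition:
    primary: str            # host that takes all writes for this partition
    replicas: List[str]     # read-only copies of the same partition
    _cycle: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # With no replicas configured, reads fall back to the primary.
        self._cycle = itertools.cycle(self.replicas or [self.primary])

    def next_reader(self) -> str:
        """Return the next host to read from, in round-robin order."""
        return next(self._cycle)


class PartitionRouter:
    """Choose a database host per operation, keyed by partition."""

    def __init__(self, partitions: Dict[str, Partition]) -> None:
        self._partitions = partitions

    def host_for_write(self, key: str) -> str:
        return self._partitions[key].primary

    def host_for_read(self, key: str) -> str:
        return self._partitions[key].next_reader()
```

Writes for one partition never touch another partition's primary, so a
hotspot caused by a trigger or constraint stays confined to that
partition.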

Chris
