[Geoserver-users] problem with large WFS-T insert

Hi All,

I am testing the possibilities of using WFS-T to exchange data between two databases (I request data from one database over WFS and send it to another using WFS-T). I use GeoServer as the WFS/WFS-T server and send the XML-encoded request there using POST. Although it works with a small number of records, if I send ‘a lot’ of records, in my test case 2000 (which is still only a fraction of the amount I would like to send), the request isn’t processed correctly and I get a java.lang.OutOfMemoryError. It seems to me that somewhere GeoServer tries to parse the entire XML document (3.5 MByte with 2000 records in a single transaction) but fails because it wants more memory. I could try to increase the memory settings, but that would probably only get me a bit further (and not, say, a factor of 10 or 100).

Am I doing something wrong here? Is there perhaps a setting which makes GeoServer parse the XML bit by bit and send completed elements to the database? I did read something about serviceStrategy in the web.xml file, and that is set to PARTIAL-BUFFER2 in my case (it was the default).

I hope you can help me.

Kind regards,

Maarten Vermeij

Hi,

Yes, in a transaction request GeoServer will read and parse the entire document. However, 2000 records does not seem like too much to parse. I wonder if the problem is somewhere else.

It would be possible to stream out the individual transactions inside of the request in a more efficient way. Is there any chance you can put up the file (zipped) you are using to insert the data? Then I could test it on my machine.

-Justin

_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

--
Justin Deoliveira
The Open Planning Project
jdeolive@anonymised.com

Hello Justin,

Thanks for your comments. Unfortunately I cannot provide the data
as is at the moment, since it's cadastral data. I will first have to
anonymize/randomize it a bit. I will do this shortly and put it on a
website. I can combine this with the table definitions I use (it's in
PostGIS at the moment).

The table I am testing with has the following definition (taken
from pgAdmin). Nothing special as far as I can see.

CREATE TABLE mvparcel
(
  vest_cd character(2),
  slc integer,
  ogroup integer,
  object_id integer,
  classif integer,
  z integer,
  rotangle integer,
  accu_cd integer,
  oarea double precision,
  object_dt integer,
  tmin timestamp without time zone,
  tmax timestamp without time zone,
  sel_cd character(3),
  source character(5),
  quality character(2),
  vis_cd character(1),
  akr_area double precision,
  municip character(5),
  osection character(2),
  sheet character(4),
  parcel character(5),
  pp_i_ltr character(1),
  pp_i_nr character(4),
  l_num integer,
  line_id1 integer,
  line_id2 integer,
  x_akr_objectnummer character(17),
  myid serial NOT NULL,
  mlocation geometry,
  d_location geometry,
  bbox geometry,
  CONSTRAINT mvparcel_pkey PRIMARY KEY (myid),
  CONSTRAINT enforce_dims_bbox CHECK (ndims(bbox) = 2),
  CONSTRAINT enforce_dims_d_location CHECK (ndims(d_location) = 2),
  CONSTRAINT enforce_dims_mlocation CHECK (ndims(mlocation) = 2),
  CONSTRAINT enforce_geotype_bbox CHECK (geometrytype(bbox) = 'POLYGON'::text OR bbox IS NULL),
  CONSTRAINT enforce_geotype_d_location CHECK (geometrytype(d_location) = 'POINT'::text OR d_location IS NULL),
  CONSTRAINT enforce_geotype_mlocation CHECK (geometrytype(mlocation) = 'POINT'::text OR mlocation IS NULL),
  CONSTRAINT enforce_srid_bbox CHECK (srid(bbox) = 28992),
  CONSTRAINT enforce_srid_d_location CHECK (srid(d_location) = 28992),
  CONSTRAINT enforce_srid_mlocation CHECK (srid(mlocation) = 28992)
)
WITH (OIDS=TRUE);
ALTER TABLE mvparcel OWNER TO vermeij;
GRANT ALL ON TABLE mvparcel TO vermeij;

Kind regards,

Maarten Vermeij

Hello Justin,

I have prepared the data for external use.

A zip file containing the xml-encoded request for the WFS-T:
http://casagrande.otb.tudelft.nl/maarten/wfstest/casa_in_test.zip

The returned xml from the WFS indicating the error:
http://casagrande.otb.tudelft.nl/maarten/wfstest/casa_in_test_return.xml

The SRID for the data is 28992 (the Dutch national reference system).
And the bbox I use for this dataset is: min long 3.0, min lat 50.0,
max long 7.0, max lat 54.0.

I don't see anything weird in the input file, except maybe its size.
This file contains 2000 records for insertion, within a single insert
statement within a single transaction. I also tried giving each record
its own insert statement (but still one transaction), but that didn't
help.
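
For readers following along, the two request layouts described above (all records in one insert statement, or one insert statement per record, both inside a single transaction) could be sketched roughly as below. This is only an illustration and not the actual test file: the feature namespace, the "ex" prefix, the WFS version attribute and the choice of the "parcel" column as the only property are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
FEATURE_NS = "http://example.org/cadastre"  # hypothetical feature-type namespace

ET.register_namespace("wfs", WFS_NS)
ET.register_namespace("ex", FEATURE_NS)  # hypothetical prefix


def make_feature(parcel):
    """Build one illustrative mvparcel feature carrying only a 'parcel' value."""
    feature = ET.Element(f"{{{FEATURE_NS}}}mvparcel")
    ET.SubElement(feature, f"{{{FEATURE_NS}}}parcel").text = parcel
    return feature


def build_transaction(parcels, one_insert_per_record=False):
    """Return a wfs:Transaction element in one of the two layouts."""
    root = ET.Element(
        f"{{{WFS_NS}}}Transaction",
        {"service": "WFS", "version": "1.1.0"},  # version is an assumption
    )
    insert = None
    for parcel in parcels:
        # Open a new wfs:Insert for the first record, or for every record
        # when the per-record layout is requested.
        if insert is None or one_insert_per_record:
            insert = ET.SubElement(root, f"{{{WFS_NS}}}Insert")
        insert.append(make_feature(parcel))
    return root


if __name__ == "__main__":
    doc = build_transaction(["00001", "00002"], one_insert_per_record=True)
    print(ET.tostring(doc, encoding="unicode"))
```

Either way the server receives one document holding every record, which fits the observation that switching layouts did not help.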

If you know of any additional information that could be of use, let me
know.

Kind regards,

Maarten Vermeij

Hi Maarten,

Thanks for providing some data. I looked into it and was able to reproduce the problem. Unfortunately the fix is not trivial. I have created an issue report for this:

http://jira.codehaus.org/browse/GEOS-1884

However, in the meantime the only thing I can suggest is that you break up your transaction into multiple smaller transactions.
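
One possible way to do that split on the client side is sketched below. It is a rough illustration, not a GeoServer feature: the function name split_transaction and the batch_size parameter are made up for the example, and it assumes the request contains only wfs:Insert elements. It reads the large request, collects the features from every wfs:Insert, and writes them back out as several smaller wfs:Transaction documents that can each be POSTed separately.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
ET.register_namespace("wfs", WFS_NS)


def split_transaction(xml_text, batch_size):
    """Return a list of smaller wfs:Transaction documents (as strings),
    each holding at most batch_size features in a single wfs:Insert."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    root = ET.fromstring(xml_text)
    # Collect every feature under any wfs:Insert, in document order.
    features = [
        feature
        for insert in root.findall(f"{{{WFS_NS}}}Insert")
        for feature in insert
    ]
    documents = []
    for start in range(0, len(features), batch_size):
        # Reuse the original root tag and attributes (service, version, ...).
        small = ET.Element(root.tag, dict(root.attrib))
        insert = ET.SubElement(small, f"{{{WFS_NS}}}Insert")
        insert.extend(features[start:start + batch_size])
        documents.append(ET.tostring(small, encoding="unicode"))
    return documents
```

With a batch size of, say, 200, the 2000-record file would become ten requests. Note that each smaller transaction is processed on its own, so a failure halfway through leaves the earlier batches in the database; the all-or-nothing behaviour of the single large transaction is lost.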

-Justin

--
Justin Deoliveira
The Open Planning Project
jdeolive@anonymised.com