[Geoserver-devel] On the buffer strategy problems (and more generally, dependence on poorly implemented 1.4 classes)

Hi all,
the current buffer strategy uses an internal ByteArrayOutputStream of 1MB
to perform buffering. This is bad in three ways:
* 1MB is a lot, typical WMS requests are not that big (WFS may be). We are
   wasting memory and limiting scalability significantly like this.
* during flush it allocates another buffered output stream with a 1MB buffer
   just to write out the entire content of the ByteArrayOutputStream. This
   is simply useless, I've already removed it on trunk.
* ByteArrayOutputStream expands the buffer by the exact number of bytes
   needed instead of doubling it. This is a performance nightmare!
   So bad that there are plenty of FastByteArrayOutputStream implementations
   around (commons, jetty, struts 2 just to cite a few).
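
As a rough illustration of the three points above, here is a minimal sketch of a buffer that starts small, doubles its capacity when full, and flushes straight to the target stream without an intermediate buffer. It is not GeoServer's code, nor the implementation found in commons-io, Jetty or Struts; the class name GrowableBuffer and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical sketch: unsynchronized in-memory buffer that doubles when full. */
class GrowableBuffer extends OutputStream {
    private byte[] buf;
    private int count;

    GrowableBuffer(int initialCapacity) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("initialCapacity must be positive");
        }
        buf = new byte[initialCapacity];
    }

    /** Grows to at least minCapacity, doubling the current size when that is enough. */
    private void ensureCapacity(int minCapacity) {
        if (minCapacity <= buf.length) {
            return;
        }
        int newCapacity = Math.max(buf.length * 2, minCapacity);
        byte[] bigger = new byte[newCapacity];
        System.arraycopy(buf, 0, bigger, 0, count);
        buf = bigger;
    }

    public void write(int b) {
        ensureCapacity(count + 1);
        buf[count++] = (byte) b;
    }

    public void write(byte[] src, int off, int len) {
        ensureCapacity(count + len);
        System.arraycopy(src, off, buf, count, len);
        count += len;
    }

    /** Writes the buffered bytes directly to the target, with no extra buffer in between. */
    public void writeTo(OutputStream out) throws IOException {
        out.write(buf, 0, count);
    }

    public int size() {
        return count;
    }

    public int capacity() {
        return buf.length;
    }
}
```

Starting with a small initial capacity (a few KB, say) would keep the per-request cost low for small responses, while large ones only pay for a handful of reallocations.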

Now, Simone introduced me to the excellent MG4J library, that also has
this kind of buffer, along with faster string builders and the like.
Since we are stuck on 1.4, maybe we should add a dependency
on it and start using more performant replacements for the poorly implemented
1.4 classes? (on 1.5 some classes, like StringBuffer, are a lot faster, but we're
stuck on 1.4, right?)

What do you think?
Cheers
Andrea

I don't see any problems with moving to the new library. However, at some
point we will be moving to Java 1.5. I think a good time would be when it
becomes the de facto standard for J2EE. That said, most of the
servlet containers we are using already support 1.5.

-Justin

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

--
Justin Deoliveira
The Open Planning Project
jdeolive@anonymised.com

aaime@anonymised.com wrote:

> Hi all,
> the current buffer strategy uses an internal ByteArrayOutputStream of 1MB
> to perform buffering. This is bad in three ways:
> * 1MB is a lot, typical WMS requests are not that big (WFS may be). We are
>    wasting memory and limiting scalability significantly like this.

Yeah, that sounds like a lot to me; I was under the impression it was more like 50k. WFS responses can definitely be that big and much bigger, but almost all errors occur in the first part of the WFS response, so I don't think we need such a large buffer.

> * during flush it allocates another buffered output stream with a 1MB buffer
>    just to write out the entire content of the ByteArrayOutputStream. This
>    is simply useless, I've already removed it on trunk.

Cool.

> * ByteArrayOutputStream expands the buffer by the exact number of bytes
>    needed instead of doubling it. This is a performance nightmare!
>    So bad that there are plenty of FastByteArrayOutputStream implementations
>    around (commons, jetty, struts 2 just to cite a few).

That sounds good.

> Now, Simone introduced me to the excellent MG4J library, that also has
> this kind of buffer, along with faster string builders and the like.
> Since we are stuck on 1.4, maybe we should add a dependency
> on it and start using more performant replacements for the poorly implemented
> 1.4 classes? (on 1.5 some classes, like StringBuffer, are a lot faster, but we're
> stuck on 1.4, right?)

Yes, this seems like a good plan. We definitely want to stay on 1.4 for a while. As long as it's not a huge dependency, it should be fine.

best regards,

Chris

--
Chris Holmes
The Open Planning Project
http://topp.openplans.org

I'd go for a slightly different goal than 1.5 becoming the de facto standard for J2EE. I'd say we move when there's a good open source JRE/JDK. My big concern is making it impossible for some operating systems to use GeoServer. Apache Harmony and GNU Classpath are making good progress though, so I think it should become possible to move to 1.5. And of course Sun keeps making motions towards open-sourcing Java, but it hasn't happened yet. So I think we should wait until one of those is ready, plus J2EE on 1.5.

best regards,

Chris

--
Chris Holmes
The Open Planning Project
http://topp.openplans.org

If it is the partial buffer strategy, then the value is read from the web.xml file. It defaults to 50kb.

Brent Owens
(The Open Planning Project)

Brent Owens wrote:

> If it is the partial buffer strategy, then the value is read from the web.xml file. It defaults to 50kb.

No, I was talking about the buffer strategy, and there the initial buffer size
can't be modified.
Cheers
Andrea

Chris Holmes wrote:

>> Now, Simone introduced me to the excellent MG4J library, that also has
>> this kind of buffer, along with faster string builders and the like.
>> Since we are stuck on 1.4, maybe we should add a dependency
>> on it and start using more performant replacements for the poorly implemented
>> 1.4 classes? (on 1.5 some classes, like StringBuffer, are a lot faster, but we're
>> stuck on 1.4, right?)
>
> Yes, this seems like a good plan. We definitely want to stay on 1.4 for a while. As long as it's not a huge dependency, it should be fine.

It's not super small either.
If we only want the FastByteArrayOutputStream, we can go for commons-io, which is only 70KB afaik,
whilst MG4J is around 700KB (but provides other useful classes as well).
I also have to check dependencies, and... oh, never mind... I just noticed that in one of the
last releases MG4J went Java 5 only (they don't say so on the home page; Simone told me something about
it but I forgot...).

Ok, let's go with commons-io then?
Cheers
Andrea

I spoke with the guy behind MG4J about the dependency on 1.4.
If we want, he can release a version compatible with 1.4.

Simone.

--
-------------------------------------------------------
Eng. Simone Giannecchini
President /CEO GeoSolutions

http://www.geo-solutions.it

-------------------------------------------------------

A remark.

I think that we should not look at the size of the lib but at what it gives us.

MG4J provides mutable strings that would be great for KML and for
logging purposes, for instance.

Simone.

--
-------------------------------------------------------
Eng. Simone Giannecchini
President /CEO GeoSolutions

http://www.geo-solutions.it

-------------------------------------------------------

Simone Giannecchini wrote:

> A remark.
>
> I think that we should not look at the size of the lib but at what it gives us.

Indeed, GeoServer is a server app, not an applet.

> MG4J provides mutable strings that would be great for KML and for
> logging purposes, for instance.

Strange examples though... KML should be written directly to a stream IMHO, not built into a String.
Logging, hmm... now you've lost me.

Cheers
Andrea
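
To make the "write directly to a stream" point concrete, here is a minimal sketch that emits each element to a Writer as it goes instead of accumulating the whole document in a String. The class PlacemarkWriter, its methods, and the simplified element layout are assumptions for illustration only; this is not GeoServer's KML code and omits the real KML root element and namespace.

```java
import java.io.IOException;
import java.io.Writer;

/** Hypothetical sketch: stream XML-like output element by element. */
class PlacemarkWriter {

    /** Writes one Placemark per name directly to the given writer. */
    static void writePlacemarks(Writer out, String[] names) throws IOException {
        out.write("<Document>");
        for (int i = 0; i < names.length; i++) {
            out.write("<Placemark><name>");
            out.write(escape(names[i]));
            out.write("</name></Placemark>");
        }
        out.write("</Document>");
    }

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

In a servlet the Writer would typically wrap the response output stream, so memory use stays roughly constant regardless of how many features are written.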

When you log you usually build very long strings, usually by converting
objects into strings. Often people do this by using string
concatenation (which is a performance nightmare). 1.5 will help, but
StringBuffer append is still synchronized (which is unneeded 99.99% of
the time), which means overhead.

Logging has to be done carefully or it can kill a server.

Simone.
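
A minimal sketch of the careful logging described above, assuming java.util.logging and Java 5 (StringBuilder is the unsynchronized counterpart of StringBuffer and does not exist on 1.4). The class LogMessages and its methods are hypothetical names used only for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: build log messages only when they will be emitted. */
class LogMessages {

    /** Builds the message with a single unsynchronized builder instead of repeated '+'. */
    static String describe(String layer, double[] bbox) {
        StringBuilder sb = new StringBuilder(64);
        sb.append("layer=").append(layer).append(" bbox=[");
        for (int i = 0; i < bbox.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(bbox[i]);
        }
        return sb.append(']').toString();
    }

    /** Skips the string building entirely when FINE logging is disabled. */
    static void logRequest(Logger log, String layer, double[] bbox) {
        if (log.isLoggable(Level.FINE)) {
            log.fine(describe(layer, bbox));
        }
    }
}
```

The level check is the cheap part: when the logger is set above FINE, no objects are converted to strings at all.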

--
-------------------------------------------------------
Eng. Simone Giannecchini
President /CEO GeoSolutions

http://www.geo-solutions.it

-------------------------------------------------------

aaime@anonymised.com wrote:

> Brent Owens wrote:
>
>> If it is the partial buffer strategy, then the value is read from the web.xml file. It defaults to 50kb.
>
> No, I was speaking about buffer strategy, and here the buffer initial size
> can't be modified.

We might just consider dropping the buffer strategy. It sounds like anyone who wanted a memory buffer of that size could just use the partial buffer strategy with a large value set. I think the two are kind of redundant.

Chris

--
Chris Holmes
The Open Planning Project
http://topp.openplans.org