Hi,
About the absurdly large GDAL overview file: it is so big because you asked GDAL to create it uncompressed. The gdaladdo documentation page has an example of how to create compressed overviews.
Page: http://www.gdal.org/gdaladdo.html
Example:
gdaladdo --config COMPRESS_OVERVIEW JPEG --config PHOTOMETRIC_OVERVIEW YCBCR \
         --config INTERLEAVE_OVERVIEW PIXEL rgb_dataset.ext 2 4 8 16
That's good for aerial images. For Ordnance Survey rasters I guess --config COMPRESS_OVERVIEW DEFLATE would suit best.
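To make the options above concrete, here is a minimal Python sketch that only assembles the gdaladdo argument list; it does not run GDAL. The helper name build_gdaladdo_command and the file name os_raster.tif are hypothetical. The -r, -ro and --config COMPRESS_OVERVIEW options are gdaladdo's own, with -ro opening the source read-only so that the overviews are written to an external .ovr file.

```python
def build_gdaladdo_command(path, levels, compress="DEFLATE",
                           resampling="nearest", external=True):
    """Return the gdaladdo argument list for compressed overviews.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    cmd = ["gdaladdo", "-r", resampling]
    if external:
        # -ro opens the source read-only, so overviews go to an external .ovr file
        cmd.append("-ro")
    # COMPRESS_OVERVIEW sets the compression used for the overviews
    cmd += ["--config", "COMPRESS_OVERVIEW", compress]
    cmd.append(path)
    cmd += [str(level) for level in levels]
    return cmd


command = build_gdaladdo_command("os_raster.tif", [2, 4, 8, 16])
print(" ".join(command))
```

With GDAL installed, the list could be handed to subprocess.run(command, check=True).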
About the number of levels: if your original is 160000 pixels across, keep dividing it by 2:
80000, 40000, 20000, 10000, 5000, 2500, 1250, 625, 312, 156 STOP
Thus, create 10 (or 9) overview levels, starting with 2 4 8...
I am not sure, but I think 1 2 4 8 ... leads to the same result: the factor 1 level already exists as the original data, so it will not be created again, at least not for internal overviews. You can easily test that yourself. In any case a factor 1 overview makes no sense, so you can safely start from 2, just as you were thinking.
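The halving can be sketched in a few lines of Python. The stopping threshold of 256 pixels is an assumption made for this sketch (it happens to reproduce the list, which stops at 156), and the function name overview_levels is hypothetical. Floor division is used to match the numbers above; the sizes GDAL actually writes may differ by a pixel because of rounding.

```python
def overview_levels(size, stop_at=256):
    """Return (factor, overview_size) pairs, halving until size <= stop_at."""
    levels = []
    factor = 2
    while True:
        reduced = size // factor  # floor division, as in the list above
        levels.append((factor, reduced))
        if reduced <= stop_at:
            break
        factor *= 2
    return levels


levels = overview_levels(160000)
for factor, reduced in levels:
    print(factor, reduced)
print(len(levels), "levels")
```

For a 160000 pixel original this gives 10 levels, with factors 2 through 1024.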
-Jukka Rahkonen-
________________________________
From: Jonathan Moules [jonathanmoules@anonymised.com]
Sent: 2 January 2013 18:12
To: Simone Giannecchini
Cc: geoserver-users@lists.sourceforge.net
Subject: Re: [Geoserver-users] Rasters, Tiles, and FME
Hey Simone,
Thanks for your thorough reply. Useful!
- Resampling:
I've done some experimenting with resampling since my email and, at least for the data I was testing, Nearest Neighbour came out best (though that is obviously subjective). This was for a regular RGB map (Ordnance Survey MiniScale specifically). There was also a significant difference in file sizes, with Nearest Neighbour the smallest and Cubic the largest (double!). I didn't expect such a large difference; it may be worth noting somewhere.
- External Overviews:
Thanks, but I'd found out how to create them with GDAL. My issue is - how do I get GeoServer to use them? Does it just pick them up automatically or do I need some sort of particular directory structure?
GDAL seems to create a single, absurdly huge file, so I guess it doesn't compress the external overviews. Based on what I'm seeing, I'm not sure if FME can create them; I expected multiple separate files (one for each layer), but the "external overviews" are basically identical to internal overviews but minus the source data.
- Overviews:
How do you determine how many levels to create? I remember this page from years ago when last I used GeoServer: http://docs.geoserver.org/stable/en/user/tutorials/imagemosaic-jdbc/imagemosaic-jdbc_tutorial.html#how-many-pyramids-are-needed - does this carry over to GeoTIFF's overviews?
I may be reading this wrong, but the example in the PDF has 7 layers but only really needs about 3 as after that they're smaller than the 1:1 tiles.
Is there any point in creating pyramids with a value of 1 (as in, 1 2 4 8...) ? GDAL lets you do this but if I'm understanding this correctly, they're entirely superfluous (I mention it because Russ's example he generously posted starts at 1).
- Tile size:
Might there be a formula for calculating optimal size?
For example, if I have a 94488*157480 pixel GeoTIFF (my largest - about 2.5GB compressed), if I divide them by 512 I get
184*307 tiles = 56,488 tiles
For 256 I get:
368*614 tiles = 225,952 tiles
So if there is/was a formula, it might be easier to figure out what size to go with, or at what point you'd want to be scaling up to ImageMosaic.
Based on disk seeks etc, and some experimentation, I'd guess that a number could be contrived allowing the documentation to say:
"If you end up with > X,000 tiles, go with ImageMosaic" and "Aim for between X,000 and X0,000 tiles in a GeoTIFF"
Just a thought, would be helpful for optimisation.
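As a rough check on the tile counts above, here is a small Python sketch. The figures quoted in the message round the divisions down; this version rounds up so that the partial tiles along the right and bottom edges are counted, which is an assumption about how one would want to count them. The function name tile_count is hypothetical.

```python
def tile_count(width, height, tile_size):
    """Tiles needed to cover the image, counting partial edge tiles."""
    cols = (width + tile_size - 1) // tile_size   # ceiling division
    rows = (height + tile_size - 1) // tile_size
    return cols, rows, cols * rows


for tile_size in (256, 512):
    cols, rows, total = tile_count(94488, 157480, tile_size)
    print(tile_size, cols, rows, total)
```

Counted this way, the 94488*157480 image needs 185*308 = 56,980 tiles at 512 and 370*616 = 227,920 tiles at 256, slightly more than the rounded-down figures.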
- Documentation:
I was considering contributing to it for this, but at this point I certainly don't know enough (you may have noticed from my questions above).
I would however suggest that this section of the help would gain from the info on pages 7-9 of your PDF: http://docs.geoserver.org/stable/en/user/production/data.html?highlight=gdaladdo#pick-the-best-performing-coverage-formats
Cheers!
Jonathan
On 2 January 2013 15:03, Simone Giannecchini <simone.giannecchini@anonymised.com> wrote:
Ciao Jonathan,
please find my answers inline below...
Regards,
Simone Giannecchini
Ing. Simone Giannecchini
@simogeo
Founder/Director
GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 333 8128928
http://www.geo-solutions.it
http://twitter.com/geosolutions_it
-------------------------------------------------------
On Wed, Jan 2, 2013 at 2:01 PM, Jonathan Moules
<jonathanmoules@anonymised.com> wrote:
Hi Simone,
I did use pages 7-9 to decide to go for BigTIFF and so 17-19 isn't
applicable as I'm not going to ImageMosaic.
While pages 10-13 are useful, it's a slideshow and so lacks details which
were probably included in the spoken part of the presentation. It's the
details that I'm looking for. What tile size is best? What resampling
algorithm? What overview size/scales? Etc.
Don't get me wrong, it's an excellent document, but it only provides an
overview so can't substitute for actual detailed documentation.
Your second link answers the tile-size question, but specifically for MapServer
(i.e. 256).
Also, page 11 hints at "external overviews" but there's no indication as to
how to create them or use them with GeoServer. FME can create them as I'm
sure GDAL can too, but again, I don't know how to use External Overviews
with Geoserver. A search of the docs for "external overviews" finds nothing
and a google only finds that PDF.
Does anyone have more information on how to use these with GeoServer?
No worries, actually this feedback gives me some hints on how to
update the presentation 
That said, the considerations made for MapServer apply more or less
to GeoServer, as they relate more
to how GeoTIFF works than to how MapServer or GeoServer works.
Anyway, my suggestions are as follows:
- Tile size -
256, or 512 for a very large BigTIFF, is my preferred size.
A tile size that is too small may result in too many seek
operations on the one hand, and in a TIFF directory explosion
on the other.
The more tiles we have, the more info we need to encode in the TIFF
directory, which is essentially a list of offsets that tells
us where the data is on disk.
For a BigTIFF with thousands of tiles it can be megabytes in size.
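A back-of-the-envelope sketch of that directory growth, in Python. The figure of 16 bytes per tile (an 8-byte offset plus an 8-byte byte count, as with 64-bit BigTIFF values) and pixel interleaving with one entry per tile are assumptions made for illustration, not measurements; the function name tile_index_bytes is hypothetical, and the image size is the 94488*157480 example from earlier in the thread.

```python
def tile_index_bytes(width, height, tile_size, bytes_per_entry=16):
    """Rough size of the tile offset and byte-count arrays for one image level.

    Assumes 8 bytes per offset plus 8 bytes per byte count and one entry
    per tile (pixel interleaving). Overview levels are not included.
    """
    cols = -(-width // tile_size)   # ceiling division
    rows = -(-height // tile_size)
    return cols * rows * bytes_per_entry


for tile_size in (256, 512):
    size = tile_index_bytes(94488, 157480, tile_size)
    print(tile_size, round(size / 1e6, 2), "MB")
```

Under these assumptions the full-resolution level needs about 3.65 MB of index at 256 and about 0.91 MB at 512.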
- Resampling -
This depends on the data.
For RGB-like data (orthos and the like) I would go for higher-order
interpolation to reduce aliasing (crispy images).
For data like DEMs, or other data containing real values to which you
will want to apply a color map, I would go for nearest neighbor so as NOT
to introduce artificial values.
In some cases you might want to use bilinear or average, but this
depends very much on what you want to do.
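The point about artificial values can be seen in a tiny Python sketch. It is a deliberate simplification: "nearest" here just takes the top-left pixel of each 2x2 block, which is not necessarily how GDAL picks its sample, and the grid of class codes is made up for illustration.

```python
def downsample(grid, method):
    """Halve a 2D grid using 'nearest' (top-left pixel) or the average of each 2x2 block."""
    out = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[0]) - 1, 2):
            block = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(block[0] if method == "nearest" else sum(block) / 4)
        out.append(row)
    return out


# Made-up class codes: only 1 and 5 are valid values
classes = [
    [1, 1, 5, 5],
    [1, 5, 5, 5],
    [1, 1, 1, 5],
    [1, 1, 5, 5],
]
nearest = downsample(classes, "nearest")
average = downsample(classes, "average")
print(nearest)
print(average)
```

The nearest result contains only the codes 1 and 5, while averaging produces 2.0 and 4.0, values that never occurred in the source and that a color map would not know how to draw.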
- Overviews -
I usually use steps of 2, as many of them as possible, although I
usually don't create overviews smaller than the tile size.
- External Overviews -
This works only with plain GeoTIFF (and BigTIFF, as the plugin is
the same). ImageMosaic does not (yet) support it.
External overviews are only useful if you don't want to touch the
original files, or if the format does not support internal overviews (which
is not the case for GeoTIFF).
External overviews can be created with gdaladdo; see
http://www.gdal.org/gdaladdo.html
I'm fine with no FME specific documentation and wouldn't really expect
any, but because I can't find things in the generic documentation (i.e., any
of the above but especially external overviews), I can't necessarily
implement an optimal solution with any utility, be it FME or GDAL.
Let me know if this helps. I will try to update those slides as well
as to create some additional docs.
You might even want to contribute some doc once you get to where you
needed to go 
Thanks,
Jonathan
On 2 January 2013 11:25, Simone Giannecchini
<simone.giannecchini@anonymised.com> wrote:
Ciao Jonathan,
I don't know FME, so I cannot comment on how to do such
things with it. I doubt many people on this list (or
on other open-source-oriented lists) know it well.
This is to say that most instructions you'll find will be oriented towards
open source tools like the GDAL utilities rather than towards proprietary
solutions like FME.
That said, the suggestions you are looking for are in the document
Andrea suggested (here is the link again: http://goo.gl/TXJRS):
- slides 7-9: when to use what
- slides 10-13: how to optimize each single GeoTIFF
- slides 17-19: how to prepare a mosaic
and so on.
If you check on the web you'll also find other docs that compare tile
sizes rather than types of compression, e.g.
http://www.fosslc.org/drupal/content/tuning-gdal-raster-performance
Regards,
Simone Giannecchini
On Thu, Dec 27, 2012 at 2:20 PM, Jonathan Moules
<jonathanmoules@anonymised.com> wrote:
> Hi list,
> Following this up, I've decided to go the GeoTIFF/BigTIFF route with inner
> tiling and overviews based on Andrea's document.
> But unfortunately there doesn't seem to be any documentation on the
> GeoServer pages about this stuff. I've never used GDAL before and generally
> avoid the command line (there's a reason we're trying for GeoServer not
> MapServer) so am not sure where to start.
>
> My source input is hundreds/thousands of LZW compressed GeoTIFF tiles. These
> need to be mosaicked together (I can do that in FME easily enough).
> But then what? Do I compress them at this point? Or inner tiling? Or
> overviews? Does it matter what order these things are done in?
> And what settings should I use? For instance Andrea's document gives a
> "blocksize" of 512, but Jukka's comments were on the order of 10,000 and the
> default is 256.
>
> Is there a tutorial out there on this I've failed to find?
> Thanks,
> Jonathan
>
>
> On 21 December 2012 14:45, Jonathan Moules
> <jonathanmoules@anonymised.com> wrote:
>>
>> Thanks Andrea, this looks like a very useful document although it is
>> looking like I'm not going to be able to do this with FME from what I
>> can see.
>> Cheers,
>> Jonathan
>>
>>
>>
>>
>> On 20 December 2012 21:06, Andrea Aime <andrea.aime@anonymised.com>
>> wrote:
>>>
>>> On Thu, Dec 20, 2012 at 12:16 PM, Jonathan Moules
>>> <jonathanmoules@anonymised.com> wrote:
>>>>
>>>> Hi Jukka,
>>>> Thanks for the information.
>>>>
>>>> Relating to the TIFFs, I'd still prefer to use one single tool
>>>> (specifically FME) for tile/image creation. FME can create pyramids,
>>>> but not as part of a GeoTIFF. So I can create separate TIFF files and
>>>> tile them if necessary too (either internal tiling or regular tiling),
>>>> but how do I get these into GeoServer?
>>>>
>>>> Is there a resource out there which describes the different Raster
>>>> serving extensions and the advantages/disadvantages of each?
>>>
>>>
>>> Here:
>>>
>>> http://demo.geo-solutions.it/share/foss4g2011/gs_steroids_sgiannec_foss4g2011.pdf
>>>
>>> Cheers
>>> Andrea
>>>
>>> --
>>> ==
>>> Our support, Your Success! Visit http://opensdi.geo-solutions.it for
>>> more
>>> information.
>>> ==
>>>
>>> Ing. Andrea Aime
>>> @geowolf
>>> Technical Lead
>>>
>>> GeoSolutions S.A.S.
>>> Via Poggio alle Viti 1187
>>> 55054 Massarosa (LU)
>>> Italy
>>> phone: +39 0584 962313
>>> fax: +39 0584 1660272
>>> mob: +39 339 8844549
>>>
>>> http://www.geo-solutions.it
>>> http://twitter.com/geosolutions_it
>>>
>>> -------------------------------------------------------
>>
>>
>
>
>
> This transmission is intended for the named addressee(s) only and may
> contain sensitive or protectively marked material up to RESTRICTED and
> should be handled accordingly. Unless you are the named addressee (or
> authorised to receive it for the addressee) you may not copy or use it,
> or
> disclose it to anyone else. If you have received this transmission in
> error
> please notify the sender immediately. All email traffic sent to or from
> us,
> including without limitation all GCSX traffic, may be subject to
> recording
> and/or monitoring in accordance with relevant legislation.
>
>
> _______________________________________________
> Geoserver-users mailing list
> Geoserver-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geoserver-users
>