Hi all,
I agree that CSV is a poor choice for larger datasets or for a high-traffic server: the text nature of the format is going to ruin the party anyway (large size, slow parsing, no indexing).
That said, I wanted to check the situation. I don’t see the CSV store anywhere in the stable series, neither in core nor as an extension. It is, however, included when installing the WPS and importer extensions, as a way to parse and encode CSV files. Those are two legitimate cases, where the CSV is processed one time and turned into something else.
The inclusion has the side effect of having the CSV store show up also in the “new data store” page. The experience configuring it is indeed not great:
The strategy bit could be improved… it’s an enumerated field with possible values guess, AttributesOnly, specify, wkt.
When not specified, the default is “AttributesOnly”, which is not going to guess any geometry. “guess” may seem a better default: it will try to guess which columns are lat and lon (but won’t try to guess a wkt approach).
For the small data set I’ve chosen, “guess” will work:
LAT, LON, CITY, NUMBER, YEAR
46.066667, 11.116667, Trento, 140, 2002
44.9441, -93.0852, St Paul, 125, 2003
13.752222, 100.493889, Bangkok, 150, 2004
45.420833, -75.69, Ottawa, 200, 2004
44.9801, -93.251867, Minneapolis, 350, 2005
46.519833, 6.6335, Lausanne, 560, 2006
48.428611, -123.365556, Victoria, 721, 2007
-33.925278, 18.423889, Cape Town, 550, 2008
-33.859972, 151.211111, Sydney, 436, 2009
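To make the "guess" idea concrete, here is a minimal sketch of how lat/lon columns could be picked from a header like the one above. It is not the store's actual code: the class name, the list of accepted column names, and the naive comma split (no quoting support) are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the CSV store's implementation.
final class LatLonGuessSketch {
    // Assumed candidate names; the real store may accept a different set.
    private static final Set<String> LAT_NAMES = Set.of("lat", "latitude");
    private static final Set<String> LON_NAMES = Set.of("lon", "lng", "long", "longitude");

    /** Returns {latIndex, lonIndex}, or null when either column is missing. */
    static int[] guessLatLon(String headerLine) {
        String[] cols = headerLine.split(","); // naive: ignores quoted commas
        int lat = -1;
        int lon = -1;
        for (int i = 0; i < cols.length; i++) {
            String name = cols[i].trim().toLowerCase(Locale.ROOT);
            if (lat < 0 && LAT_NAMES.contains(name)) {
                lat = i;
            } else if (lon < 0 && LON_NAMES.contains(name)) {
                lon = i;
            }
        }
        return (lat >= 0 && lon >= 0) ? new int[] {lat, lon} : null;
    }

    /** Builds a WKT point (x = lon, y = lat) from one data row. */
    static String toWktPoint(String row, int[] latLon) {
        String[] cells = row.split(",");
        double lat = Double.parseDouble(cells[latLon[0]].trim());
        double lon = Double.parseDouble(cells[latLon[1]].trim());
        return "POINT (" + lon + " " + lat + ")";
    }

    public static void main(String[] args) {
        int[] idx = guessLatLon("LAT, LON, CITY, NUMBER, YEAR");
        System.out.println(toWktPoint("46.066667, 11.116667, Trento, 140, 2002", idx));
    }
}
```

With a header that has no recognizable lat/lon pair, the sketch returns null, which would correspond to falling back to attributes only.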
My quick take: for a store that shows in the UI, we need something a bit better.
Ideas:
- Strategies should be enumerated, e.g. this in the store already helps a bit:
public static final Param STRATEGYP = new Param("strategy", String.class, "strategy", false, GUESS_STRATEGY, new KVP(Param.OPTIONS, new ArrayList<>(List.of(GUESS_STRATEGY, ATTRIBUTES_ONLY_STRATEGY, SPECIFC_STRATEGY, WKT_STRATEGY))));
- Strategy names should be a bit easier to understand (specify … what?)
- Field description should be a bit more consistent (the strategy name in the wkt longField description is not visible in the drop down):
- “guess” should probably make an effort to guess a WKT field too
- The write-only settings (writeprj, quoteAll) should probably be placed last
Once that is done, we’d have a store that could be used as an extension if needed, with documentation clearly specifying its performance limits…
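As an illustration of the "guess a WKT field too" idea from the list above, a rough sketch could scan already-parsed rows for a column whose values all start with a WKT geometry keyword. Everything here is hypothetical (class name, method names, the keyword list), it ignores Z/M variants such as "POINT Z (...)", and it assumes the rows were split by a real CSV parser, since WKT values contain commas and must be quoted in the file.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not the CSV store's implementation.
final class WktGuessSketch {
    private static final List<String> WKT_TYPES = List.of(
            "POINT", "LINESTRING", "POLYGON",
            "MULTIPOINT", "MULTILINESTRING", "MULTIPOLYGON",
            "GEOMETRYCOLLECTION");

    /** True when the value starts with a geometry keyword followed by "(" or EMPTY. */
    static boolean looksLikeWkt(String value) {
        String v = value.trim().toUpperCase(Locale.ROOT);
        for (String type : WKT_TYPES) {
            if (v.startsWith(type)) {
                String rest = v.substring(type.length()).trim();
                return rest.startsWith("(") || rest.equals("EMPTY");
            }
        }
        return false;
    }

    /** Returns the first column where every sampled row looks like WKT, or -1. */
    static int guessWktColumn(List<String[]> rows) {
        if (rows.isEmpty()) {
            return -1;
        }
        int columns = rows.get(0).length;
        for (int c = 0; c < columns; c++) {
            boolean allWkt = true;
            for (String[] row : rows) {
                if (c >= row.length || !looksLikeWkt(row[c])) {
                    allWkt = false;
                    break;
                }
            }
            if (allWkt) {
                return c;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"Trento", "POINT (11.116667 46.066667)"},
                new String[] {"Ottawa", "POINT (-75.69 45.420833)"});
        System.out.println(guessWktColumn(rows));
    }
}
```

A "guess" strategy could try the lat/lon column names first and fall back to a check like this one, sampling only the first few rows to keep the cost low.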
Cheers
Andrea