Hello List,
We are implementing a GeoServer-based application where we need to import bulky CSV data into Oracle. We think the importer is a good fit for this, since it can handle CSV and can write to an OCI-based Oracle datastore. However, before we go further, we have been doing some evaluation.
The system specs are as follows:
- OS: RH 6.5
- DB: Oracle 12x
- GeoServer version: 2.8.1
- Data size: 23 million records (1.5 GB, but could be up to 5 GB)
From our initial evaluations, we have the following questions regarding the inner workings of the importer:
- CSV Data:
- What separator does the importer use by default?
- Is the separator character configurable? We noticed that it only accepts a comma as a separator.
- Does quoting or not quoting column values have an effect on the importer's ability to parse the data? (In our experiments it did not.)
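For reference, the two input variants from our quoting experiment can be sketched as below. This uses Python's standard csv module purely to illustrate the inputs; it says nothing about how the importer parses internally, and the column names and values are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: the same record with and without quoted values.
unquoted = "id,name,lat,lon\n1,Bonn,50.73,7.10\n"
quoted = '"id","name","lat","lon"\n"1","Bonn","50.73","7.10"\n'


def parse(text, delimiter=","):
    """Parse CSV text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


rows_unquoted = parse(unquoted)
rows_quoted = parse(quoted)

# With a comma separator, both variants yield identical rows.
same = rows_unquoted == rows_quoted
print(same)
```

A standard CSV parser treats both variants identically, which matches what we observed with the importer.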
- File upload:
- When uploading the CSV, is the data compressed by default, and if so which compression? GZIP?
- If the data is not compressed by default, can we compress it on the client side (say, using curl)?
- If client-side compression is possible, how does the server handle a compressed stream? Does it check whether the incoming data is compressed?
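To make the compression questions concrete, this is the kind of client-side compression and server-side check we have in mind. It is a sketch of the general technique (detecting gzip by its two leading magic bytes), not a description of what the importer actually does; the function names are ours.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def compress_csv(data: bytes) -> bytes:
    """Client side: gzip-compress the raw CSV bytes before upload."""
    return gzip.compress(data)


def looks_gzipped(payload: bytes) -> bool:
    """Server side: check the leading magic bytes of the payload."""
    return payload[:2] == GZIP_MAGIC


def read_payload(payload: bytes) -> bytes:
    """Return plain CSV bytes, decompressing only if the payload is gzip."""
    return gzip.decompress(payload) if looks_gzipped(payload) else payload


raw = b"id,name\n1,Bonn\n"  # hypothetical sample data
uploaded = compress_csv(raw)
restored = read_payload(uploaded)
```

Whether the importer performs such a check, or instead relies on an HTTP header such as Content-Encoding, is exactly what we would like to know.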
- Database writes:
- Do writes begin as soon as the first row of the CSV data is read on the server side, or does the importer wait until the entire CSV is uploaded before it starts writing to the database?
- Is there a way we can optimize the writes? Can they be parallelized?
- Which settings in GeoServer affect the execution of the importer during the write operations?
- What can we do on the database side in order to make the process faster? (22 million rows took 3 days to write)
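To put the three-day figure in perspective, the average write rate can be worked out as follows. The row count and duration are the ones stated above; the helper function is only for illustration.

```python
rows = 22_000_000            # rows written in our test run
seconds = 3 * 24 * 60 * 60   # three days = 259,200 seconds

# Average sustained write rate over the whole run.
rows_per_second = rows / seconds


def hours_needed(total_rows: int, rate: float) -> float:
    """Hours required to write total_rows at a given rows-per-second rate."""
    return total_rows / rate / 3600


print(round(rows_per_second, 1))  # roughly 85 rows per second
```

An average of about 85 rows per second seems very low for a bulk load, which is why we suspect the writes are not batched or are committed row by row.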
- Monitoring:
- The progress interface reports only the number of rows inserted, the total number of rows, and the task status. Are these the only things that can be shown in a progress report?
- Is it possible also to retrieve info about:
- Start time/date
- End time/date
- Date of last successful import etc
- Others:
- Are we using the importer for the right task (uploading bulk data)?
- Was the importer envisioned for such use cases?
- Is there another known working alternative within the GeoServer realm?
Thanks and regards,
Moses
T-Systems International GmbH
Telekom IT
System Integration Telekom IT
E-TSOEW0103
Moses Gone
+49 228 98413510 (Tel.)
E-Mail: moses.gone@anonymised.com