I have work underway that uses a series of scripts to call mapcalc and crunch really large datasets on different parts of the world. I can run the scripts in separate instances of grass.
My understanding is that grass is designed to read and write a lot to the hard drive so that it can handle large data sets. So my question is: would it be faster to get two 250 gig SATA drives with one region of the world on each, or one 500 gig drive with both data sets?
Advice appreciated!
Jerry
Gerald Nelson
Professor, Dept. of Agricultural and Consumer Economics
University of Illinois, Urbana-Champaign
office: 217-333-6465
cell: 217-390-7888
315 Mumford Hall
1301 W. Gregory
Urbana, IL 61801
Gerald Nelson wrote:
I have work underway that uses a series of scripts to call mapcalc and
crunch really large datasets on different parts of the world. I can
run the scripts in separate instances of grass.
My understanding is that grass is designed to read and write a lot to
the hard drive so that it can handle large data sets. So my question
is would it be faster to get two 250 gig SATA drives with one region of
the world on each or one 500 gig drive with both data sets?
I'm not sure whether it would make any difference. Unless the
processing is particularly simple and you're storing maps
uncompressed, the processing will be the bottleneck, rather than disk
I/O.
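
One rough way to check which case applies is to compare the CPU time a job consumes with the wall-clock time it takes. The sketch below is only an illustration in Python, not part of GRASS: the helper name cpu_fraction is made up, and the job passed to it stands in for whatever runs one of your scripts (for example via subprocess.run). A ratio near 1 suggests the processing is the bottleneck; a ratio well below 1 suggests the job spends much of its time waiting, possibly on disk. On a multi-core machine the ratio can exceed 1.

```python
import os
import time


def cpu_fraction(job):
    """Run job() and return (CPU time used) / (wall-clock time elapsed).

    CPU time covers this process plus any child processes that job()
    started and waited for. Child times are reported as zero on Windows,
    so this is mainly useful on Unix-like systems.
    """
    wall_start = time.perf_counter()
    cpu_start = os.times()
    job()
    cpu_end = os.times()
    wall = time.perf_counter() - wall_start
    # The first four fields are user, system, children_user, children_system.
    cpu = sum(cpu_end[:4]) - sum(cpu_start[:4])
    return cpu / wall if wall > 0 else 0.0


def busy_wait(seconds):
    """Stand-in for a CPU-bound job: spin until the time is up."""
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


if __name__ == "__main__":
    # Stand-in jobs only; replace them with a call that runs a real script.
    print("sleeping job:", round(cpu_fraction(lambda: time.sleep(0.3)), 2))
    print("spinning job:", round(cpu_fraction(lambda: busy_wait(0.3)), 2))
```

The measurement is coarse (CPU times are typically reported in 10 ms ticks), so it is only meaningful for jobs that run for at least a few seconds.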
Also, larger drives often have more platters and thus a higher
sustained transfer rate.
Finally, if you have multiple drives, you can configure them as a
RAID-0 array so that they appear to be a single large drive, with
blocks alternated between the physical drives. This means that you
don't need to be accessing both datasets concurrently to get the
performance benefit.
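
To illustrate the alternation, here is a minimal Python sketch of how a striped array could map logical blocks onto physical drives. It is a simplification under stated assumptions: the function names are invented for the example, and it alternates one block at a time, whereas real RAID-0 implementations alternate in larger chunks (stripe units) of a configurable size.

```python
def locate_block(logical_block, num_drives=2):
    """Return (drive_index, block_on_drive) for a logical block.

    Assumes the simplest possible layout: consecutive logical blocks
    go to consecutive drives, wrapping around after the last drive.
    """
    return logical_block % num_drives, logical_block // num_drives


def drives_touched(first_block, count, num_drives=2):
    """Return the sorted list of drives used by a run of consecutive blocks."""
    return sorted(
        {locate_block(b, num_drives)[0] for b in range(first_block, first_block + count)}
    )


if __name__ == "__main__":
    # A sequential read of a single dataset is spread over both drives.
    for block in range(6):
        print(block, locate_block(block))
    print(drives_touched(10, 4))
```

Because any run of two or more consecutive blocks lands on both drives, a single job reading one dataset sequentially can already draw on the combined transfer rate.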
--
Glynn Clements <glynn@gclements.plus.com>