[GRASSLIST:3171] advice on building a workstation PC for Grass

Hi,
I'd appreciate any advice on building a new PC with running GRASS in mind. I am currently using:
AMD Duron Processor
cpu MHz : 1294.521
RAM : 750 MB
Linux 2.4.20-28.8
The RAM seems to be fine, but the CPU frequently maxes out on things such as v.in.shape, r.patch, etc.
I'd like to stay under $600 for motherboard, CPU, RAM, hard drive, and case. I know I am being quite vague about what sort of data I'm working with, but I'm sure that people must have opinions about what a dedicated GRASS workstation would have in it.
Thanks,
Phil

For raster processing, fast disks seem to matter as much as anything.
If possible, go with Serial ATA drives and, if you can afford it, a RAID
setup. Gobs of RAM also help, since any extra goes to disk caching, which
speeds things up a lot. Perhaps someone who knows more about the specific
computing requirements of GRASS could give a little more direction on CPU
choice.

Dave

On Wed, 14 Apr 2004, Philipp Molzer wrote:

> I'd appreciate any advice on building a new PC, with running GRASS in
> mind.
> [...]

--
Dave

Philipp Molzer wrote:

> The RAM seems to be fine, but the CPU frequently maxes out on things
> such as v.in.shape, r.patch, etc.

It depends on the nature of the workload. If you are doing simple,
serial-access computations on large amounts of data, disk bandwidth
will be the limiting factor. For computations which involve
random-access, RAM is significant. For intensive computations on small
amounts of data, the CPU speed will be the limiting factor. If you
read certain maps repeatedly, having sufficient RAM to cache them will
improve the situation.

Except in extreme cases where the answer would be obvious, it's almost
impossible to predict which will be the limiting factor. You just have
to observe the behaviour (CPU/memory/disk usage) of an existing
system.
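
One rough way to make that observation on Linux is to compare two snapshots of the aggregate "cpu" line in /proc/stat while a GRASS command is running. The Python sketch below is only an illustration: the function names are made up for this example, and it assumes the field order user, nice, system, idle, iowait, irq, softirq, steal used by 2.6 and later kernels (a 2.4 kernel reports only the first four, in which case the missing counters are treated as zero). A high busy share points at the CPU; a high iowait share points at the disk.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    ticks = [int(x) for x in fields[1:]]
    if len(ticks) < 4:
        raise ValueError("expected at least four tick counters")
    names = ["user", "nice", "system", "idle", "iowait"]
    # Assumption: a missing iowait field (2.4 kernels) is treated as zero.
    counts = {name: (ticks[i] if i < len(ticks) else 0)
              for i, name in enumerate(names)}
    # irq, softirq and steal lumped together; the guest fields are skipped
    # because they are already included in user and nice.
    counts["other"] = sum(ticks[5:8])
    return counts


def usage_between(before, after):
    """Return (busy, iowait, idle) as fractions of the ticks between two samples."""
    delta = {key: after[key] - before[key] for key in before}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    busy = delta["user"] + delta["nice"] + delta["system"] + delta["other"]
    return busy / total, delta["iowait"] / total, delta["idle"] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()
```

Taking one sample with parse_cpu_line(read_cpu_line()), waiting a few seconds while the command runs, taking a second sample, and passing both to usage_between() gives the three shares for that interval. Memory use would have to be checked separately.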

--
Glynn Clements <glynn.clements@virgin.net>

On Apr 14, 2004, at 10:45 PM, Glynn Clements wrote:

> Except in extreme cases where the answer would be obvious, it's almost
> impossible to predict which will be the limiting factor. You just have
> to observe the behaviour (CPU/memory/disk usage) of an existing
> system.

The General Rules of Computing Machinery seem to apply:
1) Every program expands to fill all available memory.
1a) Every database expands to fill all available disks.
2) With respect to RAM, disk space, and CPU speed: some is good, more is better, and too much is just enough!
3) Whatever you build will be obsolete in 18 months.

RAID, 64-bit CPU, 2GB RAM, and two or three big honkin' monitors.

Jim Plante
<jimplante@charter.net>

cr.yp.to's site has great advice. He recommends going with a dual-CPU
machine. I'm wondering if any of the GRASS components are written to be
multi-threaded, to take advantage of such a machine. For example, my
underpowered machine has been r.patch'ing together eight 60MB rasters for
the last 5 hours, and is about 50% done. Would a dual-CPU machine get it
done twice as fast? While it is using 95% of my CPU, it's only using 9%
of memory.

Phil

drew einhorn wrote:

> Over the years I've found the advice at this web page
> quite useful:
>
> http://cr.yp.to/hardware/advice.html
>
> I'd probably bump the memory to 1 GB.


Ah, last time I looked closely he was still recommending
single processor boxes.

Even if it's not multithreaded, you may be able to split the
problem in two manually. Instead of running it on one big
region, maybe you can split it into a north half and a south
half, and r.patch them concurrently.

Depends on whether there's a cheap way to combine two regions.
I'm just a beginner so I don't know.
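
To make the idea concrete, here is a toy sketch in Python rather than in GRASS itself. It assumes that patching works cell by cell, with the first non-null value among the inputs winning, so that each output row depends only on the same row of the inputs. Under that assumption the north and south halves can be patched independently and simply stacked afterwards. The rasters are plain lists of rows with None standing in for null, and patch_rows and patch_in_halves are hypothetical names, not GRASS functions. Threads are used only to keep the example short; for a real speedup on two CPUs the halves would have to run as separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def patch_rows(rasters, start, stop):
    """Patch rows [start, stop): each cell takes the first non-null input value."""
    out = []
    for r in range(start, stop):
        ncols = len(rasters[0][r])
        row = []
        for c in range(ncols):
            value = None
            for raster in rasters:
                if raster[r][c] is not None:
                    value = raster[r][c]
                    break
            row.append(value)
        out.append(row)
    return out


def patch_in_halves(rasters):
    """Patch the north and south halves concurrently, then stack them."""
    nrows = len(rasters[0])
    mid = nrows // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        north = pool.submit(patch_rows, rasters, 0, mid)
        south = pool.submit(patch_rows, rasters, mid, nrows)
        return north.result() + south.result()


# Two tiny 3x2 rasters; None marks a null cell.
a = [[1, None], [None, None], [5, None]]
b = [[9, 2], [None, 4], [None, None]]
whole = patch_rows([a, b], 0, 3)
halves = patch_in_halves([a, b])
```

Here whole and halves come out identical, which is the property that makes the split cheap to recombine. Whether GRASS offers an equally cheap way to join the two output maps is a separate question that this sketch does not answer.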

On Thu, 2004-04-15 at 05:21, Philipp Molzer wrote:

> cr.yp.to's site has great advice. He recommends going with a dual-CPU
> machine. I'm wondering if any of the GRASS components are written to be
> multi-threaded, to take advantage of such a machine.
> [...]
> Would a dual-CPU machine get it done twice as fast?

--
drew einhorn <drew@technteach.com>
Technology and Teaching

Philipp Molzer wrote:

> cr.yp.to's site has great advice. He recommends going with a dual-CPU
> machine. I'm wondering if any of the GRASS components are written to be
> multi-threaded, to take advantage of such a machine.

No. If you want parallelism, you have to implement it yourself (i.e.
run multiple sessions).
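
In practice, running multiple sessions means starting several independent processes and letting the operating system schedule them on separate CPUs. A minimal sketch of that pattern in Python follows. The child commands are placeholder Python one-liners, not GRASS commands, and run_concurrently is a name invented for this example; each real job would need its own independent inputs and outputs.

```python
import subprocess
import sys


def run_concurrently(commands):
    """Start every command as its own process, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]  # all start before any wait
    return [p.wait() for p in procs]


# Placeholder jobs: each child just does some arithmetic and exits.
jobs = [
    [sys.executable, "-c", "sum(i * i for i in range(200000))"],
    [sys.executable, "-c", "sum(i * i for i in range(200000))"],
]
exit_codes = run_concurrently(jobs)
```

This only helps when the work can be divided into jobs that do not depend on each other; a single long-running command gains nothing from the second CPU.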

> For example, my
> underpowered machine has been r.patch'ing together 8 60MB rasters for
> the last 5 hours, and is about 50% done.

That suggests a fundamental flaw in r.patch (or maybe the underlying
libraries). Patching rasters should have almost zero computational
overhead; it should be almost entirely I/O.

Someone should look into this (i.e. build GRASS with profiling
information and figure out where all the CPU time is going).

Unless you have a really slow CPU, or all of the maps are in RAM, it
should be possible to patch rasters together as fast as they can be
read from disk.
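
A back-of-the-envelope calculation with the figures quoted above shows how large the gap is. The sketch assumes that 60MB is the amount of data actually read for each raster, and it takes 20 MB/s as a purely illustrative sustained disk read rate; neither number was measured.

```python
RASTERS = 8                  # from the report quoted above
SIZE_MB = 60                 # per raster, from the same report
HOURS_ELAPSED = 5
FRACTION_DONE = 0.5
ASSUMED_DISK_MB_PER_S = 20   # illustrative assumption, not a measurement

total_mb = RASTERS * SIZE_MB
processed_mb = total_mb * FRACTION_DONE
observed_mb_per_s = processed_mb / (HOURS_ELAPSED * 3600)
io_bound_seconds = total_mb / ASSUMED_DISK_MB_PER_S

print(f"input data: {total_mb} MB")
print(f"observed throughput: {observed_mb_per_s * 1024:.1f} KB/s")
print(f"time if disk-bound at {ASSUMED_DISK_MB_PER_S} MB/s: {io_bound_seconds:.0f} s")
```

Under these assumptions the job is moving roughly 14 KB of input per second, while a purely disk-bound run over the same 480 MB would take well under a minute.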

Unfortunately, there are some aspects of r.patch which could introduce
significant inefficiencies. Specifically, the body of the per-column
loop in do_patch() could take an order of magnitude (or more) longer
than it needs to. Functions such as G_is_null_value, G_raster_cpy, etc.
could potentially take dozens of clock cycles, when the underlying
operations (comparing/copying machine words) might only take a single
clock cycle.
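
The effect can be illustrated outside GRASS. The Python sketch below is only an analogy, not the r.patch source: patch_row_calls routes every cell through small helper functions, much as a per-column loop might call library routines, while patch_row_inline does the same comparison and copy directly. The helpers and both function names are hypothetical. The results are identical; only the per-cell overhead differs, and timeit makes that visible.

```python
import timeit


def is_null(value):
    """Helper standing in for a per-cell null test in a library."""
    return value is None


def copy_cell(dst, index, value):
    """Helper standing in for a per-cell copy routine in a library."""
    dst[index] = value


def patch_row_calls(top, bottom):
    """Fill nulls in `top` from `bottom`, going through helper calls per cell."""
    out = list(top)
    for i in range(len(out)):
        if is_null(out[i]):
            copy_cell(out, i, bottom[i])
    return out


def patch_row_inline(top, bottom):
    """Same result with the comparison and copy done inline."""
    return [b if t is None else t for t, b in zip(top, bottom)]


# One synthetic row: every odd cell of `top` is null.
top = [None if i % 2 else i for i in range(10000)]
bottom = list(range(10000))
assert patch_row_calls(top, bottom) == patch_row_inline(top, bottom)

t_calls = timeit.timeit(lambda: patch_row_calls(top, bottom), number=20)
t_inline = timeit.timeit(lambda: patch_row_inline(top, bottom), number=20)
print(f"helper calls: {t_calls:.3f} s, inline: {t_inline:.3f} s")
```

The absolute timings depend on the machine and the interpreter, so they say nothing about the size of the effect in the C code; the point is only that the same work per cell can cost very different amounts depending on how many calls it goes through.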

Also, I don't know what the time complexity of G_update_cell_stats()
is; that could potentially dwarf the rest of the loop.

> Would a dual-CPU machine get it done twice as fast?

No.

--
Glynn Clements <glynn.clements@virgin.net>