[GRASS5] d.vect.area

Hi all,

I tested d.vect.area on a very moderate (589 areas, no islands
AFAIK) vector file, but even on such a file the speed difference was
significant in comparison to d.area: 3 seconds for the former
compared to 5.5 seconds for the latter.

I was just wondering, Eric, why you decided to implement
d.vect.area with a cat-rgb file option, and not with the catnum
option that was in d.area. This means that for scripts like the
d.area.class I posted last week, I will have to go through a
temporary file which will hamper performance again... I guess all
this will not be a problem with the new vector format, but for the
time being, I liked the catnum option.

Moritz

On Thu, 31 Jan 2002 14:11:32 +0100, "M Lennert" <fa1079@qmul.ac.uk> wrote:

> Hi all,
>
> I tested d.vect.area on a very moderate (589 areas, no islands
> AFAIK) vector file, but even on such a file the speed difference was
> significant in comparison to d.area: 3 seconds for the former
> compared to 5.5 seconds for the latter.
>
> I was just wondering, Eric, why you decided to implement
> d.vect.area with a cat-rgb file option, and not with the catnum
> option that was in d.area. This means that for scripts like the
> d.area.class I posted last week, I will have to go through a
> temporary file which will hamper performance again... I guess all
> this will not be a problem with the new vector format, but for the
> time being, I liked the catnum option.

I thought it'd be easier to figure out a color scheme once, and then
reuse it. Also, it's a bit faster to approach it that way if you
want to draw in multiple colors (you only have to go through the
entire dataset once). And it does use the category number (not the label).

I, personally, wanted to be able to save a color scheme after
deciding on it, and be able to easily reuse it. That's the main
reason (selfish, I know). I might be talked into the "catnum"
argument, but that approach isn't very amenable to more than
a few categories (there's a limit on command line argument
length). Also, I'd think about having your script generate a
"legend" file that the user could save for reuse. Besides,
the performance hit of generating even a temporary file is
probably insubstantial compared to iterating through the
dataset multiple times to draw each color. Color lookups
in d.vect.area should be pretty fast even for a very
large number of categories (unlike the raster color lookups,
I'm using a balanced tree which scales well).
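
A minimal sketch of the lookup idea described above, in Python rather than the module's C. The legend layout (one "cat r g b" line per category) and the function names are assumptions made for illustration, not the actual d.vect.area file format or code. A sorted list searched with bisect stands in for the balanced tree: both give O(log n) lookups, so each area can be coloured in a single pass over the dataset.

```python
from bisect import bisect_left


def parse_legend(text):
    """Parse 'cat r g b' lines (hypothetical layout) into sorted parallel lists."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cat, r, g, b = (int(field) for field in line.split())
        entries[cat] = (r, g, b)
    cats = sorted(entries)
    colors = [entries[c] for c in cats]
    return cats, colors


def lookup(cats, colors, cat, default=None):
    """Binary search for a category number; O(log n) like a balanced tree."""
    i = bisect_left(cats, cat)
    if i < len(cats) and cats[i] == cat:
        return colors[i]
    return default


def colorize(area_cats, cats, colors):
    """Single pass over the areas, pairing each category with its colour."""
    return [(c, lookup(cats, colors, c)) for c in area_cats]


LEGEND = """\
# cat r g b
10001 255 0 0
10002 0 255 0
10003 0 0 255
"""

cats, colors = parse_legend(LEGEND)
result = colorize([10003, 10001, 99999], cats, colors)
print(result)  # categories missing from the legend map to None
```

Because the legend lives in a file, the same colour scheme can be saved and reused across runs, which is the motivation given above.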

I was also thinking about creating some palettes for the
"random" color option (and perhaps making it really random).
The standard GRASS colors are fairly boring.

--
Eric G. Miller <egm2@jps.net>

M Lennert wrote:

> I was just wondering, Eric, why you decided to implement
> d.vect.area with a cat-rgb file option, and not with the catnum
> option that was in d.area. This means that for scripts like the
> d.area.class I posted last week, I will have to go through a
> temporary file which will hamper performance again... I guess all
> this will not be a problem with the new vector format, but for the
> time being, I liked the catnum option.

d.area's "catnum=" option just takes a list of categories, not
category/colour pairs.

Also, there's a limit on the maximum length of a command line (or,
more usually, of the combined size of the command line and the
environment list). Passing the colours via the command line would
limit the maximum number of categories which could be coloured.

Ideally, vector layers should probably have associated colour tables,
as is the case for raster layers.

As for efficiency, a version of d.area.class based upon d.vect.area
would only need to create the legend file and call d.vect.area once,
rather than having to call d.area once per colour.
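
A rough sketch of the difference in the number of module calls, in Python. The category numbers, colours, helper names and the "cat r g b" legend layout are all hypothetical; no GRASS command is actually run here. With d.area, a script needs one call per distinct colour; with a legend file, it writes the file once and makes a single call.

```python
import os
import tempfile
from collections import defaultdict


def group_by_colour(cat_colors):
    """Group categories by colour: one d.area call would be needed per group."""
    groups = defaultdict(list)
    for cat, color in sorted(cat_colors.items()):
        groups[color].append(cat)
    return dict(groups)


def write_legend(cat_colors):
    """Write a temporary legend file (hypothetical 'cat r g b' layout)."""
    fd, path = tempfile.mkstemp(suffix=".legend", text=True)
    with os.fdopen(fd, "w") as f:
        for cat, (r, g, b) in sorted(cat_colors.items()):
            f.write(f"{cat} {r} {g} {b}\n")
    return path


# Made-up classification: four categories in two colour classes.
cat_colors = {
    10001: (255, 0, 0),
    10002: (255, 0, 0),
    10003: (0, 0, 255),
    10004: (0, 0, 255),
}

groups = group_by_colour(cat_colors)
legend_path = write_legend(cat_colors)
with open(legend_path) as f:
    legend_lines = f.read().splitlines()
os.remove(legend_path)  # clean up the temporary file

print(len(groups), "calls with one colour each, versus 1 call with a legend file")
```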

--
Glynn Clements <glynn.clements@virgin.net>

From: Glynn Clements <glynn.clements@virgin.net>
Date sent: Thu, 31 Jan 2002 16:03:39 +0000
To: "M Lennert" <fa1079@qmul.ac.uk>
Copies to: grass5@grass.itc.it
Subject: Re: [GRASS5] d.vect.area

> M Lennert wrote:
>
> > I was just wondering, Eric, why you decided to implement
> > d.vect.area with a cat-rgb file option, and not with the catnum
> > option that was in d.area. This means that for scripts like the
> > d.area.class I posted last week, I will have to go through a
> > temporary file which will hamper performance again... I guess all
> > this will not be a problem with the new vector format, but for the
> > time being, I liked the catnum option.
>
> d.area's "catnum=" option just takes a list of categories, not
> category/colour pairs.

Well, it does in the sense that you can list all the categories you want to plot and the color you
want to plot them in...

> Also, there's a limit on the maximum length of a command line (or,
> more usually, of the combined size of the command line and the
> environment list). Passing the colours via the command line would
> limit the maximum number of categories which could be coloured.

I guess my own test examples were just too limited to explode the command line... As I said, I
only tried with 589 areas.

> Ideally, vector layers should probably have associated colour tables,
> as is the case for raster layers.

Well, this will be addressed by the new vector format, won't it?

> As for efficiency, a version of d.area.class based upon d.vect.area
> would only need to create the legend file and call d.vect.area once,
> rather than having to call d.area once per colour.

I'm quite inexperienced in programming, but doesn't writing to a file take much more time
than calling a module?

Thanks for your response!

Moritz

M Lennert wrote:

> > > I was just wondering, Eric, why you decided to implement
> > > d.vect.area with a cat-rgb file option, and not with the catnum
> > > option that was in d.area. This means that for scripts like the
> > > d.area.class I posted last week, I will have to go through a
> > > temporary file which will hamper performance again... I guess all
> > > this will not be a problem with the new vector format, but for the
> > > time being, I liked the catnum option.
> >
> > d.area's "catnum=" option just takes a list of categories, not
> > category/colour pairs.
>
> Well, it does in the sense that you can list all the categories you want to plot and the color you
> want to plot them in...

"catnum=" takes a list of categories; "fillcolor=" takes a single
colour. d.area fills a set of categories in a single colour.
d.vect.area fills each category in a (potentially) different colour.

> > Also, there's a limit on the maximum length of a command line (or,
> > more usually, of the combined size of the command line and the
> > environment list). Passing the colours via the command line would
> > limit the maximum number of categories which could be coloured.
>
> I guess my own test examples were just too limited to explode the command line... As I said, I
> only tried with 589 areas.

Is that one call to d.area with 589 categories or 589 calls to d.area,
each with a single category?

Assuming ARG_MAX == 4096 (typical), the former would allow less than 7
characters per category (7 * 589 = 4123). That is just sufficient if
each category is just an integer, but you wouldn't be able to have 589
number/colour pairs.
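
The arithmetic above can be checked with a short Python sketch. The 4096-byte limit is the value assumed above, the five-digit category numbers are made up, and the "cat:r:g:b" pair syntax is purely hypothetical (d.area has no such option). The sketch also ignores the environment list, which shares the same budget, so the real headroom is smaller.

```python
import os

ARG_MAX_ASSUMED = 4096  # value assumed in the discussion above
N_CATEGORIES = 589

# Average number of characters available per category, separators included.
budget = ARG_MAX_ASSUMED / N_CATEGORIES


def list_length(items, sep=","):
    """Length of the items joined into one command-line argument."""
    return len(sep.join(str(item) for item in items))


# Made-up five-digit category numbers.
five_digit = range(10000, 10000 + N_CATEGORIES)
plain = list_length(five_digit)

# Hypothetical "cat:r:g:b" pairs, only to show how quickly pairs outgrow the limit.
pairs = list_length(f"{c}:255:255:255" for c in five_digit)


def system_arg_max():
    """Ask the running system for its real limit, if it exposes one."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None  # not available on this platform


print(f"budget per category: {budget:.2f} characters")
print(f"plain list: {plain} characters, fits: {plain < ARG_MAX_ASSUMED}")
print(f"cat/colour pairs: {pairs} characters, fits: {pairs < ARG_MAX_ASSUMED}")
print(f"SC_ARG_MAX on this system: {system_arg_max()}")
```

On many current systems the reported limit is far larger than 4096, so the exact numbers depend on the platform; the point that pairs need several times more room than bare integers still holds.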

> > Ideally, vector layers should probably have associated colour tables,
> > as is the case for raster layers.
>
> Well, this will be addressed by the new vector format, won't it?

I would hope so.

> > As for efficiency, a version of d.area.class based upon d.vect.area
> > would only need to create the legend file and call d.vect.area once,
> > rather than having to call d.area once per colour.
>
> I'm quite inexperienced in programming, but doesn't writing to a file take much more time
> than calling a module?

Spawning a command once could potentially take more time than writing
a file (particularly on Cygwin). Both fork() and execve() do quite a
lot of work; program startup (loading shared libraries, relocation)
can also be quite expensive.

Spawning a command multiple times will definitely take more time.
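
One way to get a feel for this is to time both operations. The sketch below is in Python and makes two assumptions: the file contents are a made-up 589-line legend, and the spawned command is a Python interpreter that does nothing, which is heavier to start than a small C program such as a GRASS module. Treat the numbers as an order-of-magnitude comparison on your own machine, not as a measurement of d.area or d.vect.area.

```python
import os
import subprocess
import sys
import tempfile
import time


def time_file_write(n_lines=589):
    """Time writing a small temporary file of n_lines made-up legend entries."""
    start = time.perf_counter()
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as f:
        for cat in range(n_lines):
            f.write(f"{cat} 255 0 0\n")
    elapsed = time.perf_counter() - start
    os.remove(path)  # cleanup is not included in the timing
    return elapsed


def time_spawn(n_calls=1):
    """Time spawning a do-nothing child process n_calls times."""
    start = time.perf_counter()
    for _ in range(n_calls):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"write one file:     {time_file_write():.4f} s")
    print(f"spawn one process:  {time_spawn(1):.4f} s")
    print(f"spawn 10 processes: {time_spawn(10):.4f} s")
```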

--
Glynn Clements <glynn.clements@virgin.net>

From: Glynn Clements <glynn.clements@virgin.net>
Date sent: Thu, 31 Jan 2002 18:37:28 +0000
To: "M Lennert" <fa1079@qmul.ac.uk>
Copies to: grass5@grass.itc.it
Subject: Re: [GRASS5] d.vect.area

> M Lennert wrote:
>
> > > > I was just wondering, Eric, why you decided to implement
> > > > d.vect.area with a cat-rgb file option, and not with the catnum
> > > > option that was in d.area. This means that for scripts like the
> > > > d.area.class I posted last week, I will have to go through a
> > > > temporary file which will hamper performance again... I guess all
> > > > this will not be a problem with the new vector format, but for the
> > > > time being, I liked the catnum option.
> > >
> > > d.area's "catnum=" option just takes a list of categories, not
> > > category/colour pairs.
> >
> > Well, it does in the sense that you can list all the categories you want to plot and the color you
> > want to plot them in...
>
> "catnum=" takes a list of categories; "fillcolor=" takes a single
> colour. d.area fills a set of categories in a single colour.
> d.vect.area fills each category in a (potentially) different colour.
>
> > > Also, there's a limit on the maximum length of a command line (or,
> > > more usually, of the combined size of the command line and the
> > > environment list). Passing the colours via the command line would
> > > limit the maximum number of categories which could be coloured.
> >
> > I guess my own test examples were just too limited to explode the command line... As I said, I
> > only tried with 589 areas.
>
> Is that one call to d.area with 589 categories or 589 calls to d.area,
> each with a single category?
>
> Assuming ARG_MAX == 4096 (typical), the former would allow less than 7
> characters per category (7 * 589 = 4123). That is just sufficient if
> each category is just an integer, but you wouldn't be able to have 589
> number/colour pairs.

Ok, I understand. My categories were five-digit integers, so that was alright even doing one call
to d.area with 589 categories in one color.

> > > As for efficiency, a version of d.area.class based upon d.vect.area
> > > would only need to create the legend file and call d.vect.area once,
> > > rather than having to call d.area once per colour.
> >
> > I'm quite inexperienced in programming, but doesn't writing to a file take much more time
> > than calling a module?
>
> Spawning a command once could potentially take more time than writing
> a file (particularly on Cygwin). Both fork() and execve() do quite a
> lot of work; program startup (loading shared libraries, relocation)
> can also be quite expensive.
>
> Spawning a command multiple times will definitely take more time.

Ok, well I'll reimplement d.area.class for use with d.vect.area...

Thanks for the insights!

Moritz