[GRASS-dev] trying to amend d.vect.chart - need for advice

Hello,

With a colleague we are trying to amend d.vect.chart in order to

1) speed it up
2) allow a where clause

Currently, d.vect.chart loops through each vector feature and opens a new db cursor to fetch the column information. On a map with a fair number of features (20464 centroids) and the table in PostgreSQL, the db connection seems to take most of the time (even worse, obviously, when the map is linked to a view which needs to be recalculated for every feature).

So currently, the program's logic is as follows (in display/d.vect.chart/plot.c):

- get number of features (little aside question: why is this done with Vect_get_num_lines(), which should return the number of lines, not the number of features? As you can see, I'm very new to the vector library)
- loop through each feature:
    - get cat of feature
    - open cursor selecting columns [and sizecol] for this feature according to cat
    - close cursor
    - plot with this info

We would like to modify this according to the following logic:

- add a 'where' option
- open a cursor selecting cat, columns [and sizecol] limited by the where option
- loop through the cursor:
    - find x,y values according to cat
    - plot with this info
- close cursor
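
As a rough illustration of the single-cursor idea, the select statement could be assembled once, with the where option appended only when it is given. This is a minimal sketch in plain C: build_select() and the table and column names used with it are made up for the example and are not part of d.vect.chart or of the GRASS database library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds
 *   SELECT key, columns[, sizecol] FROM table [WHERE where]
 * into buf. Returns the statement length, or -1 if buf is too small.
 * The where text is passed through as given, without validation. */
int build_select(char *buf, size_t size, const char *key,
                 const char *columns, const char *sizecol,
                 const char *table, const char *where)
{
    int len;
    int has_size = sizecol != NULL && sizecol[0] != '\0';
    int has_where = where != NULL && where[0] != '\0';

    assert(buf != NULL && key != NULL && columns != NULL && table != NULL);

    len = snprintf(buf, size, "SELECT %s, %s%s%s FROM %s%s%s",
                   key, columns,
                   has_size ? ", " : "", has_size ? sizecol : "",
                   table,
                   has_where ? " WHERE " : "", has_where ? where : "");

    /* snprintf reports the length it wanted; anything >= size was cut off */
    if (len < 0 || (size_t)len >= size)
        return -1;
    return len;
}
```

The statement is then opened as one cursor before the loop, instead of one cursor per feature.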

I have two main questions about this:

1) Does this sound reasonable? Is there anything we are missing?
2) Is there a function to get x,y point values according to cat value?

Thanks,

Moritz

I know I may be asking a lot. But if you are successful with this project,
what do you think about turning d.vect.thematic into a C-code module too?
Many of the functions might be similar to those in d.vect.chart.

Michael
__________________________________________
Michael Barton, Professor of Anthropology
School of Human Evolution & Social Change
Center for Social Dynamics and Complexity
Arizona State University

phone: 480-965-6213
fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Date: Wed, 06 Sep 2006 14:17:23 +0200
To: Grass Developers List <grass-dev@grass.itc.it>
Subject: [GRASS-dev] trying to amend d.vect.chart - need for advice

...

On Wed, September 6, 2006 21:24, Michael Barton wrote:

I know I may be asking a lot. But if you are successful with this project,
what do you think about turning d.vect.thematic into a C-code module too?
Many of the functions might be similar to those in d.vect.chart.

In the discussions about how to amend d.vect.chart, we have actually come
to the conclusion that what you suggest is what we need ideally. It just
is a larger project, so we might start with amending d.vect.chart.

The biggest problem is actually to learn the vector library. I have to
admit that I don't find the programmer's manual the most intuitive to find
what I am looking for.

Moritz

On Wed, Sep 06, 2006 at 09:44:14PM +0200, Moritz Lennert wrote:
...

The biggest problem is actually to learn the vector library. I have to
admit that I don't find the programmer's manual the most intuitive to find
what I am looking for.

Hi Moritz,

... but what are you looking for? :-)
To make it better, we need concrete questions to be asked
which could then be answered in the manual.

Don't hesitate to post it here, or better, do that
in the Wiki. Later we can transfer that into the
doxygen pages.

If we throw our knowledge together, we may get a reasonable
progman soon.

Markus

Moritz Lennert wrote:

...

We would like to modify this according to the following logic:

- add a 'where' option
- open a cursor selecting cat, columns [and sizecol] limited by the
where option
- loop through the cursor:
    - find x,y values according to cat
    - plot with this info
- close cursor

Plotting to the screen is probably the slowest part, and having that inside
the loop probably slows things down. As a prototype, could you write a
shell script that runs v.extract + d.vect.chart?

But it shouldn't be too hard to do as you suggest, and I imagine that if you
leave the actual close driver/stabilize call until after the loop, it
shouldn't be much slower.

Hamish

Markus Neteler wrote:

On Wed, Sep 06, 2006 at 09:44:14PM +0200, Moritz Lennert wrote:
...

The biggest problem is actually to learn the vector library. I have to
admit that I don't find the programmer's manual the most intuitive to find
what I am looking for.

Hi Moritz,

... but what are you looking for? :-)
To make it better, we need concrete questions to be asked
which could then be answered in the manual.

Just as an example: I was trying to find the definition of the Map_info
structure. It's not in data structures...

In its current form, you need to know in which file a function is and
where this file is in the directory structure to easily find it.
... well I just saw that if you go into "Globals" you have an index, I
have to admit that I never thought of looking there... (but still no
Map_info structure definition)

Don't hesitate to post it here, or better, do that
in the Wiki.

Where do you want me to post development questions in the Wiki?

Later we can transfer that into the
doxygen pages.

If we throw our knowledge together, we may get a reasonable
progman soon.

I guess I'll just have to find the time to really dig into it more and
get used to GRASS' library structure.

Moritz

Hamish wrote:

Moritz Lennert wrote:

...

We would like to modify this according to the following logic:

- add a 'where' option
- open a cursor selecting cat, columns [and sizecol] limited by the where option
- loop through the cursor:
    - find x,y values according to cat
    - plot with this info
- close cursor

plotting to the screen is probably the slowest part; having that inside
the loop is probably a slow down ..

Yes, but unless you create a temporary map with the charts as objects, I don't see how you could avoid plotting every chart separately.

as a prototype could you write a
shell script that runs v.extract + d.vect.chart?

This would show an implementation of a where clause, but would not change the creation of a new cursor for every chart. But I can do that.

But it shouldn't be too hard to do as you suggest, and I imagine if you
leave the actual close driver/stabilize call until after the loop it
shouldn't be much slower.

Any hints as to a function to get x,y point values according to cat value?

Moritz

After an offline discussion with Hamish, I come back to the list, with a revised proposal and still some need for advice.

As a reminder, here's the issue:

Moritz Lennert wrote:

Hello,

With a colleague we are trying to amend d.vect.chart in order to

1) speed it up
2) allow a where clause

Currently, d.vect.chart loops through each vector feature and opens a new db cursor to fetch the column information. On a map with a fair number of features (20464 centroids) and the table in PostgreSQL, the db connection seems to take most of the time (even worse, obviously, when the map is linked to a view which needs to be recalculated for every feature).

So currently, the program's logic is as follows (in display/d.vect.chart/plot.c):

- get number of features (little aside question: why is this done with Vect_get_num_lines(), which should return the number of lines, not the number of features? As you can see, I'm very new to the vector library)
- loop through each feature:
   - get cat of feature
   - open cursor selecting columns [and sizecol] for this feature according to cat
   - close cursor
   - plot with this info

We would like to modify this according to the following logic:

- add a 'where' option
- open a cursor selecting cat, columns [and sizecol] limited by the where option
- loop through the cursor:
   - find x,y values according to cat
   - plot with this info
- close cursor
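
Whether the vector library already offers a lookup by cat is the open question in this thread. Absent that, the "find x,y values according to cat" step could be sketched as follows, assuming the features are read from the map once and sorted by cat, so that each row coming from the cursor costs only a binary search. Everything here (CatPoint, Row, build_index(), find_by_cat(), plot_rows()) is a hypothetical stand-in in plain C, not GRASS API; the cursor and the map are replaced by in-memory arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not GRASS structures */
typedef struct { int cat; double x, y; } CatPoint;  /* one map feature */
typedef struct { int cat; double value; } Row;      /* one fetched table row */

static int cmp_cat(const void *a, const void *b)
{
    int ca = ((const CatPoint *)a)->cat;
    int cb = ((const CatPoint *)b)->cat;

    return (ca > cb) - (ca < cb);
}

/* Sort the features by cat once, before the cursor loop starts. */
void build_index(CatPoint *pts, size_t n)
{
    qsort(pts, n, sizeof(CatPoint), cmp_cat);
}

/* Return a feature carrying this cat, or NULL if the map has none.
 * If several features share the cat, any one of them may be returned. */
const CatPoint *find_by_cat(const CatPoint *pts, size_t n, int cat)
{
    CatPoint key;

    key.cat = cat;
    key.x = key.y = 0.0;
    return bsearch(&key, pts, n, sizeof(CatPoint), cmp_cat);
}

/* Loop over the rows of the single cursor. Returns the number of charts
 * that would be plotted; rows without a matching feature are skipped. */
size_t plot_rows(const CatPoint *pts, size_t n_pts,
                 const Row *rows, size_t n_rows)
{
    size_t i, plotted = 0;

    assert(pts != NULL && rows != NULL);
    for (i = 0; i < n_rows; i++) {
        const CatPoint *p = find_by_cat(pts, n_pts, rows[i].cat);

        if (p == NULL)
            continue;
        /* a real module would draw the chart at (p->x, p->y) here */
        plotted++;
    }
    return plotted;
}
```

Building the index costs one pass over the map plus a sort, after which the per-row work no longer depends on a database round trip.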

Now, after discussion, there seem to be the following issues with our solution:

- If a map is linked to a table containing a lot of rows that are not linked to any object in the map, we have to loop through all these rows, with the search for x,y failing each time. This might slow things down again. One path towards a solution would be to get the list of cats from the map and include them in the where clause of the cursor's select statement, but if there are a lot of objects this would be a _very_ long where clause, which is probably not feasible.

- A given cat value may correspond to several objects, and thus to several x,y values. For chart objects, it should probably be considered a bug that the same chart is drawn several times on the map. If you want to see a country's total population, you don't want to see the same circle repeated on every island belonging to this country. But since we cannot automatically tell which object is the "mainland", this has to be left to the user. At least we should issue a warning before painting the same chart several times, so that the user is made aware of the issue and can clean up the map if needed.

- The above only works if we have a function that allows us to find the x,y value on the basis of a given cat value. I still haven't been able to find such a function. Does it exist?
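
The first two issues could be handled in the same lookup step. The sketch below assumes the cats of all map features are available as a sorted array; for each table row it counts how many features carry that cat, so that unmatched rows are skipped (count 0) and repeated cats trigger a warning (count above 1). The function names and the warning text are invented for the example and do not come from GRASS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Index of the first element >= cat in a sorted array (n if none). */
static size_t lower_bound(const int *cats, size_t n, int cat)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cats[mid] < cat)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Number of features carrying this cat; cats must be sorted ascending. */
size_t count_features(const int *cats, size_t n, int cat)
{
    size_t i = lower_bound(cats, n, cat);
    size_t count = 0;

    while (i < n && cats[i] == cat) {
        count++;
        i++;
    }
    return count;
}

/* How many charts a table row would produce: 0 means the row has no
 * object in the map, more than 1 means the chart would be repeated. */
size_t charts_for_row(const int *cats, size_t n, int cat)
{
    size_t count;

    assert(cats != NULL || n == 0);
    count = count_features(cats, n, cat);
    if (count > 1)
        fprintf(stderr,
                "WARNING: cat %d matches %lu features; "
                "the same chart will be drawn %lu times\n",
                cat, (unsigned long)count, (unsigned long)count);
    return count;
}
```

This leaves the choice of the "mainland" to the user, as proposed above, and only reports the repetition.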

Thanks to anyone who can point me in the right direction.

Moritz