[GRASS5] d.vect.thematic wish: handle null values

Hi,

Trying to make maps with d.vect.thematic I have stumbled upon the problem that
the module fails when there are null values in the table (postgresql):

**********
awk: (FILENAME=- FNR=1) fatal: division by zero attempted

Thematic map legend for column revmed of map ssbxl01

Value range: - 33079
Mapped by 7 intervals of

Color(R:G:B) Value
============ ==========
0:0:250 7 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed >= 7 and revmed<=
ERREUR: L'opérateur n'existe pas : integer <=
HINT: Aucun opérateur correspond au nom donné et aux types d'arguments. Vous
devez ajouter des conversions explicites de type.

35:0:215 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed > and revmed<=
ERREUR: syntax error sur ou près de «and» at character 41

etc...
************

Now I can see two theoretical solutions to this:

1) handle null values
2) allow to give a value that represents null values (e.g. -99999)

For both of these solutions, d.vect.thematic should offer the choice of a
color for these values, so that they do not use the same color scheme as the
'valid' values.

Moritz

I'm not sure that nulls are the problem. It won't accept the <= operator.
Try straight > and < and see what happens. Also, what is FNR=1?

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Reply-To: <mlennert@club.worldonline.be>
Date: Thu, 11 Aug 2005 23:20:00 +0200 (CEST)
To: Grass Developers List <grass5@grass.itc.it>
Cc: <michael.barton@asu.edu>, <dcalvelo@minag.gob.pe>
Subject: d.vect.thematic wish: handle null values

Hi,

Trying to make maps with d.vect.thematic I have stumbled upon the problem that
the module fails when there are null values in the table (postgresql):

**********
awk: (FILENAME=- FNR=1) fatal: division by zero attempted

Thematic map legend for column revmed of map ssbxl01

Value range: - 33079
Mapped by 7 intervals of

Color(R:G:B) Value
============ ==========
0:0:250 7 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed >= 7 and revmed<=
ERREUR: L'opérateur n'existe pas : integer <=
HINT: Aucun opérateur correspond au nom donné et aux types d'arguments. Vous
devez ajouter des conversions explicites de type.

35:0:215 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed > and revmed<=
ERREUR: syntax error sur ou près de «and» at character 41

etc...
************

Now I can see two theoretical solutions to this:

1) handle null values
2) allow to give a value that represents null values (e.g. -99999)

For both of these solutions, d.vect.thematic should offer the choice of a
color for these values, so that they do not use the same color scheme as the
'valid' values.

Moritz

On Fri, August 12, 2005 0:13, Michael Barton said:

I'm not sure that nulls are the problem. It won't accept the <= operator.
Try straight > and < and see what happens. Also, what is FNR=1?

The operators are not defined by me but by d.vect.thematic.
The problem is not the <= operator, but the fact that in the sql statement

SELECT cat FROM ssbxl01 WHERE revmed >= 7 and revmed<=

there should be an integer after the <=.

This integer is missing because the awk script tries to divide by zero because
of the null values.

(FNR is an internal awk variable which stands for the "input record number in
the current input file")

I can confirm that it is the null values that are the problem since if I
replace them with any integer value (0, -99999, etc) d.vect.thematic runs
perfectly, but it creates the classes including the null-replacing integer
since it doesn't know that they are special values, and there is no way of
telling it this.

This is why I plead for one of the two solutions below...

Hope this helps clarifying the issue.

All the best,

Moritz

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Reply-To: <mlennert@club.worldonline.be>
Date: Thu, 11 Aug 2005 23:20:00 +0200 (CEST)
To: Grass Developers List <grass5@grass.itc.it>
Cc: <michael.barton@asu.edu>, <dcalvelo@minag.gob.pe>
Subject: d.vect.thematic wish: handle null values

Hi,

Trying to make maps with d.vect.thematic I have stumbled upon the problem
that
the module fails when there are null values in the table (postgresql):

**********
awk: (FILENAME=- FNR=1) fatal: division by zero attempted

Thematic map legend for column revmed of map ssbxl01

Value range: - 33079
Mapped by 7 intervals of

Color(R:G:B) Value
============ ==========
0:0:250 7 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed >= 7 and revmed<=
ERREUR: L'opérateur n'existe pas : integer <=
HINT: Aucun opérateur correspond au nom donné et aux types d'arguments.
Vous
devez ajouter des conversions explicites de type.

35:0:215 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed > and revmed<=
ERREUR: syntax error sur ou près de «and» at character 41

etc...
************

Now I can see two theoretical solutions to this:

1) handle null values
2) allow to give a value that represents null values (e.g. -99999)

For both of these solutions, d.vect.thematic should offer the choice of a
color for these values, so that they do not use the same color scheme as the
'valid' values.

Moritz

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

Thanks for the clarification Moritz. Is this in the custom breaks section?
This was written by Daniel Calvelo. I'm copying him in case he has any
ideas.

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Reply-To: <mlennert@club.worldonline.be>
Date: Fri, 12 Aug 2005 00:53:33 +0200 (CEST)
To: Michael Barton <michael.barton@asu.edu>
Cc: Grass Developers List <grass5@grass.itc.it>, <dcalvelo@minag.gob.pe>
Subject: Re: [GRASS5] Re: d.vect.thematic wish: handle null values

On Fri, August 12, 2005 0:13, Michael Barton said:

I'm not sure that nulls are the problem. It won't accept the <= operator.
Try straight > and < and see what happens. Also, what is FNR=1?

The operators are not defined by me but by d.vect.thematic.
The problem is not the <= operator, but the fact that in the sql statement

SELECT cat FROM ssbxl01 WHERE revmed >= 7 and revmed<=

there should be an integer after the <=.

This integer is missing because the awk script tries to divide by zero because
of the null values.

(FNR is an internal awk variable which stands for the "input record number in
the current input file")

I can confirm that it is the null values that are the problem since if I
replace them with any integer value (0, -99999, etc) d.vect.thematic runs
perfectly, but it creates the classes including the null-replacing integer
since it doesn't know that they are special values, and there is no way of
telling it this.

This is why I plead for one of the two solutions below...

Hope this helps clarifying the issue.

All the best,

Moritz

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Reply-To: <mlennert@club.worldonline.be>
Date: Thu, 11 Aug 2005 23:20:00 +0200 (CEST)
To: Grass Developers List <grass5@grass.itc.it>
Cc: <michael.barton@asu.edu>, <dcalvelo@minag.gob.pe>
Subject: d.vect.thematic wish: handle null values

Hi,

Trying to make maps with d.vect.thematic I have stumbled upon the problem
that
the module fails when there are null values in the table (postgresql):

**********
awk: (FILENAME=- FNR=1) fatal: division by zero attempted

Thematic map legend for column revmed of map ssbxl01

Value range: - 33079
Mapped by 7 intervals of

Color(R:G:B) Value
============ ==========
0:0:250 7 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed >= 7 and revmed<=
ERREUR: L'opérateur n'existe pas : integer <=
HINT: Aucun opérateur correspond au nom donné et aux types d'arguments.
Vous
devez ajouter des conversions explicites de type.

35:0:215 -
DBMI-Postgres driver error:
Cannot select:
SELECT cat FROM ssbxl01 WHERE revmed > and revmed<=
ERREUR: syntax error sur ou près de «and» at character 41

etc...
************

Now I can see two theoretical solutions to this:

1) handle null values
2) allow to give a value that represents null values (e.g. -99999)

For both of these solutions, d.vect.thematic should offer the choice of a
color for these values, so that they do not use the same color scheme as the
'valid' values.

Moritz

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

On Fri, August 12, 2005 1:49, Michael Barton said:

Thanks for the clarification Moritz. Is this in the custom breaks section?

No, this happens independently of the 'themecalc' choice.

This was written by Daniel Calvelo. I'm copying him in case he has any
ideas.

He already was in the cc-list, so he got your last message twice :wink:

Moritz

On Fri, August 12, 2005 11:27, Moritz Lennert said:

On Fri, August 12, 2005 1:49, Michael Barton said:

Thanks for the clarification Moritz. Is this in the custom breaks section?

No, this happens independently of the 'themecalc' choice.

Sorry, I lied: actually the custom breaks are the only ones that work...

All the others fail, with more or less the same error message...

Moritz

Moritz,

In that case, it's in v.univar.sh. I'm not sure how to treat the null value
issue. It's an awk thing. They should be excluded from calculations and not
treated as 0's

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Reply-To: <mlennert@club.worldonline.be>
Date: Fri, 12 Aug 2005 11:27:32 +0200 (CEST)
To: Michael Barton <michael.barton@asu.edu>
Cc: "daniel.calvelo@minag.gob.pe" <daniel.calvelo@minag.gob.pe>, Grass
Developers List <grass5@grass.itc.it>
Subject: Re: [GRASS5] Re: d.vect.thematic wish: handle null values

On Fri, August 12, 2005 1:49, Michael Barton said:

Thanks for the clarification Moritz. Is this in the custom breaks section?

No, this happens independently of the 'themecalc' choice.

This was written by Daniel Calvelo. I'm copying him in case he has any
ideas.

He already was in the cc-list, so he got your last message twice :wink:

Moritz

Then this is definitely in v.univar.sh

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Reply-To: <mlennert@club.worldonline.be>
Date: Fri, 12 Aug 2005 11:42:51 +0200 (CEST)
To: <mlennert@club.worldonline.be>
Cc: Michael Barton <michael.barton@asu.edu>, "daniel.calvelo@minag.gob.pe"
<daniel.calvelo@minag.gob.pe>, Grass Developers List <grass5@grass.itc.it>
Subject: Re: [GRASS5] Re: d.vect.thematic wish: handle null values

On Fri, August 12, 2005 11:27, Moritz Lennert said:

On Fri, August 12, 2005 1:49, Michael Barton said:

Thanks for the clarification Moritz. Is this in the custom breaks section?

No, this happens independently of the 'themecalc' choice.

Sorry, I lied: actually the custom breaks are the only ones that work...

All the others fail, with more or less the same error message...

Moritz

I think that this is the place in v.univar.sh that needs to be updated to
deal with null values

# calculate statistics
echo "Calculating statistics..."
cat $TMP | awk 'BEGIN {sum = 0.0 ; sum2 = 0.0}
NR == 1{min = $1 ; max = $1}
       {sum += $1 ; sum2 += $1 * $1 ; N++}
       {
        if ($1 > max) {max = $1}
        if ($1 < min) {min = $1}
       }
END{
print ""
print "Number of values:",N
print "Minimum:",min
print "Maximum:",max
print "Range:",max-min
print "-----"
print "Mean:",sum/N
print "Variance:",(sum2 - sum*sum/N)/N
print "Standard deviation:",sqrt((sum2 - sum*sum/N)/N)
print "Coefficient of variation:",(sqrt((sum2 -
sum*sum/N)/N))/(sqrt(sum*sum)/N)
print "-----"
}'

I think there needs to be the equivalent of

for $1<>""

in the beginning of this. Any awk expert know how to implement this?

Of course if v.univar were expanded to accomplish the same things as
v.univar.sh ...

Michael

On 8/12/05 2:42 AM, "Moritz Lennert" <mlennert@club.worldonline.be> wrote:

On Fri, August 12, 2005 11:27, Moritz Lennert said:

On Fri, August 12, 2005 1:49, Michael Barton said:

Thanks for the clarification Moritz. Is this in the custom breaks section?

No, this happens independently of the 'themecalc' choice.

Sorry, I lied: actually the custom breaks are the only ones that work...

All the others fail, with more or less the same error message...

Moritz

Hi all,

Sorry for jumping in so late, I've been traveling.

AFAIR the problem is actually within db.select's output. Moritz, could you
apply this patch to v.univar.sh and send the result (in /tmp/db.select.out)
back to me? When I was dealing with this problem I only used the DBF driver,
which gave zeros for nulls under unspecified cirsumstances.

I can fix the script to filter out nulls, but I'm pretty sure the problem will
remain under certain configurations.

-- Daniel Calvelo Aros

---------- Original Message -----------
From: Michael Barton <michael.barton@asu.edu>
To: mlennert@club.worldonline.be
Cc: "daniel.calvelo@minag.gob.pe" <daniel.calvelo@minag.gob.pe>, Grass
Developers List <grass5@grass.itc.it>
Sent: Sat, 13 Aug 2005 17:12:29 -0700
Subject: Re: [GRASS5] Re: d.vect.thematic wish: handle null values

I think that this is the place in v.univar.sh that needs to be
updated to deal with null values

# calculate statistics
echo "Calculating statistics..."
cat $TMP | awk 'BEGIN {sum = 0.0 ; sum2 = 0.0}
NR == 1{min = $1 ; max = $1}
       {sum += $1 ; sum2 += $1 * $1 ; N++}
       {
        if ($1 > max) {max = $1}
        if ($1 < min) {min = $1}
       }
END{
print ""
print "Number of values:",N
print "Minimum:",min
print "Maximum:",max
print "Range:",max-min
print "-----"
print "Mean:",sum/N
print "Variance:",(sum2 - sum*sum/N)/N
print "Standard deviation:",sqrt((sum2 - sum*sum/N)/N)
print "Coefficient of variation:",(sqrt((sum2 -
sum*sum/N)/N))/(sqrt(sum*sum)/N)
print "-----"
}'

I think there needs to be the equivalent of

for $1<>""

in the beginning of this. Any awk expert know how to implement this?

Of course if v.univar were expanded to accomplish the same things as
v.univar.sh ...

Michael

On 8/12/05 2:42 AM, "Moritz Lennert" <mlennert@club.worldonline.be> wrote:

> On Fri, August 12, 2005 11:27, Moritz Lennert said:
>> On Fri, August 12, 2005 1:49, Michael Barton said:
>>> Thanks for the clarification Moritz. Is this in the custom breaks section?
>>
>> No, this happens independently of the 'themecalc' choice.
>
> Sorry, I lied: actually the custom breaks are the only ones that work...
>
> All the others fail, with more or less the same error message...
>
> Moritz
>

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

------- End of Original Message -------

(attachments)

aa.patch (365 Bytes)

On Mon, August 15, 2005 20:50, Daniel Calvelo said:

Hi all,

Sorry for jumping in so late, I've been traveling.

AFAIR the problem is actually within db.select's output. Moritz, could you
apply this patch to v.univar.sh and send the result (in /tmp/db.select.out)
back to me?

done off-list.

When I was dealing with this problem I only used the DBF driver,
which gave zeros for nulls under unspecified cirsumstances.

Thinking about this a bit more this seems more like a bug to me. Making nulls
into zeroes gives very different results than if nulls are excluded from the
calculations.

At the user should be warned about this.
As I mentioned earlier, we have to think a about whether it is better to
silently exclude nulls, or whether we should explicitely give the opportunity
to exclude a certain value (or range or values) from the classification. This
would then allow those with nulls in their files to transform them to any
value (e.g. -99999) and exclude it from the calculations.

I can fix the script to filter out nulls, but I'm pretty sure the problem will
remain under certain configurations.

Maybe above solution is more generic. But in any case nulls should be handled
more gracefully...or at least a warning included in the man page...

Moritz

From: "Moritz Lennert" <mlennert@club.worldonline.be>
Sent: Sun, 28 Aug 2005 11:53:34 +0200 (CEST)

> When I was dealing with this problem I only used the DBF driver,
> which gave zeros for nulls under unspecified cirsumstances.

Thinking about this a bit more this seems more like a bug to me.
Making nulls into zeroes gives very different results than if nulls
are excluded from the calculations.

At the user should be warned about this.
As I mentioned earlier, we have to think a about whether it is
better to silently exclude nulls, or whether we should explicitely
give the opportunity to exclude a certain value (or range or values)
from the classification. This would then allow those with nulls in
their files to transform them to any value (e.g. -99999) and exclude
it from the calculations.

> I can fix the script to filter out nulls, but I'm pretty sure the
> problem will
> remain under certain configurations.

Maybe above solution is more generic. But in any case nulls should
be handled more gracefully...or at least a warning included in the
man page...

I fixed CVS to *ignore* nulls.

I'd say that is a good choice for use in d.vect.thematic. If you really really
want to use a symbol/color for NULL values, there is always the possibility of
overlaying d.vect.thematic with a d.vect where="the_field is null".

Furthermore, current db.execute allows for things like 'update table set
field=cat+23 where field is null', which allows to take care of any case I can
think of, either inserting new fields and marking them appropriately or
modifying existing fields in-place.

Daniel.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Daniel Calvelo Aros wrote:

From: "Moritz Lennert" <mlennert@club.worldonline.be>
Sent: Sun, 28 Aug 2005 11:53:34 +0200 (CEST)

When I was dealing with this problem I only used the DBF driver,
which gave zeros for nulls under unspecified cirsumstances.

Thinking about this a bit more this seems more like a bug to me.
Making nulls into zeroes gives very different results than if nulls
are excluded from the calculations.

At the user should be warned about this.
As I mentioned earlier, we have to think a about whether it is
better to silently exclude nulls, or whether we should explicitely
give the opportunity to exclude a certain value (or range or values)
from the classification. This would then allow those with nulls in
their files to transform them to any value (e.g. -99999) and exclude
it from the calculations.

I can fix the script to filter out nulls, but I'm pretty sure the
problem will
remain under certain configurations.

Maybe above solution is more generic. But in any case nulls should
be handled more gracefully...or at least a warning included in the
man page...

I fixed CVS to *ignore* nulls.

Great, now it works for me. Thanks !

I'd say that is a good choice for use in d.vect.thematic. If you really really
want to use a symbol/color for NULL values, there is always the possibility of
overlaying d.vect.thematic with a d.vect where="the_field is null".

Furthermore, current db.execute allows for things like 'update table set
field=cat+23 where field is null', which allows to take care of any case I can
think of, either inserting new fields and marking them appropriately or
modifying existing fields in-place.

I agree.

BTW: I just committed a slight addition to the legend drawing
routine for drawing black lines around the legend boxes. Just
thought it looks nicer that way...now if only these legends could be
exported in a more usable form...(will commit a wish for this just
for memory)

Moritz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDFF4srIrMbm76jD8RAstWAJ9MM6l3oLGxBfLYQNxtwUMw5Pj4agCeKPbO
lOXYOACxZdQwlciec5Z2rsY=
=nS1Z
-----END PGP SIGNATURE-----

Moritz,

This is great.

Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Moritz Lennert <mlennert@club.worldonline.be>
Date: Tue, 30 Aug 2005 15:25:00 +0200
To: Daniel Calvelo <daniel.calvelo@minag.gob.pe>
Cc: <grass5@grass.itc.it>, <michael.barton@asu.edu>
Subject: Re: [GRASS5] Re: d.vect.thematic wish: handle null values

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Daniel Calvelo Aros wrote:

From: "Moritz Lennert" <mlennert@club.worldonline.be>
Sent: Sun, 28 Aug 2005 11:53:34 +0200 (CEST)

When I was dealing with this problem I only used the DBF driver,
which gave zeros for nulls under unspecified cirsumstances.

Thinking about this a bit more this seems more like a bug to me.
Making nulls into zeroes gives very different results than if nulls
are excluded from the calculations.

At the user should be warned about this.
As I mentioned earlier, we have to think a about whether it is
better to silently exclude nulls, or whether we should explicitely
give the opportunity to exclude a certain value (or range or values)
from the classification. This would then allow those with nulls in
their files to transform them to any value (e.g. -99999) and exclude
it from the calculations.

I can fix the script to filter out nulls, but I'm pretty sure the
problem will
remain under certain configurations.

Maybe above solution is more generic. But in any case nulls should
be handled more gracefully...or at least a warning included in the
man page...

I fixed CVS to *ignore* nulls.

Great, now it works for me. Thanks !

I'd say that is a good choice for use in d.vect.thematic. If you really
really
want to use a symbol/color for NULL values, there is always the possibility
of
overlaying d.vect.thematic with a d.vect where="the_field is null".

Furthermore, current db.execute allows for things like 'update table set
field=cat+23 where field is null', which allows to take care of any case I
can
think of, either inserting new fields and marking them appropriately or
modifying existing fields in-place.

I agree.

BTW: I just committed a slight addition to the legend drawing
routine for drawing black lines around the legend boxes. Just
thought it looks nicer that way...now if only these legends could be
exported in a more usable form...(will commit a wish for this just
for memory)

Moritz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDFF4srIrMbm76jD8RAstWAJ9MM6l3oLGxBfLYQNxtwUMw5Pj4agCeKPbO
lOXYOACxZdQwlciec5Z2rsY=
=nS1Z
-----END PGP SIGNATURE-----