[GRASS5] sql expression weirdnesses

Hi all,

I'm baffled by the following:

<session>
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST < 500)" |wc #Ok.
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST/2+ALT_DIST/2 < 500)" |wc #Ooops.
    1829 1829 8033
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST*2 < 1000)" |wc #Ok
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST+ALT_DIST < 1000)" |wc #Still ok.
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST+ALT_DIST/2 < 750)" |wc #Still ok.
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST/2+ALT_DIST < 750)" |wc #Ooops. Non-commuting sum
    1829 1829 8033
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
((ALT_DIST/2)+ALT_DIST < 750)" |wc #Puzzled...
DBMI-DBF driver error:
SQL parser error in statement:
select cat from dist2002 where ((ALT_DIST/2)+ALT_DIST < 750)
Error in db_open_select_cursor()

       0 0 0
</session>

Is there a (hopefully documented) limitation in the sql parser that provokes
these things? Is it some NULL-related weirdness?

Background: I'm trying to implement expression themeing for d.vect.thematic. I
found that since expressions are allowed in where clauses, it is enough to
make v.univar.sh calculate correct statistics for expressions and then let the
sql parser deal with the corresponding expression. I'm stuck with behaviour as
exemplified above.

This is grass6 as in Lorenzo Moretti's distribution, unpatched, under MacOSX
10.3.9. FYI:

<session>
GRASS 6.0.0 (pe):~/t/thematic/d.vect.thematic > db.connect -p
driver:dbf
database:$GISDBASE/$LOCATION_NAME/$MAPSET/dbf/
schema:(null)
group:(null)
GRASS 6.0.0 (pe):~/t/thematic/d.vect.thematic > v.info -c dist2002
Displaying column type for database connection of layer 1:
INTEGER|CAT
CHARACTER|CODIGO_DEP
CHARACTER|NOMB_DEP
CHARACTER|CODIGO_PRO
CHARACTER|NOMB_PROV
CHARACTER|CODIGO_DIS
CHARACTER|NOMB_DIST
CHARACTER|DCTO
CHARACTER|LEY
CHARACTER|FECHA
CHARACTER|NOM_CAP
CHARACTER|CAT_CAP
INTEGER|ALT_DIST
CHARACTER|LAT_DIST
CHARACTER|LONG_DIST
CHARACTER|REG_NAT
CHARACTER|REGNAT2
INTEGER|UG_N
CHARACTER|grassrgb
</session>

Any hints?

Cheers,

Daniel.

-- Daniel Calvelo Aros

Daniel Calvelo Aros wrote:

I'm baffled by the following:

<session>
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST < 500)" |wc #Ok.
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST/2+ALT_DIST/2 < 500)" |wc #Ooops.
    1829 1829 8033
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST*2 < 1000)" |wc #Ok
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST+ALT_DIST < 1000)" |wc #Still ok.
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST+ALT_DIST/2 < 750)" |wc #Still ok.
     421 421 1809
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
(ALT_DIST/2+ALT_DIST < 750)" |wc #Ooops. Non-commuting sum
    1829 1829 8033
GRASS 6.0.0 (pe):~/t/ > db.select dist2002 sql="select cat from dist2002 where
((ALT_DIST/2)+ALT_DIST < 750)" |wc #Puzzled...
DBMI-DBF driver error:
SQL parser error in statement:
select cat from dist2002 where ((ALT_DIST/2)+ALT_DIST < 750)
Error in db_open_select_cursor()

       0 0 0
</session>

Is there a (hopefully documented) limitation in the sql parser that provokes
these things? Is it some NULL-related weirdness?

The SQL parser doesn't implement the normal precedence rules for
arithmetic operators, so e.g.:

  (ALT_DIST/2+ALT_DIST/2 < 500)

parses as:

  ((((ALT_DIST/2)+ALT_DIST)/2) < 500)

Also, it doesn't allow parentheses within arithmetic expressions, only
logical (boolean) expressions.

I don't have a copy of the SQL standard, but I'm pretty sure that both
of those are bugs.

At a glance, I would guess that the y_expression rule at the bottom of
lib/db/sqlp/yac.y should be split into something like:

y_expression:
    y_product { $$ = $1; }
  | y_expression '+' y_product {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("+"), $1, $3 );
    }
  | y_expression '-' y_product {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("-"), $1, $3 );
    }
  ;

y_product:
    y_atom { $$ = $1; }
  | y_expression '*' y_atom {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("*"), $1, $3 );
    }
  | y_expression '/' y_atom {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("/"), $1, $3 );
    }
  ;

y_atom:
    y_value { $$ = $1; }
  | y_column { $$ = $1; }
  | '(' y_expression ')' { $$ = $2; }
  ;

Also the ARITHMETICAL_OPERATOR token would need to be removed from
lex.l and yac.y, as its current definition makes it impossible to
treat / differently to + and -.

This gives * and / precedence of over + and -, forces
left-associativity, and allows parenthesised expressions.

--
Glynn Clements <glynn@gclements.plus.com>

From: Glynn Clements <glynn@gclements.plus.com>
Sent: Tue, 7 Jun 2005 04:39:41 +0100

The SQL parser doesn't implement the normal precedence rules for
arithmetic operators, so e.g.:

  (ALT_DIST/2+ALT_DIST/2 < 500)

parses as:

  ((((ALT_DIST/2)+ALT_DIST)/2) < 500)

Also, it doesn't allow parentheses within arithmetic expressions,
only logical (boolean) expressions.

So the set of acceptable expressions is very stringent. I'd take a wild guess
that infix left-associative with no parenthesis is less expressive that
"normal" infix: how would you rewrite a/(b+c)?

I don't have a copy of the SQL standard, but I'm pretty sure that
both of those are bugs.

Same here. SQL expressions of this kind are meant to be arithmetically intuitive.

At a glance, I would guess that the y_expression rule at the bottom
of lib/db/sqlp/yac.y should be split into something like:

y_expression:
    y_product { $$ = $1; }
  | y_expression '+' y_product {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("+"), $1, $3 );
    }
  | y_expression '-' y_product {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("-"), $1, $3 );
    }
  ;

y_product:
    y_atom { $$ = $1; }
  | y_expression '*' y_atom {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("*"), $1, $3 );
    }
  | y_expression '/' y_atom {
        $$ = sqpNewExpressionNode ( sqpOperatorCode("/"), $1, $3 );
    }
  ;

y_atom:
    y_value { $$ = $1; }
  | y_column { $$ = $1; }
  | '(' y_expression ')' { $$ = $2; }
  ;

Also the ARITHMETICAL_OPERATOR token would need to be removed from
lex.l and yac.y, as its current definition makes it impossible to
treat / differently to + and -.

This gives * and / precedence of over + and -, forces
left-associativity, and allows parenthesised expressions.

Thanks for looking into it. I feared it would be a parser bug. IMHO a serious
one... Well, time to dig back into yacc. Anybody feeling like doing it in my
not-fully-acquainted-with-dbmi place?

Daniel.

Daniel Calvelo Aros wrote:

> The SQL parser doesn't implement the normal precedence rules for
> arithmetic operators, so e.g.:

[snip]

Thanks for looking into it. I feared it would be a parser bug. IMHO a serious
one... Well, time to dig back into yacc. Anybody feeling like doing it in my
not-fully-acquainted-with-dbmi place?

Try the attached patch. It compiles OK (on Linux), but I haven't
tested it.

--
Glynn Clements <glynn@gclements.plus.com>

(attachments)

sqlp.diff (2.28 KB)