[GRASS-dev] using G_parser() --script in the wild

Hi,

WRT d.path bug #302, d.path from GIS.m runs without a backdrop map.
http://wald.intevation.org/tracker/index.php?func=detail&aid=302

After coding a solution in the d.path C code using D_get_dig_list() to
check if the target map was already drawn, and if not draw it by closing
the driver, running G_system(d.vect), and then opening the driver again,
I decided that that solution was fragile and not so good.

So I decided to to it with a wrapper shell script instead using the new
--script parser option. (see attached)

Problem:
the template script calls g.parser (which resets "$@" command line
options to "@ARGS_PARSED@") before I can get my hands on it, and I'd
rather not have to code all the command line options by hand.

Can we add a line in G_script() before the g.parser call like:

GRASS_CL_OPTS="$@"

if [ "$1" != "@ARGS_PARSED@" ] ; then
  exec g.parser "$0" "$@"
fi

?

Portability: currently I have it creating a temp file and then making
it executable with chmod. Would it be more portable to add a
permissions=777 option to g.tempfile to have this done during creation?
Is that in any way translateable to native Windows? if it will only work
with unix, may as well stick to chmod? (a -d create directory flag would
be nice too)

thanks,
Hamish

(attachments)

gen_d.path_wrapper.sh (634 Bytes)

Hamish wrote:

WRT d.path bug #302, d.path from GIS.m runs without a backdrop map.
http://wald.intevation.org/tracker/index.php?func=detail&aid=302

After coding a solution in the d.path C code using D_get_dig_list() to
check if the target map was already drawn, and if not draw it by closing
the driver, running G_system(d.vect), and then opening the driver again,
I decided that that solution was fragile and not so good.

So I decided to to it with a wrapper shell script instead using the new
--script parser option. (see attached)

Problem:
the template script calls g.parser (which resets "$@" command line
options to "@ARGS_PARSED@") before I can get my hands on it, and I'd
rather not have to code all the command line options by hand.

Can we add a line in G_script() before the g.parser call like:

GRASS_CL_OPTS="$@"

if [ "$1" != "@ARGS_PARSED@" ] ; then
  exec g.parser "$0" "$@"
fi

If you want such a line, add it to the resulting script manually. The
--script option is meant to create an initial "draft", not a complete
script.

Note that the assignment would have to be performed when the script is
initially run, not when it is invoked by g.parser (otherwise
$GRASS_CL_OPTS would also be equal to @ARGS_PARSED@). It would have to
be exported, so that its value is present when the script is invoked
by g.parser.

Also, note that such a solution is fragile; it won't work if any of
the arguments contain spaces (or possibly other shell metacharacters).
Essentially, there's no way to mimic the behaviour of "$@". If you
enclose a variable substitution in quotes, it will be treated as a
single word; if you don't use quotes, it will be split at each space
character. There's no way to make a copy of the argument list such
that you can split it back into words at the original boundaries, in
the way that "$@" works.

Ultimately, if you want to pass a list of strings between processes,
you have to find some way to encode that list as a single string, and
to decode it back into a list.

Portability: currently I have it creating a temp file and then making
it executable with chmod.

Ah. So you're writing a script which uses --script to create another
script which it then executes? That isn't the intended purpose of
--script. And it's potentially fragile; e.g. will it work if the
script is on a partition mounted with the "noexec" option (which may
well be the case for any partition to which you have write
permission)?.

I suggest simply writing a d.path.sh script which runs d.vect followed
by d.path, using the output from "d.path --script" as a template.

Writing the d.path command isn't *that* much work:

  if [ "$GIS_FLAG_G" = "1" ] ; then
      gflag=-g
  else
      gflag=
  fi

  if [ "$GIS_OPT_AFCOL" = "" ] ; then
      afcol=
  else
      afcol="afcol=$GIS_OPT_AFCOL"
  fi

  if [ "$GIS_OPT_ABCOL" = "" ] ; then
      abcol=
  else
      abcol="abcol=$GIS_OPT_ABCOL"
  fi

  if [ "$GIS_OPT_NCOL" = "" ] ; then
      ncol=
  else
      ncol="ncol=$GIS_OPT_NCOL"
  fi

  exec d.path $gflag $afcol $abcol $ncol \
    "map=$GIS_OPT_MAP" \
    "type=$GIS_OPT_TYPE" \
    "alayer=$GIS_OPT_ALAYER" \
    "nlayer=$GIS_OPT_NLAYER" \
    "color=$GIS_OPT_COLOR" \
    "hcolor=$GIS_OPT_HCOLOR" \
    "bgcolor=$GIS_OPT_BGCOLOR"

Hmm. I'm wondering whether it would be feasible to make G_parser() set
opt->answer = NULL if the value is an empty string (although that's
problematic if any module allows an empty string to be used where it
isn't the default value). Or provide some other mechanism for passing
NULL as an option value.

Most modules could be written to treat an empty string the same as
null, but re-writing modules involves a lot of work, and having to
remember to catch this case is error-prone.

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:
> WRT d.path bug #302, d.path from GIS.m runs without a backdrop map.
> http://wald.intevation.org/tracker/index.php?func=detail&aid=302

..

> So I decided to to it with a wrapper shell script instead using the
> new --script parser option. (see attached)
>
> Problem:
> the template script calls g.parser (which resets "$@" command line
> options to "@ARGS_PARSED@") before I can get my hands on it, and I'd
> rather not have to code all the command line options by hand.
>
> Can we add a line in G_script() before the g.parser call like:

[updated patch]
Index: parser.c

RCS file: /home/grass/grassrepository/grass6/lib/gis/parser.c,v
retrieving revision 1.103
diff -u -r1.103 parser.c
--- parser.c 27 Feb 2007 02:36:46 -0000 1.103
+++ parser.c 28 Feb 2007 05:39:50 -0000
@@ -1630,6 +1630,8 @@
                "fi\n"
                "\n"
                "if [ \"$1\" != \"@ARGS_PARSED@\" ] ; then\n"
+ " GRASS_CL_OPTS=\"$@\"\n"
+ " export GRASS_CL_OPTS\n"
                " exec g.parser \"$0\" \"$@\"\n"
                "fi\n"
                "\n"

It would also be nice to automatically add a comment line after
#!/bin/sh
like:
# script automatically generated from $MODULE at $DATE by $USER

updated auto-gen script using that attached, but due to other parser
path issues won't run from the GUI (see below). It runs from the command
line if you supply a map name and comment out "--ui" on the line that
executes the $TMP file.

Glynn wrote:

If you want such a line, add it to the resulting script manually. The
--script option is meant to create an initial "draft", not a complete
script.

That won't work for automatic script creation (without awk/sed tricks).
It has to come before the "exec g.parser" which is by laid down by
--script. (see above patch)

Also, note that such a solution is fragile; it won't work if any of
the arguments contain spaces (or possibly other shell metacharacters).

Ok. For d.path that's not an issue.

Essentially, there's no way to mimic the behaviour of "$@". If you
enclose a variable substitution in quotes, it will be treated as a
single word; if you don't use quotes, it will be split at each space
character. There's no way to make a copy of the argument list such
that you can split it back into words at the original boundaries, in
the way that "$@" works.

What if the g.parser module looped over argv and then putenv'd the
resulting string as GRASS_CL_OPTS? (always)

Ultimately, if you want to pass a list of strings between processes,
you have to find some way to encode that list as a single string, and
to decode it back into a list.

> Portability: currently I have it creating a temp file and then
> making it executable with chmod.

Ah. So you're writing a script which uses --script to create another
script which it then executes?

Yes.

That isn't the intended purpose of

I get a lot of that.

--script. And it's potentially fragile; e.g. will it work if the
script is on a partition mounted with the "noexec" option (which may
well be the case for any partition to which you have write
permission)?.

Admin rules denying users the right to even run their own shell scripts
is a bit harsh. (but considering the [Windows] IT sys admins I know
maybe not too far fetched) I guess $GISDBASE may be outside of $HOME..

But it doesn't work anyway, or at least from the GUI menu. When you hit
"Run" in the GUI [auto-generated script], it strips off the basename
path (which is $MAPSET/.tmp/$HOSTNAME/), and as that isn't in the PATH
it can't find the script, and you just see a big red "X". I guess I
could have it add $TMP to the $PATH for the duration of the module, but
that sounds like a security hole waiting to happen, and doesn't fix your
noexec mount rules case.

And this is a more general problem that affects ALL user scripts -- user
scripts must be somewhere in the PATH to Run from tcl GUI mode! They
work
ok from the command line regardless of the script's path.

I suggest simply writing a d.path.sh script which runs d.vect followed
by d.path, using the output from "d.path --script" as a template.

Probably what we'll have to end up doing, with the small gen script
copied into the CVS log (or script comments) for future reference.

Writing the d.path command isn't *that* much work:

..

  exec d.path $gflag $afcol $abcol $ncol \
    "map=$GIS_OPT_MAP" \
    "type=$GIS_OPT_TYPE" \
    "alayer=$GIS_OPT_ALAYER" \
    "nlayer=$GIS_OPT_NLAYER" \
    "color=$GIS_OPT_COLOR" \
    "hcolor=$GIS_OPT_HCOLOR" \
    "bgcolor=$GIS_OPT_BGCOLOR"

Yeah, but I was hoping to automate it to avoid future maintenence
problems like the sorry state that the GRASS 5 tcltkgrass modules and
man pages were in before we moved to auto-generated module GUIs and help
page headers.

Hmm. I'm wondering whether it would be feasible to make G_parser() set
opt->answer = NULL if the value is an empty string (although that's
problematic if any module allows an empty string to be used where it
isn't the default value). Or provide some other mechanism for passing
NULL as an option value.

Most modules could be written to treat an empty string the same as
null, but re-writing modules involves a lot of work, and having to
remember to catch this case is error-prone.

G_define_option() already inits opt->answer=NULL. Is option="" a
change to that? I seem to distantly remember adding a test for that to
some module where option="" was causing a segfault.

But I don't think it is needed here, just build up an options string.
If $opt1 is "", then the exec command just has an extra space in it.

for example:

optstring=""
if [ "$GIS_FLAG_G" -eq 1 ] ; then
  optstring="$optstring -g"
fi
if [ -n "$GIS_OPT_MAP" ] ; then
  optstring="$optstring map=\"$GIS_OPT_MAP\""
fi
exec d.path $optstring

or if you think optstring="$optstring ..." is evil,

if [ "$GIS_FLAG_G" -eq 1 ] ; then
    gflag="-g"
else
    gflag=""
fi
if [ -n "$GIS_OPT_AFCOL" ] ; then
    afcol="afcol=\"$GIS_OPT_AFCOL\""
else
    afcol=""
fi
[...]
exec d.path $gflag $afcol [...]

Hamish

(attachments)

gen_d.path_wrapper.sh (648 Bytes)

Hamish wrote:

> Essentially, there's no way to mimic the behaviour of "$@". If you
> enclose a variable substitution in quotes, it will be treated as a
> single word; if you don't use quotes, it will be split at each space
> character. There's no way to make a copy of the argument list such
> that you can split it back into words at the original boundaries, in
> the way that "$@" works.

What if the g.parser module looped over argv and then putenv'd the
resulting string as GRASS_CL_OPTS? (always)

That would be ugly, but it would work. Each argv[i] would need to be
enclosed in single quotes, with each embedded single quote replaced
with '\'' (quote, backslash, quote, quote).

> That isn't the intended purpose of

I get a lot of that.

> --script. And it's potentially fragile; e.g. will it work if the
> script is on a partition mounted with the "noexec" option (which may
> well be the case for any partition to which you have write
> permission)?.

Admin rules denying users the right to even run their own shell scripts
is a bit harsh. (but considering the [Windows] IT sys admins I know
maybe not too far fetched) I guess $GISDBASE may be outside of $HOME..

For scripts, you can run "sh script.sh" (unless sh has been patched to
prohibit execution of scripts residing on noexec filesystems; some
distributions do this), or even "sh < script.sh" (but then you can't
pass arguments).

It's more likely that problems would occur due to the specific
directory containing the script being on a noexec filesystem than a
general security policy; e.g. NFS shares are frequently mounted noexec
because the same share may be used on multiple architectures, so
there's no guarantee that executables stored on the NFS share will
actually work on the client.

But it doesn't work anyway, or at least from the GUI menu. When you hit
"Run" in the GUI [auto-generated script], it strips off the basename
path (which is $MAPSET/.tmp/$HOSTNAME/), and as that isn't in the PATH
it can't find the script, and you just see a big red "X". I guess I
could have it add $TMP to the $PATH for the duration of the module, but
that sounds like a security hole waiting to happen, and doesn't fix your
noexec mount rules case.

And this is a more general problem that affects ALL user scripts -- user
scripts must be somewhere in the PATH to Run from tcl GUI mode! They
work
ok from the command line regardless of the script's path.

There's no guarantee that $0 is the full path to the script. When a
process calls execve(), it can pass whatever it likes for argv[0];
it's just convention that it's the name of the executable, where
"name" can either be the base name or the full path, or something else
altogether.

> Hmm. I'm wondering whether it would be feasible to make G_parser() set
> opt->answer = NULL if the value is an empty string (although that's
> problematic if any module allows an empty string to be used where it
> isn't the default value). Or provide some other mechanism for passing
> NULL as an option value.
>
> Most modules could be written to treat an empty string the same as
> null, but re-writing modules involves a lot of work, and having to
> remember to catch this case is error-prone.

G_define_option() already inits opt->answer=NULL. Is option="" a
change to that? I seem to distantly remember adding a test for that to
some module where option="" was causing a segfault.

NULL and "" aren't the same thing. Modules which test whether an
optional (non-required) option was used normally test whether
"opt->answer == NULL". The only way to create this situation is to not
specify the option in the command line. You can't pass NULL as an
option value, and passing an empty value (i.e. "opt=") will result in
opt->answer not being equal to NULL.

But I don't think it is needed here, just build up an options string.
If $opt1 is "", then the exec command just has an extra space in it.

for example:

optstring=""
if [ "$GIS_FLAG_G" -eq 1 ] ; then
  optstring="$optstring -g"
fi
if [ -n "$GIS_OPT_MAP" ] ; then
  optstring="$optstring map=\"$GIS_OPT_MAP\""
fi
exec d.path $optstring

or if you think optstring="$optstring ..." is evil,

It won't work with values which may contain spaces. If you wrap the
value in single quotes, it won't work with values which may contain
single quotes.

if [ "$GIS_FLAG_G" -eq 1 ] ; then
    gflag="-g"
else
    gflag=""
fi
if [ -n "$GIS_OPT_AFCOL" ] ; then
    afcol="afcol=\"$GIS_OPT_AFCOL\""
else
    afcol=""
fi
[...]
exec d.path $gflag $afcol [...]

That works; it's just ugly. The fact that afcol, abcol and ncol need
special treatment stands out, and lead me to consider whether there's
a better solution.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

NULL and "" aren't the same thing. Modules which test whether an
optional (non-required) option was used normally test whether
"opt->answer == NULL". The only way to create this situation is to not
specify the option in the command line. You can't pass NULL as an
option value, and passing an empty value (i.e. "opt=") will result in
opt->answer not being equal to NULL.

what are they equal to then?

as all opt->answer are stored as strings regardless of TYPE_* setting,
and C strings are null terminated, a string with nothing in it (opt="")
becomes opt->answer[0]=='\0', right?

Oh, I see now, it's a pointer and the mem address is no longer NULL even
though the content found at that memory address is.

slowly catching on,
Hamish

[wrapper script for d.path from the GUI with backdrop]

...

> build up an options string. If $opt1 is "", then the exec command
> just has an extra space in it.
>
> for example:
>
> optstring=""
> if [ "$GIS_FLAG_G" -eq 1 ] ; then
> optstring="$optstring -g"
> fi
> if [ -n "$GIS_OPT_MAP" ] ; then
> optstring="$optstring map=\"$GIS_OPT_MAP\""
> fi
> exec d.path $optstring
>
>
> or if you think optstring="$optstring ..." is evil,

It won't work with values which may contain spaces. If you wrap the
value in single quotes, it won't work with values which may contain
single quotes.

> if [ "$GIS_FLAG_G" -eq 1 ] ; then
> gflag="-g"
> else
> gflag=""
> fi
> if [ -n "$GIS_OPT_AFCOL" ] ; then
> afcol="afcol=\"$GIS_OPT_AFCOL\""
> else
> afcol=""
> fi
> [...]
> exec d.path $gflag $afcol [...]

That works; it's just ugly. The fact that afcol, abcol and ncol need
special treatment stands out, and lead me to consider whether there's
a better solution.

why do the column name options need special treatment?

none of d.path's other options should contain spaces or special chars,
so I don't think the quoting is critical.

If I quote the column options on the exec line I get this error:

if [ -n "$GIS_OPT_AFCOL" ] ; then
    opt_afcol="afcol=$GIS_OPT_AFCOL"
else
    opt_afcol=""
fi

[...]

exec d.path $flag_g $flag_b \
  "$opt_map" "$opt_type" \
  "$opt_alayer" "$opt_nlayer" \
  "$opt_afcol" "$opt_abcol" "$opt_ncol" \
  "$opt_color" "$opt_hcolor" "$opt_bgcolor"

"Sorry <> is not a valid option", followed by G_usage().

presumably because "" makes argc++, but the last few argv are empty, and
G_parser() loops through argc.

similar:
# invalid option error
d.erase grey \
""

# no error
d.erase grey\
""

anyway, any problems with the attached script?
will column names require quoting? (e.g. column="Jane's Data")
or is that illegal SQL?

Hamish

(attachments)

d.path_wrapper.sh (4.37 KB)

Hamish wrote:

> > if [ "$GIS_FLAG_G" -eq 1 ] ; then
> > gflag="-g"
> > else
> > gflag=""
> > fi
> > if [ -n "$GIS_OPT_AFCOL" ] ; then
> > afcol="afcol=\"$GIS_OPT_AFCOL\""
> > else
> > afcol=""
> > fi
> > [...]
> > exec d.path $gflag $afcol [...]
>
> That works; it's just ugly. The fact that afcol, abcol and ncol need
> special treatment stands out, and lead me to consider whether there's
> a better solution.

why do the column name options need special treatment?

Because their default values are NULL pointers, which cannot be
specified via the command line.

For all of the other cases, we can specify the option along with its
default value.

none of d.path's other options should contain spaces or special chars,
so I don't think the quoting is critical.

Oh, we could treat all of the options the same as afcol= etc, but
that's more code, and unnecessary.

If I quote the column options on the exec line I get this error:

if [ -n "$GIS_OPT_AFCOL" ] ; then
    opt_afcol="afcol=$GIS_OPT_AFCOL"
else
    opt_afcol=""
fi

[...]

exec d.path $flag_g $flag_b \
  "$opt_map" "$opt_type" \
  "$opt_alayer" "$opt_nlayer" \
  "$opt_afcol" "$opt_abcol" "$opt_ncol" \
  "$opt_color" "$opt_hcolor" "$opt_bgcolor"

"Sorry <> is not a valid option", followed by G_usage().

A pointer to an empty string (i.e. with NUL as the first byte) isn't
the same as a NULL pointer, and an empty string isn't a valid value
for the "afcol=" option.

The other options are either required or have a default value which
isn't a NULL pointer. E.g. type= defaults to "line,boundary"; if the
type= option isn't passed to the script, then:

  d.vect ... type=$GIS_OPT_TYPE

has the same effect as simply not passing the type= option; both will
result in type->answer pointing to "line,boundary".

However, for options whose default value is a NULL pointer, using

  d.vect ... afcol= ...

isn't the same as omitting it. Omitting it will result in
afcol->answer being NULL, while "afcol=" with an empty value will
result in afcol->answer pointing to an empty string.

anyway, any problems with the attached script?

The changes will work, but aren't necessary; they just increase the
size of the script without changing its behaviour.

will column names require quoting? (e.g. column="Jane's Data")
or is that illegal SQL?

SQL column names are limited to alphanumerics and underscore.

--
Glynn Clements <glynn@gclements.plus.com>