Hamish wrote:
this text is in a sprintf() statement & the string goes into a command
which is eventually fed through:
echo "$command" | wish &
That's the real bug; the rest are just symptoms.
The way that G_gui() currently works is absolutely crazy. For two main
reasons:
1. The actual text in parser.c is first being interpreted according to
the syntax of C string literals, then the result is being interpreted
according to the syntax of /bin/sh (due to the use of system()), then
that gets interpreted according to the syntax of "echo" (at least
that's only an issue for backslashes), then the ultimate result is
finally interpreted as Tcl code.
2. It ends up calling system() with a potentially massive argument (it
allocates 100Kb). Many Unices have a limit to the length of the
command line, and it's typically less than 100Kb; 4Kb is typical.
If you're planning on doing more work on this, you could probably save
yourself (and anyone else who works on it) some work by:
1. Eliminating the use of the shell and "echo", by using popen()
instead of system(). Then, whatever you write to the FILE* is sent
directly to wish's stdin, without any processing. Apart from avoiding
the processing, it also means that you can send it a bit at a time
with fprintf/fputs; you don't need to buffer the whole thing in
memory.
2. Moving all of the boilerplate into a prologue which defines
procedures. Then, G_gui() just needs to send e.g.
"source $env(GISBASE)/etc/prolog.tcl" at the start, and you only need
to generate the bits which will actually change between calls. Also,
the bulk text which is moved into the prologue wouldn't need to be
encoded as C string literals.
A couple of other points about that code:
1. In Tcl, everything is a string. You don't need to quote strings
unless they contain spaces, so e.g.:
append(cmd, "set opttype(%d) \"multi\" \n", optn);
can be written as just:
append(cmd, "set opttype(%d) multi \n", optn);
2. This sort of thing is asking for trouble:
append(cmd, " \"%s\" ", s );
If the value of s contains any Tcl metacharacters, the end result
probably isn't going to be what was desired. [With the current
implementation, the same applies to shell or echo metacharacters.]
It would be (slightly) more reliable to use braces, i.e.
append(cmd, " {%s} ", s );
However, that fails if the value of s itself contains braces, and
there isn't any way to include a literal left or right brace within
braces (backslash-brace will prevent the brace from being interpreted
for the purposes of nesting, but the backslash will be retained).
The only reliable solution is to use double quotes and to precede any
metacharacter (double quote, left bracket, dollar or backslash) with a
backslash. Which means you can't just use fprintf("... %s ..."); you
need a function which will scan the string and insert backslashes as
required.
References:
"Tcl quoting hell" http://wiki.tcl.tk/1726
Hehe.
Although you should really provide "bash quoting hell" and "echo
quoting hell" references if system("echo ...") is going to be used.
Actually, this kind of problem is inherent with any form of "in-band
signalling" mechanism, of which string literals (in most languages),
printf(), /bin/sh and Tcl are all examples. It's particularly
problematic when it forms the basis of an entire language (as is the
case for most "scripting" languages).
And it's bound to be problematic when you combine four different
in-band signalling mechanisms together (i.e. C string literals,
vsprintf, /bin/sh and Tcl).
[Aside: if you read BugTraq long enough, you will eventually realise
that 95% of security vulnerabilities occur for one of two reasons. One
is buffer overflows. The other is in-band signalling, i.e. where you
end up inserting user-supplied signalling codes when you thought that
you were just inserting data.]
--
Glynn Clements <glynn.clements@virgin.net>