[GRASS5] I18N for GRASS: continued discussion

(attachments)

grass5.0_code_internationalization_status.txt (4.59 KB)

Alex Shevlakov wrote:

1. Hypothetically, I18N shall break GRASS on some systems where
dgettext is in libintl.

---------------------------

There is no evidence known to me when it broke anything. As this was
considered the major problem, I think we can learn and list what
systems will break, and why.

If we can list such systems, then we must place warnings to those
systems users who want NLS support, to be aware of '--with-nls' option
expected break on such systems.

We should fix the build process so that I18N works even on systems
where dgettext is in a separate library. This is the "normal"
situation; GNU libc is the exception.

The problem wasn't the I18N generally, but in the consequences of
internationalising libraries.

When individual modules were internationalised, the Gmakefile for that
module had $(INTLLIB) added to the link command, so that dgettext()
would be available.

Internationalising libgis would require that every module which linked
against libgis (i.e. every module) depended upon dgettext(). Rather
than adding $(INTLLIB) to every Gmakefile, the obvious solution is to
simply change src/CMD/generic/make.mid to use:

  GISLIB = -lgis $(INTLLIB)

This is simple enough, and I'm 99.9% sure that it's safe. However,
it's that last 0.1% that concerned me. If there turned out to be a
problem, it would have broken *everything*. Most users can get by
without the *.pg programs; they can't get by without libgis. Ditto for
the GRASS startup and, to a lesser extent, tcltkgrass.

The situation regarding libgis, startup, and tcltkgrass was "probably"
safe, so GRASS would "probably" have worked OK. I'm just allergic to
changes which "probably" won't completely break GRASS being made weeks
before the long-awaited release.

The real problem was that the change was only made at a point when the
5.0.0 release was looking imminent. If it had been done six months
earlier, I wouldn't have been so concerned.

Unfortunately, we don't have a systematic test program. The only
platform which gets regular testing is Linux/x86. AFAICT, occasional
testing occurs for Cygwin, MacOS X, Solaris and Irix. A whole batch of
Mac OS X problems were only discovered once 5.0.0 was released.

For comparison, the Linux kernel is about to go into feature freeze
(prompting a massive increase in the rate of patch submissions).
However, the actual 2.6 release isn't expected for another six months
yet. The Linux kernel has a hell of a lot more developers and testers
than does GRASS, and some of those are large corporations with rooms
full of systems for running test suites.

2. The I18N code modifications done to libraries make them system-wide
dangerous.

---------------------------

There is no difference in doing actual string replacement by gettext
_(macro) inside the code of both modules and libraries; the I18N is
done in the same way.

Therefore there is no difference in: a) being relatively safe with
modules internationalization and in b) being put in risk with
libraries internationalization, whatever risk one could foresee.

But what about the risks which one doesn't foresee? The most dangerous
assumptions are the ones which you don't know that you're making.

All
modules use libraries, but this should not mean that all modules will
break, nothing would break if I18N was done in duly way.

Correct. But what if it *wasn't* done quite right? What if there was
some unforeseen consequence of the libgis I18N?

From the bug reports which arose immediately after the release, we
know that none of the recent versions had been tested on the Mac (I'm
fairly sure that the "libedit" change pre-dated the libgis I18N, and
the problems with that weren't found until after the release).

3. Tcltkgrass and intro messages: locale is set up during install and for good,
which is not correct because it should be determined and chosen on startup
of GRASS.
---------------------------

This is right but misleading. Neither tcltkgrass messages nor
grass_into and license messages are defined on install. Actually, as
it follows from the [locale/Gmakefile], all possible locales are
installed into $(INST_DIR)/locale, including tcltkgrass messages and
text files, 'only if' GRASS was configured with NLS support.

Yes; I changed it; It originally used the value of $LANG at compile
time.

However, tcltkgrass is still only consulting LANG; for consistency
with C programs which use dgettext(), it should be checking:

  LC_ALL
  LC_MESSAGES
  LANG

in that order. Init.sh consults all three, but has the ordering wrong.

4. More testing is needed to find out other dangers from I18N looming,
before such code can be included into any stable release.

---------------------------

No other testing except the developer's would be possible if you do
not include the subjects to be tested into the release.

Well, some people did actually test the various -beta and -pre
releases, and even the bleeding-edge CVS versions. It just didn't
happen enough.

Now that 5.0.0 has finally been released, I'm less concerned about the
remote possibility of any I18N-related problems. However, I would
still prefer to see 5.0.1 being a bugfix-only release, with the
critical I18N changes (libgis, startup, tcltkgrass) saved for 5.0.2.

That needn't mean a long wait. Personally, I would have liked to have
seen 5.0.1 released as soon as the initial batch of MacOSX/gcc-3.x bug
reports had been dealt with, but Bernhard didn't seem too keen on
that.

--
Glynn Clements <glynn.clements@virgin.net>

Alex Shevlakov wrote:

> 1. Hypothetically, I18N shall break GRASS on some systems where
> dgettext is in libintl.
>
> ---------------------------
>
> There is no evidence known to me when it broke anything. As this was
> considered the major problem, I think we can learn and list what
> systems will break, and why.
>
> If we can list such systems, then we must place warnings to those
> systems users who want NLS support, to be aware of '--with-nls' option
> expected break on such systems.

We should fix the build process so that I18N works even on systems
where dgettext is in a separate library. This is the "normal"
situation; GNU libc is the exception.

The problem wasn't the I18N generally, but in the consequences of
internationalising libraries.

When individual modules were internationalised, the Gmakefile for that
module had $(INTLLIB) added to the link command, so that dgettext()
would be available.

Internationalising libgis would require that every module which linked
against libgis (i.e. every module) depended upon dgettext(). Rather
than adding $(INTLLIB) to every Gmakefile, the obvious solution is to
simply change src/CMD/generic/make.mid to use:

  GISLIB = -lgis $(INTLLIB)

This should work fine. Without NLS option, I don't see how this could
affect anything. If variable $(INTLLIB) was empty and someone requested
NLS support, it would be impossible to compile if dgettext was not in glibc,
but that's it. Vice versa, $(INTLLIB) non-empty and no NLS support enabled,
it would not affect anything as there wouldn't be calls to the library.

This is simple enough, and I'm 99.9% sure that it's safe. However,
it's that last 0.1% that concerned me. If there turned out to be a
problem, it would have broken *everything*. Most users can get by
without the *.pg programs; they can't get by without libgis. Ditto for
the GRASS startup and, to a lesser extent, tcltkgrass.

I have only to point out again that the question is not whether some modules
or libraries must be considered as something one can just get by without.
Instead, the whole of i18n changes can be 'got by without' not giving
'-with-nls' option.
If this option was proved to be dangerous, we warn you to be prepared,
nevertheless we leave it for ones who need it, including gettext in
libraries, start up and tcltkgrass.

> Therefore there is no difference in: a) being relatively safe with
> modules internationalization and in b) being put in risk with
> libraries internationalization, whatever risk one could foresee.

But what about the risks which one doesn't foresee? The most dangerous
assumptions are the ones which you don't know that you're making.

> All
> modules use libraries, but this should not mean that all modules will
> break, nothing would break if I18N was done in duly way.

Correct. But what if it *wasn't* done quite right? What if there was
some unforeseen consequence of the libgis I18N?

These bad consequences should be tried, at least. What if there are none?
The simplicity of the string replacement makes me think so.
Being not able to test it on some systems, I would notify the people
who use GRASS on these systems of the option of string replacement,
and that they are welcome to make it on their own risk and report
about the consequences.

Now that 5.0.0 has finally been released, I'm less concerned about the
remote possibility of any I18N-related problems. However, I would
still prefer to see 5.0.1 being a bugfix-only release, with the
critical I18N changes (libgis, startup, tcltkgrass) saved for 5.0.2.
--
Glynn Clements <glynn.clements@virgin.net>

Alex Shevlakov

Alex Shevlakov wrote:

> Internationalising libgis would require that every module which linked
> against libgis (i.e. every module) depended upon dgettext(). Rather
> than adding $(INTLLIB) to every Gmakefile, the obvious solution is to
> simply change src/CMD/generic/make.mid to use:
>
> GISLIB = -lgis $(INTLLIB)

This should work fine. Without NLS option, I don't see how this could
affect anything. If variable $(INTLLIB) was empty and someone requested
NLS support, it would be impossible to compile if dgettext was not in glibc,
but that's it. Vice versa, $(INTLLIB) non-empty and no NLS support enabled,
it would not affect anything as there wouldn't be calls to the library.

Um, not quite. Note that INTLLIB is defined regardless of whether NLS
support is enabled. If it gets set to the wrong value, the above could
break everything. At present, $(INTLLIB) is only used in Gmakefiles
for individual programs (primarily *.pg), so it's much less of a risk.

> This is simple enough, and I'm 99.9% sure that it's safe. However,
> it's that last 0.1% that concerned me. If there turned out to be a
> problem, it would have broken *everything*. Most users can get by
> without the *.pg programs; they can't get by without libgis. Ditto for
> the GRASS startup and, to a lesser extent, tcltkgrass.

I have only to point out again that the question is not whether some modules
or libraries must be considered as something one can just get by without.
Instead, the whole of i18n changes can be 'got by without' not giving
'-with-nls' option.
If this option was proved to be dangerous, we warn you to be prepared,
nevertheless we leave it for ones who need it, including gettext in
libraries, start up and tcltkgrass.

Just to make absolutely sure that I'm not misunderstood: I wasn't in
the least bit concerned about what might happen to people who used
--with-nls. If people use that flag, and it breaks stuff, then they
just have to start again without it. My concern is that it *might*
have caused problems even without --with-nls being used.

Every change, however small, *might* break something. If you're going
to make changes (even seemingly trivial ones) to almost every file in
libgis, do it several months before release, so that there's a
reasonable chance that even the infrequent Mac/Cygwin/Solaris/Irix
testers will get around to testing it.

Having said all that, my position remains that I would ideally like to
see 5.0.1 released Real Soon Now with just the bug fixes, and 5.0.2
with the I18N soon after. However, if Bernhard insists on another
delay of several months between 5.0.1 and 5.0.2, then the I18N should
probably just go straight into 5.0.1.

--
Glynn Clements <glynn.clements@virgin.net>

On Fri, Nov 15, 2002 at 10:52:10AM +0000, Glynn Clements wrote:
[...]

Every change, however small, *might* break something. If you're going
to make changes (even seemingly trivial ones) to almost every file in
libgis, do it several months before release, so that there's a
reasonable chance that even the infrequent Mac/Cygwin/Solaris/Irix
testers will get around to testing it.

Having said all that, my position remains that I would ideally like to
see 5.0.1 released Real Soon Now with just the bug fixes, and 5.0.2
with the I18N soon after. However, if Bernhard insists on another
delay of several months between 5.0.1 and 5.0.2, then the I18N should
probably just go straight into 5.0.1.

From my point of view there are no reasons *not* to release 5.0.1
next week. We have accumulated some bugfixes, so let's release.

Bernhard, the others? Are there objections?

Markus

Markus and others

I haven't had time to test the fixes but please make sure that the updates
for the new r.mapcalc are there, especially that it writes the expression
into
the history file of the newly created raster and that it behaves in the
same way as the old one
in interactive mode (with > prompt when a series of expressions is
written).
I know that Glynn has added these modifications but it would be useful to
run few tests
with r.mapcalc - I will try to get to it but I have had all kinds of
troubles with updating
and compiling the experimental version so I am not sure whether I can do it
in time
for the release. I agree that 5.0.1 should be released soon (besides other
things, working with r.mapcalc
without having the expression written in history file has proven quite a
problem for us
and other heavy users of r.mapcalc may have troubles too).

thank you,

Helena

Neteler wrote:

On Fri, Nov 15, 2002 at 10:52:10AM +0000, Glynn Clements wrote:
[...]
> Every change, however small, *might* break something. If you're going
> to make changes (even seemingly trivial ones) to almost every file in
> libgis, do it several months before release, so that there's a
> reasonable chance that even the infrequent Mac/Cygwin/Solaris/Irix
> testers will get around to testing it.
>
> Having said all that, my position remains that I would ideally like to
> see 5.0.1 released Real Soon Now with just the bug fixes, and 5.0.2
> with the I18N soon after. However, if Bernhard insists on another
> delay of several months between 5.0.1 and 5.0.2, then the I18N should
> probably just go straight into 5.0.1.

From my point of view there are no reasons *not* to release 5.0.1
next week. We have accumulated some bugfixes, so let's release.

Bernhard, the others? Are there objections?

Markus

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

Helena Mitasova wrote:

I haven't had time to test the fixes but please make sure that the
updates for the new r.mapcalc are there, especially that it writes the
expression into the history file of the newly created raster and that
it behaves in the same way as the old one in interactive mode (with >
prompt when a series of expressions is written).

I know that Glynn has added these modifications

The only changes made to r.mapcalc since the release are:

+ Fix error handling (generate an error message instead of crashing
for undefined maps, variables or functions).

+ Fix the arity of the not() function.

+ Fix the handling of fully-qualified map names.

+ Allow interactive input to be terminated by "end" or "exit" as well
as a blank line or EOF. Add a message to indicate this.

The off-list discussion concluded that evaluating each expression
separately wasn't necessary (doing this for non-interactive use is
undesirable, and supporting separate interactive and non-interactive
evaluation strategies is non-trivial).

Updating the history file probably wouldn't be that hard in itself.
However, when reading from stdin, the original form of the expression
isn't currently stored anywhere, so you would get e.g. "add(a,mul(b,c))"
instead of "a+b*c". To fix this, I would need to add code to record the
input when reading from stdin.

There's also the issue of the behaviour of the && and || operators
with respect to null values. I.e. whether to implement the behaviour
described in the manpage, e.g.
  
  NULL || true = true
  NULL && false = false

--
Glynn Clements <glynn.clements@virgin.net>

Glynn Clements wrote:

Updating the history file probably wouldn't be that hard in itself.
However, when reading from stdin, the original form of the expression
isn't currently stored anywhere, so you would get e.g. "add(a,mul(b,c))"
instead of "a+b*c". To fix this, I would need to add code to record the
input when reading from stdin.

I've committed an update to CVS which stores the expression in the
history.

The expression is regenerated from the parse tree, rather than from
stored input. The main reason for taking this approach is that the
input can consist of multiple definitions, and you can't reliably tell
where the boundaries lie without actually parsing the input.

Consequently, using stored input would require that the complete input
was stored for each map. If the input consisted of a large number of
expressions, or the expressions were particularly complex, this could
result in the history being truncated, so the expressions for the
later maps wouldn't be stored at all.

I also changed certain aspects of the parser and the parse tree to
allow the original expression to be reconstructed more accurately.
Specifically, where the input uses infix operators, the stored version
will now use infix operators instead of prefix functions. Also, any
type conversion functions which are inserted by the type checking code
won't appear in the output.

Obviously, certain aspects of the input won't be reflected in the
stored version, e.g. the amount and type of whitespace, or extraneous
parentheses.

--
Glynn Clements <glynn.clements@virgin.net>

Any chance there was a typo in that fix that some compilers don't mind, but mine does ?

I was very excited to read that the history is back in, so updated my source code and tried compiling. On a linux/debian/AMD box, I got the following error when it was working on mapcalc3:

#################################################################
/usr/local/src/grass/src/raster/r.mapcalc3
   make -f OBJ.i686-pc-linux-gnu/make.rules

bison -y -d mapcalc.y
mapcalc.y:75.8: parse error, unexpected ":", expecting ";" or "|"
make: *** [y.tab.c] Error 1

I've looked at that line and the preceding ones in the source, and don't see anything obvious, but then again I don't have a clue how bison and friends work, so I'm just comparing to other lines in the same file.

Thanks for any suggestions,

Scott Mitchell

On Sunday, November 17, 2002, at 06:27 AM, Glynn Clements wrote:

Glynn Clements wrote:

Updating the history file probably wouldn't be that hard in itself.
However, when reading from stdin, the original form of the expression
isn't currently stored anywhere, so you would get e.g. "add(a,mul(b,c))"
instead of "a+b*c". To fix this, I would need to add code to record the
input when reading from stdin.

I've committed an update to CVS which stores the expression in the
history.

The expression is regenerated from the parse tree, rather than from
stored input. The main reason for taking this approach is that the
input can consist of multiple definitions, and you can't reliably tell
where the boundaries lie without actually parsing the input.

Consequently, using stored input would require that the complete input
was stored for each map. If the input consisted of a large number of
expressions, or the expressions were particularly complex, this could
result in the history being truncated, so the expressions for the
later maps wouldn't be stored at all.

I also changed certain aspects of the parser and the parse tree to
allow the original expression to be reconstructed more accurately.
Specifically, where the input uses infix operators, the stored version
will now use infix operators instead of prefix functions. Also, any
type conversion functions which are inserted by the type checking code
won't appear in the output.

Obviously, certain aspects of the input won't be reflected in the
stored version, e.g. the amount and type of whitespace, or extraneous
parentheses.

--
Glynn Clements <glynn.clements@virgin.net>

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

------
Scott W. Mitchell smitch@geog.utoronto.ca
Department of Geography Phone: (613) 730-5375
University of Toronto at Mississauga UTM fax: (905) 828-5273
3359 Mississauga Road Local fax (613) 822-5143
Mississauga, ON L5L 1C6 Canada

Scott W Mitchell wrote:

Any chance there was a typo in that fix that some compilers don't mind,
but mine does ?

Nope; the latest change to that file only affected lines 131 onwards.
There is a problem, but it was present as far back as -pre4.

I was very excited to read that the history is back in, so updated my
source code and tried compiling. On a linux/debian/AMD box, I got the
following error when it was working on mapcalc3:

#################################################################
/usr/local/src/grass/src/raster/r.mapcalc3
   make -f OBJ.i686-pc-linux-gnu/make.rules

bison -y -d mapcalc.y
mapcalc.y:75.8: parse error, unexpected ":", expecting ";" or "|"
make: *** [y.tab.c] Error 1

I've looked at that line and the preceding ones in the source, and
don't see anything obvious, but then again I don't have a clue how
bison and friends work, so I'm just comparing to other lines in the
same file.

AFAICT, the problem is that the previous rule isn't terminated with a
semicolon. It appears that my version of bison[1] doesn't care about
this, but yours does.

[1] Not only my version, but that of everyone else who has compiled
any version of GRASS since at least -pre4; byacc (Berkeley yacc)
doesn't care either.

I'm about to commit a fix to this, which essentially just adds a line
containing an indented semicolon at line 74:

program : stmts { $$ = result = $1; }
    ;

--
Glynn Clements <glynn.clements@virgin.net>

On Monday, November 18, 2002, at 11:33 PM, Glynn Clements wrote:

AFAICT, the problem is that the previous rule isn't terminated with a
semicolon. It appears that my version of bison[1] doesn't care about
this, but yours does.

Great, that fixed it, thanks. Now I get the same kind of error for r.weight, and to fix it, following your model, I had to add semicolons as you did to all the rules in the first ~250 lines - then it compiled fine.

[1] Not only my version, but that of everyone else who has compiled
any version of GRASS since at least -pre4; byacc (Berkeley yacc)
doesn't care either.

Yeah, my other systems are compiling this fine too. This one, however, has:
bouteloua:/usr/local/src/grass> bison --version
bison (GNU Bison) 1.75
Written by Robert Corbett and Richard Stallman.

It's installed by the "testing" release of Debian, so perhaps it's "coming soon" to all sorts of releases, I dunno... I checked, but don't see anything in the man page re: this issue changing, and all I see in the info pages is that sure enough they want a semi-colon at the end of each rule.

My Mac is "way back" at version 1.35, the solaris machine has 1.28. I've got (other) big problems compiling the current release on solaris, but will leave that for another day...

Thanks a lot...

Scott Mitchell

I'm about to commit a fix to this, which essentially just adds a line
containing an indented semicolon at line 74:

program : stmts { $$ = result = $1; }
    ;

--
Glynn Clements <glynn.clements@virgin.net>

Scott W Mitchell wrote:

> AFAICT, the problem is that the previous rule isn't terminated with a
> semicolon. It appears that my version of bison[1] doesn't care about
> this, but yours does.

Great, that fixed it, thanks. Now I get the same kind of error for
r.weight, and to fix it, following your model, I had to add semicolons
as you did to all the rules in the first ~250 lines - then it compiled
fine.

Right; I'll fix that too.

> [1] Not only my version, but that of everyone else who has compiled
> any version of GRASS since at least -pre4; byacc (Berkeley yacc)
> doesn't care either.

Yeah, my other systems are compiling this fine too. This one, however,
has:
bouteloua:/usr/local/src/grass> bison --version
bison (GNU Bison) 1.75
Written by Robert Corbett and Richard Stallman.

It's installed by the "testing" release of Debian, so perhaps it's
"coming soon" to all sorts of releases, I dunno... I checked, but
don't see anything in the man page re: this issue changing, and all I
see in the info pages is that sure enough they want a semi-colon at the
end of each rule.

Yep; it's odd that neither byacc nor (earlier version of) bison even
generate a warning.

--
Glynn Clements <glynn.clements@virgin.net>