[GRASS5] Re: V_ stuff

Hi Artemis, Huidae,
hi all,

Below is a thread on Asian character support in GRASS, which
will become feasible with a small change proposed by Artemis.
Huidae Cho has already implemented related improvements in
v.digit etc. It seems that the fix would help various people,
so we should include it.

Read on below for details. Huidae, in case of no objections by
the others, could you update this in CVS?

Thanks

Markus

On Tue, May 01, 2001 at 08:27:41PM +0900, Huidae Cho wrote:

From: Markus Neteler <neteler@geog.uni-hannover.de>
>
> Hi Huidae,
>
> find attached a comment concerning asian characters support
> in GRASS. As I don't have any chance to test this here, maybe
> you could do with korean characters.
>
> Yours
>
> Markus
>
> Return-Path: <artemis@yandex.ru>
> Date: Mon, 30 Apr 2001 12:22:31 +0700
> From: Artemis <fgu@mail.gorny.ru>
> To: Markus Neteler <neteler@geog.uni-hannover.de>
> Subject: Re: CVS snapshot
> Message-ID: <20010430122231.A899@alarm.fgu.com>
> References: <20010426172204.B4115@alarm.fgu.com> <20010426144958.B591@hgeo02.geog.uni-hannover.de> <20010427144921.A1673@alarm.fgu.com> <20010427180105.H18685@hgeo02.geog.uni-hannover.de> <20010429050939.A8066@alarm.fgu.com> <20010429124154.B17685@hgeo02.geog.uni-hannover.de>
> Mime-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> User-Agent: Mutt/1.0.1i
> In-Reply-To: <20010429124154.B17685@hgeo02.geog.uni-hannover.de>; from neteler@geog.uni-hannover.de on Sun, Apr 29, 2001 at 12:41:54PM +0100
> Sender: Artemis <artemis@alarm.fgu.com>
>

[...]

> There's something you should know about Vask.
>
> V_support.c should have:
>
> in line 51 the substring:
> && (*ANS_PTR < '\177') *
> should be deleted.
>
> V_call.c:
>
> in line 254:
> & 0177
> should be deleted.
>
> in line 410:
> && (newchar < '\176')
> should be deleted.

Huidae Cho wrote:

Hi Markus,

Yes, it actually helps with typing Asian characters. So I updated v.digit in
this way and added -DASIAN_CHARS to head.in some months ago.

If the others are OK with it, what about adding --with-asian-chars to
configure.in? Without this flag, I have to add -DASIAN_CHARS to head.* by
hand after configure.

Yours,
Huidae Cho

>
> In our company we successfully removed
> the substrings above and the interactive
> modules now allow to input chars above 128.
>
> At first, I thought the limitations in
> vask are needed because GRASS uses the higher
> characters somewhere, but looks like
> it works OK (but better to comment that
> out, not delete).
>
> Hope this would be useful,
>
> Artemis.
>

----------------------------------------
If you want to unsubscribe from GRASS Development Team mailing list write to:
minordomo@geog.uni-hannover.de with
subject 'unsubscribe grass5'

Markus Neteler wrote:

> Below is a thread on Asian character support in GRASS, which
> will become feasible with a small change proposed by Artemis.
> Huidae Cho has already implemented related improvements in
> v.digit etc. It seems that the fix would help various people,
> so we should include it.
>
> Read on below for details. Huidae, in case of no objections by
> the others, could you update this in CVS?

[...]
> > There's something you should know about Vask.
> >
> > V_support.c should have:
> >
> > in line 51 the substring:
> > && (*ANS_PTR < '\177') *
> > should be deleted.
> >
> > V_call.c:
> >
> > in line 254:
> > & 0177
> > should be deleted.
> >
> > in line 410:
> > && (newchar < '\176')
> > should be deleted.

> Yes, it actually helps with typing Asian characters. So I updated v.digit in
> this way and added -DASIAN_CHARS to head.in some months ago.
>
> If the others are OK with it, what about adding --with-asian-chars to
> configure.in? Without this flag, I have to add -DASIAN_CHARS to head.* by
> hand after configure.

I don't see any point in making this optional; if GRASS copes with
8-bit characters, then it may as well be unconditional.

Presumably this applies to all non-ASCII character sets, not just
Asian ones (many people would assume that ASIAN_CHARS refers to
multi-byte encodings such as ISO-2022).

BTW, I noticed this in V_call.c:

  c = getch() & 0177;

This is the wrong thing to do, for several reasons:

1. As EOF is usually -1, the "&" is going to convert it to \177, so
the "if(c == EOF)" test on the following line will fail.

2. If someone enters a top-bit-set character, and GRASS can't cope
with them, the correct solution is to reject it, not to silently
convert it to some completely unrelated character.

3. GRASS *ought* to be able to cope with it. If there are bugs related
to char signed-ness, let's fix them rather than introducing a
workaround which ensures that they'll never be found.
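
To make points 1 and 2 concrete, here is a small standalone sketch. It
does not use curses: mask_7bit() is a hypothetical helper that stands in
for the "getch() & 0177" expression, and the byte 0xE9 (e-acute in
ISO-8859-1) is only an illustrative input.

```c
#include <stdio.h>

/* Hypothetical stand-in for the expression "getch() & 0177". */
int mask_7bit(int c)
{
    return c & 0177;
}

/*
 * Returns 1 if EOF could still be recognised after masking.
 * The masked value is always in the range 0..0177, while EOF is
 * negative, so this returns 0.
 */
int eof_survives_mask(void)
{
    return mask_7bit(EOF) == EOF;
}
```

With an input byte of 0xE9, mask_7bit() returns 0x69, which is the
letter 'i': the silent conversion to an unrelated character described
in point 2.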

Also, "newchar" should probably be an "int"; this avoids the issue of
top-bit-set characters being negative. "int" variables are also likely
to be more efficient than "char" (or, for that matter, "short").
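
A minimal sketch of what an "int"-based input filter could look like,
assuming a getch()-like call that returns either EOF or a byte value.
The names filter_key(), KEY_REJECTED and allow_8bit are made up for
illustration and are not taken from vasklib.

```c
#include <stdio.h>

/* Hypothetical "ignore this key" result, chosen to differ from EOF. */
#define KEY_REJECTED (EOF - 1)

/*
 * Classify one raw input value. The value stays an int throughout, so
 * EOF is tested before anything else and bytes 0x80..0xFF are never
 * seen as negative numbers.
 */
int filter_key(int raw, int allow_8bit)
{
    if (raw == EOF)
        return EOF;
    if (raw < 0 || raw > 0xFF)
        return KEY_REJECTED;        /* not a single byte */
    if (raw >= 0x80 && !allow_8bit)
        return KEY_REJECTED;        /* reject it, do not remap it */
    return raw;
}
```

A caller would drop KEY_REJECTED values (or beep) instead of storing a
character the user never typed.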

Huidae; do you want to deal with this, or shall I? For now, I'll leave
it alone unless told otherwise.

--
Glynn Clements <glynn.clements@virgin.net>


From: Glynn Clements <glynn.clements@virgin.net>

Markus Neteler wrote:

> Below is a thread on Asian character support in GRASS, which
> will become feasible with a small change proposed by Artemis.
> Huidae Cho has already implemented related improvements in
> v.digit etc. It seems that the fix would help various people,
> so we should include it.
>
> Read on below for details. Huidae, in case of no objections by
> the others, could you update this in CVS?

> [...]
> > > There's something you should know about Vask.
> > >
> > > V_support.c should have:
> > >
> > > in line 51 the substring:
> > > && (*ANS_PTR < '\177') *
> > > should be deleted.
> > >
> > > V_call.c:
> > >
> > > in line 254:
> > > & 0177
> > > should be deleted.
> > >
> > > in line 410:
> > > && (newchar < '\176')
> > > should be deleted.

> > Yes, it actually helps with typing Asian characters. So I updated v.digit in
> > this way and added -DASIAN_CHARS to head.in some months ago.
> >
> > If the others are OK with it, what about adding --with-asian-chars to
> > configure.in? Without this flag, I have to add -DASIAN_CHARS to head.* by
> > hand after configure.

> I don't see any point in making this optional; if GRASS copes with
> 8-bit characters, then it may as well be unconditional.
>
> Presumably this applies to all non-ASCII character sets, not just
> Asian ones (many people would assume that ASIAN_CHARS refers to
> multi-byte encodings such as ISO-2022).
>
> BTW, I noticed this in V_call.c:
>
>   c = getch() & 0177;
>
> This is the wrong thing to do, for several reasons:
>
> 1. As EOF is usually -1, the "&" is going to convert it to \177, so
> the "if(c == EOF)" test on the following line will fail.
>
> 2. If someone enters a top-bit-set character, and GRASS can't cope
> with them, the correct solution is to reject it, not to silently
> convert it to some completely unrelated character.
>
> 3. GRASS *ought* to be able to cope with it. If there are bugs related
> to char signed-ness, let's fix them rather than introducing a
> workaround which ensures that they'll never be found.
>
> Also, "newchar" should probably be an "int"; this avoids the issue of
> top-bit-set characters being negative. "int" variables are also likely
> to be more efficient than "char" (or, for that matter, "short").

Hi all,

Glynn, you're right. So I added '#ifdef ASIAN_CHARS' to allow input of
exceptional characters such as Asian chars. It does not affect any
behaviour without -DASIAN_CHARS in head.$(ARCH).
For this, I didn't change the variable types of "newchar" etc.
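
As a rough illustration of the idea (this is NOT the actual vask code;
is_enterable() and its exact ranges are made up), a compile-time switch
of this kind leaves the default build untouched and only widens the
accepted range when the macro is defined:

```c
/*
 * Hypothetical input filter. It only shows how an ASIAN_CHARS
 * conditional can keep the default behaviour unchanged.
 */
int is_enterable(int c)
{
#ifdef ASIAN_CHARS
    /* 8-bit build: printable ASCII plus bytes 0200..0377 */
    return (c >= 040 && c < 0177) || (c >= 0200 && c <= 0377);
#else
    /* default build: printable 7-bit ASCII only */
    return c >= 040 && c < 0177;
#endif
}
```

Building with -DASIAN_CHARS selects the first branch; without it, only
the second branch is compiled.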

> Huidae; do you want to deal with this, or shall I? For now, I'll leave
> it alone unless told otherwise.

I don't know the side effects of this ASIAN_CHARS fix exactly. It's
just test code because I didn't see where >128 char is used for now.

So it would be nice to leave it untouched until this fix turns out ok.

Yours,
Huidae Cho


Huidae Cho wrote:

> > Huidae; do you want to deal with this, or shall I? For now, I'll leave
> > it alone unless told otherwise.
>
> I don't know the side effects of this ASIAN_CHARS fix exactly. It's
> just test code because I didn't see where >128 char is used for now.
>
> So it would be nice to leave it untouched until this fix turns out ok.

Are you satisfied with it yet?

We are approaching release, so this needs to be fixed soon.

--
Glynn Clements <glynn.clements@virgin.net>

Huidae Cho wrote:

> > > Huidae; do you want to deal with this, or shall I? For now, I'll leave
> > > it alone unless told otherwise.
> >
> > I don't know the side effects of this ASIAN_CHARS fix exactly. It's
> > just test code because I didn't see where >128 char is used for now.
> >
> > So it would be nice to leave it untouched until this fix turns out ok.
>
> Are you satisfied with it yet?
>
> We are approaching release, so this needs to be fixed soon.

> Hmm, did you miss my update?
> I've already implemented a ./configure flag, --with-asian-chars,
> to switch on multiple-byte characters. I named this flag
> --with-asian-chars because I'm not sure if it works for all
> multiple-byte characters.

I wasn't aware that this had anything to do with multi-byte
characters; I thought that it was just 8-bit characters. Looking at
the code (i.e. all occurrences of "ASIAN_CHARS") indicates that it
*is* just 8-bit characters.

Currently, you have a choice of bugs, depending upon whether the
ASIAN_CHARS macro is defined.

1. If ASIAN_CHARS is not defined, then there are several places where
an 8-bit character is silently "coerced" into the 7-bit range by
ANDing with 0177 (0x7F).

If GRASS can't cope with 8-bit characters, they should be ignored, or
an error generated, or translated to a specific character (e.g. '?').
They definitely shouldn't be converted to some arbitrary unrelated
character.

2. If ASIAN_CHARS is defined, then there are several places where a
"less than zero" test is used on a "char". This assumes that "char" is
signed, which isn't guaranteed by ANSI C.

If you want to perform comparisons on 8-bit characters, you *cannot*
use "char"; you *must* explicitly specify "signed char" or "unsigned
char".
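
A short sketch of the difference, assuming 8-bit bytes. The function
names are hypothetical. The first test only works where plain "char" is
signed; the second gives the same answer either way.

```c
#include <limits.h>

/* Non-portable: true for bytes 0x80..0xFF only where plain char is signed. */
int is_8bit_signed_assumption(char c)
{
    return c < 0;
}

/* Portable: convert to unsigned char first, then test the top bit. */
int is_8bit_portable(char c)
{
    return ((unsigned char)c & 0x80) != 0;
}

/* Reports whether plain char is signed with the current compiler settings. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

On a platform (or with a compiler option) that makes "char" unsigned,
is_8bit_signed_assumption() returns 0 for every input, so the 8-bit
branch would never be taken.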

Basically I'm trying to determine whether both cases need to be fixed,
or if one of the cases can be scrapped. The only reason for having
both is if the 8-bit version is too unreliable for general use but
still useful as an option.

> It would be great if you could test this flag with your multiple-byte
> characters. Then please let me know if it works correctly.

The only reliable testing will come from people who actually make real
use of GRASS. I'm not one of them; I'm a computer programmer, not a
geographer, cartographer or similar. And even if I was, as a native
English speaker I don't normally use non-ASCII characters (other than
the occasional pound (sterling) sign).

Personally I would prefer it if vasklib at least was 8-bit clean. The
only way in which we will discover bugs in the handling of 8-bit
characters in individual programs is if users have some means of
entering them.

For this reason, I would suggest that the conditionals be removed,
with the 8-bit (ASIAN_CHARS) case remaining. The old code simply
masked any problems (and didn't do a particularly good job at that).
In the absence of anyone giving reasons to the contrary, I'll get onto
this.

BTW, I've also spotted a couple of other problems in vasklib which
need fixing (e.g. assigning the result of wgetch() to a "char"
variable). While I'm at it, I may as well look into fixing the
handling of extended keys.
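
To illustrate the wgetch() problem without pulling in curses: extended
keys are reported as int codes outside the 8-bit range, so storing the
result in an 8-bit variable discards the high bits. DEMO_KEY_LEFT below
is an assumed example value, not a definition copied from any curses
header, and the sketch assumes 8-bit bytes.

```c
/* Assumed example code for an extended key, above the 8-bit range. */
#define DEMO_KEY_LEFT 0404

/* What effectively happens when the int result lands in an 8-bit variable. */
int store_in_8bit(int key)
{
    unsigned char narrow = (unsigned char)key;  /* keeps only the low 8 bits */
    return narrow;
}

/* Keeping the value as an int preserves the full key code. */
int store_in_int(int key)
{
    return key;
}
```

Here store_in_8bit(DEMO_KEY_LEFT) yields 4, which is indistinguishable
from Ctrl-D, whereas the int version still holds 0404.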

PS: Anyone interested in portability testing might consider adding
"-funsigned-char" to their CFLAGS.

--
Glynn Clements <glynn.clements@virgin.net>