[GRASS-dev] [GRASS GIS] #2885: wxGUI Location Creation Wizard: UnicodeEncodeError when GISDBASE has accents in path

#2885: wxGUI Location Creation Wizard: UnicodeEncodeError when GISDBASE has
accents in path
--------------------------------------+-------------------------
Reporter: mlennert | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 7.1.0
Component: wxGUI | Version: svn-trunk
Keywords: location wizard encoding | CPU: Unspecified
Platform: Unspecified |
--------------------------------------+-------------------------
In trunk (but not in release70), testing with a GRASS database located in
/home/mlennert/DonnéesGRASS, I get the following error when trying to
create a new location with the location creation wizard. The error appears
when I click on the 'Finish' button, and the location is not created.

{{{
Traceback (most recent call last):
  File "/home/mlennert/SRC/GRASS/grass_trunk/dist.x86_64-pc-linux-gnu/gui/wxpython/gis_set.py", line 510, in OnWizard
    grassdatabase = self.tgisdbase.GetValue())
  File "/home/mlennert/SRC/GRASS/grass_trunk/dist.x86_64-pc-linux-gnu/gui/wxpython/location_wizard/wizard.py", line 1918, in __init__
    msg = self.OnWizFinished()
  File "/home/mlennert/SRC/GRASS/grass_trunk/dist.x86_64-pc-linux-gnu/gui/wxpython/location_wizard/wizard.py", line 2067, in OnWizFinished
    current_gdb = grass.gisenv()['GISDBASE'].decode(sys.stdin.encoding)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 19: ordinal not in range(128)
}}}

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885>
GRASS GIS <https://grass.osgeo.org>

Comment (by marisn):

This seems to be a side effect of r64044 while fixing #2205. Time for
another band-aid?
See: https://trac.osgeo.org/grass/ticket/2205#comment:5

How are you running GRASS: from a terminal or by launching it from an icon
(no terminal)?
If from a terminal, what is the output of locale?

Some reading material:
https://github.com/spyder-ide/spyder/issues/2004
http://www.thecodingforums.com/threads/getting-the-encoding-of-sys-stdout-and-sys-stdin-and-changing-it-properly.353023/
http://www.thecodingforums.com/threads/how-does-python-get-the-value-for-sys-stdin-encoding.730678/
and also this one:
https://mail.python.org/pipermail/python-list/2010-August/584524.html

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:1>
GRASS GIS <https://grass.osgeo.org>

Comment (by mlennert):

Replying to [comment:1 marisn]:
> This seems to be a side effect of r64044 while fixing #2205. Time for
> another band-aid?
> See: https://trac.osgeo.org/grass/ticket/2205#comment:5
>
> How are you running GRASS: from a terminal or by launching it from an icon
> (no terminal)?
> If from a terminal, what is the output of locale?

From a terminal:

{{{
$ locale
LANG=fr_BE.utf8
LANGUAGE=
LC_CTYPE="fr_BE.utf8"
LC_NUMERIC="fr_BE.utf8"
LC_TIME="fr_BE.utf8"
LC_COLLATE="fr_BE.utf8"
LC_MONETARY="fr_BE.utf8"
LC_MESSAGES="fr_BE.utf8"
LC_PAPER="fr_BE.utf8"
LC_NAME="fr_BE.utf8"
LC_ADDRESS="fr_BE.utf8"
LC_TELEPHONE="fr_BE.utf8"
LC_MEASUREMENT="fr_BE.utf8"
LC_IDENTIFICATION="fr_BE.utf8"
LC_ALL=
}}}
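
For comparison with the `locale` output, here is a minimal sketch (not taken from the ticket) that collects the encodings Python itself reports for the running process. The helper name `report_encodings` is hypothetical. Under Python 2, `sys.stdin.encoding` is `None` when stdin is not attached to a terminal, which is relevant to any code that passes it straight to `decode()`.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python reports for this process."""
    return {
        # May be None, e.g. under Python 2 when stdin is not a terminal.
        "stdin": getattr(sys.stdin, "encoding", None),
        # Encoding derived from the locale settings (LANG, LC_CTYPE, ...).
        "preferred": locale.getpreferredencoding(False),
        # Encoding Python uses to convert file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in sorted(report_encodings().items()):
        print("%s: %s" % (name, value))
```

Running this once from a terminal and once from a desktop launcher would show whether the two environments disagree.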

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:2>
GRASS GIS <https://grass.osgeo.org>

Comment (by marisn):

I think I see where the problem is. As usual, GRASS developers are not
following Unicode best practice (decode early, encode late) and are ignoring
the Pythonic approach in Python code (forget byte strings, use Unicode
internally everywhere. Yes, everywhere. No exceptions).

A correct approach would be to decode any incoming strings as soon as they
enter Python code, i.e. in core.read_command (see patch). However, that will
break large parts of the wxGUI until all str() occurrences (including
implicit ones!) are changed to unicode() (as in the example patch for
dbmgr).

Here is a hack around the issue (without breaking #2205), but I consider it
to be the wrong way to go:
{{{
Index: gui/wxpython/location_wizard/wizard.py
===================================================================
--- gui/wxpython/location_wizard/wizard.py (revision 67730)
+++ gui/wxpython/location_wizard/wizard.py (working copy)
@@ -2064,7 +2064,12 @@
             return None
 
         # current GISDbase or a new one?
-        current_gdb = grass.gisenv()['GISDBASE'].decode(sys.stdin.encoding)
+        current_gdb = grass.gisenv()['GISDBASE']
+        if isinstance(current_gdb, bytes):
+            ENCODING = locale.getdefaultlocale()[1]
+            if ENCODING is None:
+                ENCODING = 'UTF-8'
+            current_gdb = current_gdb.decode(ENCODING)
         if current_gdb != database:
             # change to new GISDbase or create new one
             if os.path.isdir(database) != True:
}}}
I assume this makes it clear why it is a bad idea.
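
To illustrate the "decode early, encode late" rule mentioned in this comment, here is a minimal sketch. It is not the attached patch; the helper names `to_text` and `to_bytes` are hypothetical, and UTF-8 as the default encoding is an assumption made only for the example.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytes at the input boundary; pass text through untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Encode text only at the output boundary; pass bytes through untouched."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# Input boundary: a value arrives from outside the program as bytes.
raw = u"/home/mlennert/Donn\xe9esGRASS".encode("utf-8")
gisdbase = to_text(raw)

# Internal work is done on text only, so no implicit ASCII conversion occurs.
location_path = gisdbase + u"/newlocation"

# Output boundary: encode once, when the path leaves the program.
encoded = to_bytes(location_path)
```

The point of the pattern is that the type check lives in one place at the boundary, rather than being repeated at each call site as in the hack.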

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:3>
GRASS GIS <https://grass.osgeo.org>

Changes (by marisn):

* Attachment "decode_early.patch" added.

Decode early. The utils.py part might not be needed if decoding early is
done everywhere.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885>
GRASS GIS <https://grass.osgeo.org>

Changes (by marisn):

* Attachment "str2unicode.patch" added.

An example of how to fix the code, if and only if strings are decoded early.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885>
GRASS GIS <https://grass.osgeo.org>

Comment (by annakrat):

I suggest bringing this up on the mailing list rather than in this ticket,
since it's a long-standing issue and there are different opinions on it.
I would also like to see how we should deal with these problems with regard
to Python 3.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:4>
GRASS GIS <https://grass.osgeo.org>

#2885: wxGUI Location Creation Wizard: UnicodeEncodeError when GISDBASE has
accents in path
--------------------------+--------------------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: normal | Milestone: 7.2.1
Component: wxGUI | Version: svn-trunk
Resolution: | Keywords: location wizard encoding
       CPU: Unspecified | Platform: Unspecified
--------------------------+--------------------------------------

Comment (by wenzeslaus):

In [changeset:"70336" 70336]:
{{{
#!CommitTicketReference repository="" revision="70336"
wxGUI: use library function for decode of gisdbase string in location
wizard (see #2885 and r70335)
}}}

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:7>
GRASS GIS <https://grass.osgeo.org>

Comment (by wenzeslaus):

Another possible traceback besides the one reported is:

{{{
Traceback (most recent call last):
  File "/usr/lib/grass72/gui/wxpython/gis_set.py", line 531, in OnWizard
    grassdatabase=self.tgisdbase.GetValue())
  File "/usr/lib/grass72/gui/wxpython/location_wizard/wizard.py", line 2327, in __init__
    msg = self.OnWizFinished()
  File "/usr/lib/grass72/gui/wxpython/location_wizard/wizard.py", line 2486, in OnWizFinished
    current_gdb = grass.gisenv()['GISDBASE'].decode(sys.stdin.encoding)
TypeError: decode() argument 1 must be string, not None
}}}

However, I cannot reproduce this issue on my machine. Unicode in the GRASS
GIS database path works. Can someone test it?

If I set everything reported by `locale` to nothing, I get a different
error, but I'm not sure if that's even supposed to work (no encoding set but
Unicode in paths).
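
The `TypeError` in that traceback comes from passing `None` as the encoding. As a hedged sketch of one way to avoid it (this is not the code from r70335 or r70336), the hypothetical helpers `pick_encoding` and `decode_path` below try several encoding sources in order and fall back to UTF-8, which is itself an assumption.

```python
import locale
import sys


def pick_encoding(candidates=None):
    """Return the first non-empty encoding name, or UTF-8 as a last resort."""
    if candidates is None:
        candidates = [
            # None when stdin is missing or, under Python 2, not a terminal.
            getattr(sys.stdin, "encoding", None),
            locale.getpreferredencoding(False),
        ]
    for name in candidates:
        if name:
            return name
    return "UTF-8"


def decode_path(value, candidates=None):
    """Decode a byte-string path; leave text values unchanged."""
    if isinstance(value, bytes):
        return value.decode(pick_encoding(candidates))
    return value
```

With this shape, a missing `sys.stdin.encoding` no longer reaches `decode()`; whether UTF-8 is the right last resort when the locale is empty remains an open question, as noted in the comment.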

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:8>
GRASS GIS <https://grass.osgeo.org>

Comment (by mlennert):

Replying to [comment:7 wenzeslaus]:
> In [changeset:"70336" 70336]:
> {{{
> #!CommitTicketReference repository="" revision="70336"
> wxGUI: use library function for decode of gisdbase string in location
> wizard (see #2885 and r70335)
> }}}

This solves the issue for me in trunk.

Can this be backported to 7.2 (where I still get the error)?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:9>
GRASS GIS <https://grass.osgeo.org>

#2885: wxGUI Location Creation Wizard: UnicodeEncodeError when GISDBASE has
accents in path
--------------------------+--------------------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: normal | Milestone: 7.2.4
Component: wxGUI | Version: svn-trunk
Resolution: fixed | Keywords: location wizard encoding
       CPU: Unspecified | Platform: Unspecified
--------------------------+--------------------------------------
Changes (by mlennert):

* status: new => closed
* resolution: => fixed

Comment:

As 7.4 is now the stable release and 7.6 is coming closer, I think we can
close this bug and declare it a nofix for 7.2.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2885#comment:14>
GRASS GIS <https://grass.osgeo.org>