[GRASS-dev] [GRASS GIS] #2532: TypeError: environment can only contain string when launching script on Windows

#2532: TypeError: environment can only contain string when launching script on
Windows
-------------------------+--------------------------------------------------
Reporter: annakrat | Owner: grass-dev@…
     Type: defect | Status: new
Priority: major | Milestone: 7.0.0
Component: Default | Version: svn-trunk
Keywords: encoding | Platform: MSWindows 8
      Cpu: Unspecified |
-------------------------+--------------------------------------------------
When launching a Python script via GUI > File > Launch script, I am asked to
add the path to `GRASS_ADDON_PATH`. I did so and ran the script
successfully. However, I am not able to run any command afterwards because
of a Python error (TypeError: environment can only contain string). The
problem is that the script path is of type unicode (although I am using only
ASCII letters). The solution is to encode the script path, but with which
encoding? And how is it going to be decoded?

A temporary solution is to reject any script whose path contains non-ASCII
letters and just use `str()`.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [ticket:2532 annakrat]:

> The problem is the script path is unicode type (although I am using only
> ascii letters).

wxPython uses Unicode for almost everything. So retrieving the contents of
a text field will return a Python unicode value.

> The solution is to encode the script path, but with which encoding? And
> how it is going to be decoded?

It won't be decoded. The byte string will be available to the called
program as a char* via getenv() (for C) or os.environ (Python).
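
As a minimal sketch of that point (Python 3 syntax, standard library only; the helper name `child_sees` and the value `C:\scripts` are made up for illustration), a variable placed in the `env` mapping is handed to the child process, which simply reads back whatever it was given:

```python
import os
import subprocess
import sys


def child_sees(value):
    """Start a child Python process and return GRASS_ADDON_PATH as the child sees it."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, GRASS_ADDON_PATH=value)
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import os, sys; sys.stdout.write(os.environ['GRASS_ADDON_PATH'])"],
        env=env,
    )
    # The example value is ASCII-only, so decoding the child's output is unambiguous.
    return out.decode("ascii")


if __name__ == "__main__":
    print(child_sees("C:\\scripts"))
```

With a non-ASCII value, the bytes the child receives depend on which encoding the parent used, which is exactly the question raised in the ticket.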

> A temporary solution is to reject any scripts with path with non-ascii
> letters and just use `str()`.

wxGUI's core.gcmd module has EncodeString() and DecodeString() methods
which use whatever wxGUI considers to be the "system" encoding. Those are
used by gcmd.Popen for converting the arguments to strings and by
gcmd.RunCommand() for converting the process' output to unicode.
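
For readers unfamiliar with these helpers, the idea can be sketched roughly as follows. This is not the wxGUI code: the function names are hypothetical, the sketch uses Python 3 types (`str` for text, `bytes` for encoded data), and taking the encoding from `locale.getpreferredencoding()` is an assumption about what "system" encoding means.

```python
import locale


def system_encoding():
    """Return the locale's preferred encoding, falling back to ASCII."""
    return locale.getpreferredencoding(False) or "ascii"


def encode_string(text, encoding=None):
    """Convert text to bytes using the given (or the system) encoding."""
    return text.encode(encoding or system_encoding())


def decode_string(data, encoding=None):
    """Convert bytes back to text using the given (or the system) encoding."""
    return data.decode(encoding or system_encoding())


if __name__ == "__main__":
    encoded = encode_string("test_workshop\u00e1.py", "cp1252")
    print(decode_string(encoded, "cp1252") == "test_workshop\u00e1.py")
```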

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:1>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:1 glynn]:
> Replying to [ticket:2532 annakrat]:
>
> wxGUI's core.gcmd module has EncodeString() and DecodeString() methods
> which use whatever wxGUI considers to be the "system" encoding. Those are
> used by gcmd.Popen for converting the arguments to strings and by
> gcmd.RunCommand() for converting the process' output to unicode.

OK, I used EncodeString, but then with non-ascii characters I get (ascii
only path works fine now):

{{{
Traceback (most recent call last):
  File "C:\Users\akratoc\Programs\GRASS GIS 7.0.0svn\gui\wxpython\lmgr\frame.py", line 842, in OnRunScript
    filename = EncodeString(filename)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.0.0svn\gui\wxpython\core\gcmd.py", line 101, in EncodeString
    return string.encode(_enc)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.0.0svn\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u0165' in position 40: character maps to <undefined>
}}}

I have seen this error in several other tickets, is there something we can
do about it?

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:2>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

I see what you wrote in #2525. So should we just catch the exception
and tell the user: sorry, don't use non-ASCII characters in the script path
(and change your operating system)?

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:3>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:3 annakrat]:
> I see what you were writing in #2525. So should we just catch an
> exception and say the user, sorry, don't use non ascii characters in the
> script path (and change your operating system)?

It's not "non-ASCII" characters per se, it's characters which aren't
representable in your system codepage (configurable on Windows 7 via
Control Panel -> Region and Language -> Administrative -> Change system
locale ...).

For Western European languages, the system locale's encoding will be
[http://en.wikipedia.org/wiki/Cp1252#Code_page_layout cp1252], which is
basically [http://en.wikipedia.org/wiki/ISO-8859-1#Codepage_layout
ISO-8859-1] but with most of the C1 control codes (\x80-\x9f) remapped to
additional graphic characters.

U+0165 is present in
[http://en.wikipedia.org/wiki/Windows-1250#Code_page_layout cp1250]
(Eastern European, similar to ISO-8859-2).
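
The differences between these code pages can be checked with Python's built-in codecs. A small sketch in Python 3 syntax (the helper name `encodable` is made up for illustration):

```python
def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # U+0165 (t with caron) is missing from cp1252 but present in cp1250.
    print(encodable("\u0165", "cp1252"))  # False
    print(encodable("\u0165", "cp1250"))  # True
    # U+00E1 (a with acute) is present in cp1252.
    print(encodable("\u00e1", "cp1252"))  # True
    # Byte 0x80 is a C1 control code in ISO-8859-1 but the euro sign in cp1252.
    print(b"\x80".decode("latin-1") == "\x80")      # True
    print(b"\x80".decode("cp1252") == "\u20ac")     # True
```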

It appears that Windows has a mechanism for approximating accented
characters; if I create a directory whose name contains that character,
the "dir" command (in a console using cp1252) shows the directory with the
character replaced by "t", and I can "cd" into the directory.
Unfortunately, this feature doesn't appear to be accessible via Python.
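
A comparable approximation can be built by hand, although it is a different technique from whatever Windows does internally: decompose the character with Unicode normalization and drop the combining marks. A sketch in Python 3 syntax, where the function name `approximate` is hypothetical:

```python
import unicodedata


def approximate(text, encoding="cp1252"):
    """Replace characters missing from `encoding` by their unaccented base letters."""
    result = []
    for ch in text:
        try:
            ch.encode(encoding)
            result.append(ch)  # representable as-is, keep it
        except UnicodeEncodeError:
            decomposed = unicodedata.normalize("NFKD", ch)
            # Drop combining marks (e.g. the caron of U+0165), keep the base letter.
            result.append(
                "".join(c for c in decomposed if not unicodedata.combining(c))
            )
    return "".join(result)


if __name__ == "__main__":
    print(approximate("tes\u0165") == "test")
```

Characters without a decomposition are passed through unchanged, so the result is not guaranteed to be encodable. The approximated name also no longer matches the real file name, so this only helps for display, not for opening the file.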

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:4>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

I used `EncodeString` in r63997 and r63998. I tested it successfully on
Windows (cp1252) with ASCII characters; non-ASCII characters which are
not present in cp1252 result in an error dialog with a message explaining
how to avoid the problem. However, I failed to run the script when the name
contained non-ASCII characters present in cp1252 (á). I don't get any error,
but in the GUI console I get:

{{{
Launching script 'C:\Users\akratoc\Desktop\test_workshopá.py'...
(Thu Jan 08 12:04:24 2015)
Description:
  Adds the values of two rasters (A + B)
Keywords:
  raster, algebra, sum
Usage:
  test_workshopá.py araster=name braster=name output=name
[--overwrite]
    [--help] [--verbose] [--quiet] [--ui]
Flags:
  --o Allow output files to overwrite existing files
  --h Print usage summary
  --v Verbose module output
  --q Quiet module output
  --ui Force launching GUI dialog
Parameters:
   araster Name of input raster A in an expression A + B
   braster Name of input raster B in an expression A + B
    output Name for output raster map
ERROR: Required parameter <araster> not set:
         (Name of input raster A in an expression A + B)
ERROR: Required parameter <braster> not set:
         (Name of input raster B in an expression A + B)
ERROR: Required parameter <output> not set:
         (Name for output raster map)
(Thu Jan 08 12:04:25 2015) Command finished (0 sec)
}}}

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:5>
GRASS GIS <http://grass.osgeo.org>

Changes (by annakrat):

  * priority: major => normal

Comment:

I backported r63997 and r63998 in r64102. Now it works at least with
ASCII characters in the path.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:6>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:5 annakrat]:
> However, I failed to run the script when the name contained non-ascii
> characters present in cp1252 (á). I don't get any error, but in gui
> console I get:
>
{{{
Launching script 'C:\Users\akratoc\Desktop\test_workshopá.py'...
}}}
Is the "..." literal? I.e. does the GUI omit the arguments, or does it
include details which have been omitted from the ticket?

>
{{{
ERROR: Required parameter <araster> not set:
}}}

Can you get any more debug output?

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:7>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:7 glynn]:
> Replying to [comment:5 annakrat]:
> > However, I failed to run the script when the name contained non-ascii
characters present in cp1252 (á). I don't get any error, but in gui
console I get:
> >
> {{{
> Launching script 'C:\Users\akratoc\Desktop\test_workshopá.py'...
> }}}
> Is the "..." literal? I.e. does the GUI omit the arguments, or does it
include details which have been omitted from the ticket?

That comes from
[http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/lmgr/frame.py#L906
here]; there are no details, it's run without any arguments.
>
> >
> {{{
> ERROR: Required parameter <araster> not set:
> }}}
>
> Can you get any more debug output?
Will try.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:8>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:8 annakrat]:
> Replying to [comment:7 glynn]:
> > Can you get any more debug output?
> Will try.
With debug messages on I get in the GUI console:

{{{
Launching script 'C:\Users\akratoc\Desktop\test_workshopá.py'...
(Thu Jan 08 12:04:24 2015)
C:\Users\akratoc\Desktop\test_workshopá.py
D2/5: filename = C:\Users\akratoc\Desktop\test_workshopá.py
D1/5: G_set_program_name(): test_workshopá
D2/5: G_file_name(): path =
C:\Users\akratoc\grassdata/nc_basic_spm_grass7/user1
Description:
... and the same as above
}}}

and in the terminal window:

{{{
GUI D5/5: EncodeString(): enc=cp1252
D1/5: grass.script.core.start_command(): g.gisenv -n
D1/5: G_set_program_name(): g.gisenv
D2/5: G_option_to_separator(): key = separator -> sep = '
'
GUI D1/5: gcmd.CommandThread(): C:\Users\akratoc\Desktop\test_workshopá.py
GUI D5/5: EncodeString(): enc=cp1252
GUI D5/5: EncodeString(): enc=cp1252
}}}

It doesn't seem particularly helpful but I don't know what else I can do.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:9>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:8 annakrat]:

> That comes from
[http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/lmgr/frame.py#L906
here], there are no details, it's ran without any arguments.

I see.

It's executing the script, which is executing g.parser, which is reading
the option definitions from the script then calling G_parser(). As it's
called without arguments, G_parser() should be generating a GUI dialog,
but it's not even attempting to do that; it's falling through to the
option-checking code.

AFAICT, in order for that error message to occur, either argc would have
to be at least 2 or isatty(0) would have to be false. But if argc >= 2,
that would result in the value of argv[1] being used as the value for
araster= (even if it's an empty string), which would prevent the "Required
parameter <araster> not set" error.

Which leaves isatty(0) being false. But that shouldn't have anything to do
with whether the script filename contains non-ASCII characters. It might
be something to do with wxGUI, or it might be Windows weirdness.
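
The reasoning above can be paraphrased as a small decision function. This is only a sketch of the logic as described in this comment, not the actual G_parser() source; the function name and return values are made up for illustration.

```python
def parser_mode(argc, stdin_is_tty):
    """Paraphrase of the described behaviour; argc counts the program name."""
    if argc < 2 and stdin_is_tty:
        # No arguments and an interactive stdin: a GUI dialog would be generated.
        return "gui-dialog"
    # Otherwise the option-checking code runs, which reports missing
    # required parameters.
    return "check-options"


if __name__ == "__main__":
    # The case seen in the ticket: no arguments, but stdin is not a tty.
    print(parser_mode(1, False))  # check-options
```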

Can you add the following to the script, before the call to
grass.parser():
{{{
import os
print os.isatty(0)
}}}

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:10>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:10 glynn]:
> Replying to [comment:8 annakrat]:
>
> Can you add the following to the script, before the call to
grass.parser():
  {{{
  import os
  print os.isatty(0)
  }}}

It gives me False. I will try to see if there is something wrong in the
gui part.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:11>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:11 annakrat]:

> It gives me False.

Presumably that's only the case when the script filename has non-ASCII
characters?

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:12>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:11 annakrat]:
> Replying to [comment:10 glynn]:
> > Replying to [comment:8 annakrat]:
> >
> It gives me False. I will try to see if there is something wrong in the
gui part.

I found that an exception is raised and ignored
[http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/core/gconsole.py#L554
here], and if I remove the try/except block, I get:

{{{
Traceback (most recent call last):
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\lmgr\frame.py", line 907, in OnRunScript
    self._gconsole.RunCmd([filename])
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gconsole.py", line 554, in RunCmd
    task = gtask.parse_interface(command[0])
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\etc\python\grass\script\task.py", line 509, in parse_interface
    tree = etree.fromstring(get_interface_description(name))
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\etc\python\grass\script\task.py", line 465, in get_interface_description
    stderr=PIPE)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\etc\python\grass\script\core.py", line 62, in __init__
    subprocess.Popen.__init__(self, args, **kwargs)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\subprocess.py", line 711, in __init__
    errread, errwrite)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\subprocess.py", line 922, in _execute_child
    args = '{} /c "{}"'.format (comspec, args)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 38: ordinal not in range(128)
}}}

`command[0]` is unicode. It seems Popen in Python 2.7 can't handle
non-ASCII characters. So I tried to encode the command string and I get
a different error:

{{{
Traceback (most recent call last):
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\lmgr\frame.py", line 907, in OnRunScript
    self._gconsole.RunCmd([filename])
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gconsole.py", line 555, in RunCmd
    task = gtask.parse_interface(EncodeString(command[0]))
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\etc\python\grass\script\task.py", line 509, in parse_interface
    tree = etree.fromstring(get_interface_description(name))
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\xml\etree\ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\xml\etree\ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\xml\etree\ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: syntax error: line 1, column 0
}}}

It seems that `get_interface_description` returns empty xml. I didn't have
time to look into it further.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:13>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:13 annakrat]:

> The `command[0]` is Unicode. It seems Popen in Python 2.7 can't handle
> non-ascii characters.

It's more accurate to say that it can't handle unicode. Or, more
precisely, unicode which cannot be implicitly converted to a string.
Implicit conversions use the default encoding (which is typically ASCII)
rather than the locale's encoding. The default encoding is a system or
user preference and cannot be changed by scripts.
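
The two encodings are separate settings and can be inspected independently. A small sketch (the function name is made up; Python 2.7 normally reports `ascii` as the default encoding, whereas Python 3 reports `utf-8` and no longer converts implicitly between text and bytes):

```python
import locale
import sys


def encodings():
    """Return the interpreter's default encoding and the locale's encoding."""
    return {
        "default": sys.getdefaultencoding(),
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in sorted(encodings().items()):
        print(name, value)
```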

> So I tried to encode the command string and I get different error:
>
{{{
raise err
xml.etree.ElementTree.ParseError
}}}

> It seems that get_interface_description returns empty xml

Did you confirm that?

Otherwise, my guess is that the XML is invalid due to encoding issues.

The program name is copied verbatim into the XML, in the <task name="...">
tag.

If GRASS was built with iconv support, the declared encoding of the XML
will be UTF-8; text nodes will be converted from the locale's encoding to
UTF-8 (and <,>,& will be converted to entities), but attribute values
aren't converted:

{{{
     fprintf(stdout, "<task name=\"%s\">\n", st->pgm_name);
}}}

So, they need to be restricted to the intersection of the locale's
encoding and UTF-8 (which probably means ASCII).
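
That the intersection is ASCII can be verified for cp1252 by comparing encoded bytes. A quick sketch in Python 3 syntax (the helper name `same_bytes` is made up for illustration):

```python
def same_bytes(ch, locale_encoding="cp1252"):
    """True if ch encodes to identical bytes in the locale encoding and UTF-8."""
    try:
        return ch.encode(locale_encoding) == ch.encode("utf-8")
    except UnicodeEncodeError:
        return False  # not representable in the locale encoding at all


# Code points below 256 that are safe to copy verbatim into UTF-8 XML.
safe = [chr(i) for i in range(256) if same_bytes(chr(i))]

if __name__ == "__main__":
    print(len(safe))  # 128, i.e. exactly the ASCII range
```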

I'm not sure that it's worth trying to support script names which contain
non-ASCII characters. However, scripts in directories whose names contain
non-ASCII characters need to be supported. The same applies to other
files; e.g. we can reasonably restrict map, mapset and location names to
ASCII, but we should support the situation where the database path
contains non-ASCII characters.

In any case, the GUI should be encoding the arguments which it passes to
Popen(); it shouldn't be passing unicode values.
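
A rough sketch of what that encoding step could look like, written in Python 3 terms where `str` plays the role of Python 2's `unicode` and `bytes` the role of Python 2's `str`. The function name `encode_args` is hypothetical and this is not the wxGUI implementation:

```python
def encode_args(args, encoding):
    """Encode text arguments; leave arguments that are already bytes untouched."""
    return [a.encode(encoding) if isinstance(a, str) else a for a in args]


if __name__ == "__main__":
    mixed = ["d.rast", b"map=elev", "C:\\caf\u00e9"]
    print(encode_args(mixed, "cp1252"))
```

Skipping items that are already encoded avoids encoding the same value twice.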

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:14>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:14 glynn]:
> Replying to [comment:13 annakrat]:
> > So I tried to encode the command string and I get different error:
> >
> {{{
> raise err
> xml.etree.ElementTree.ParseError
> }}}
>
> > It seems that get_interface_description returns empty xml
>
> Did you confirm that?

No, when I print the string I get XML which seems to be valid:

{{{
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task SYSTEM "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\xml
\grass-interface.dtd">
<task name="test_workshopá.py">
         <description>
                 Adds the values of two rasters (A + B)
         </description>
...
}}}
I don't understand what's wrong with it.

>
> Otherwise, my guess is that the XML is invalid due to encoding issues.
>
> The program name is copied verbatim into the XML, in the <task
name="..."> tag.
>
> If GRASS was built with iconv support, the declared encoding of the XML
will be UTF-8; text nodes will be convert from the locale's encoding to
UTF-8 (and <,>,& will be converted to entities), but attribute values
aren't converted:
>
> {{{
> fprintf(stdout, "<task name=\"%s\">\n", st->pgm_name);
> }}}
>
> So, they need to be restricted to the intersection of the locale's
encoding and UTF-8 (which probably means ASCII).
>
> I'm not sure that it's worth trying to support script names which
contain non-ASCII characters. However, scripts in directories whose names
contain non-ASCII characters need to be supported. The same applies to
other files; e.g. we can reasonably restrict map, mapset and location
names to ASCII, but we should support the situation where the database
path contains non-ASCII characters.
>
> In any case, the GUI should be encoding the arguments which it passes to
Popen(); it shouldn't be passing unicode values.

Should the encoding be moved to `get_interface_description` in task.py?
The `EncodeString` function is in the GUI, not in the Python scripting library.

If I try to run the script (this time the script name is only ascii, but
the path has some non-ascii characters which are in cp1252), I get the gui
dialog and when I run it, I get an error:

{{{
Exception in thread Thread-28:
Traceback (most recent call last):
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\threading.py", line 810, in __bootstrap_inner
    self.run()
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gconsole.py", line 155, in run
    self.resultQ.put((requestId, self.requestCmd.run()))
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gcmd.py", line 575, in run
    env = self.env)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gcmd.py", line 161, in __init__
    args = map(EncodeString, args)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gcmd.py", line 92, in EncodeString
    return string.encode(_enc)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 38: ordinal not in range(128)
}}}

because in the Popen class in
[http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/core/gcmd.py#L161
gcmd.py] some of the arguments are of type `str` and some are `unicode`. So
if I encode only the unicode ones, it starts to work.

{{{
             # Encode only the unicode arguments; str values are already encoded
             for i in range(len(args)):
                 if not isinstance(args[i], str):
                     args[i] = EncodeString(args[i])
}}}

So I am not sure what I should do with these results.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:15>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:15 annakrat]:

> No, when I print the string I get xml, seems to be valid:
>
{{{
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task SYSTEM "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\xml
\grass-interface.dtd">
<task name="test_workshopá.py">
}}}
> I don't understand what's wrong with it.

The name= attribute will fail to decode due to not being valid UTF-8. The
"á" will be encoded in cp1252 (i.e. '\xe1'); attempting to decode that as
UTF-8 will fail (non-ASCII characters are encoded as multi-byte sequences;
an isolated byte >= 128 can never occur in UTF-8).
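
This can be reproduced with a minimal document. The sketch below (Python 3 syntax, standard library only; the helper name `parses` and the stripped-down XML are made up for illustration) declares UTF-8 and then puts the name into the attribute once as cp1252 bytes and once as UTF-8 bytes:

```python
import xml.etree.ElementTree as ET

HEADER = b'<?xml version="1.0" encoding="UTF-8"?>\n'


def parses(name_bytes):
    """Return True if a <task> element with this raw attribute value parses."""
    doc = HEADER + b'<task name="' + name_bytes + b'"></task>'
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    name = "test_workshop\u00e1.py"
    print(parses(name.encode("cp1252")))  # False: lone byte 0xe1 is not valid UTF-8
    print(parses(name.encode("utf-8")))   # True
```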

> > In any case, the GUI should be encoding the arguments which it passes
> > to Popen(); it shouldn't be passing unicode values.
>
> Should the encoding be moved to `get_interface_description` in task.py?

No. The GUI shouldn't be passing unicode values to the grass.script
library; it should be converting them to strings itself.

> The `EncodeString` function is in gui, not in python scripting library.

grass.script.core has encode() and decode().

> If I try to run the script (this time the script name is only ascii, but
> the path has some non-ascii characters which are in cp1252), I get the gui
> dialog and when I run it, I get an error:
>
{{{
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\gui\wxpython\core\gcmd.py", line 92, in EncodeString
    return string.encode(_enc)
  File "C:\Users\akratoc\Programs\GRASS GIS 7.1.svn\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 38: ordinal not in range(128)
}}}

Ugh. I couldn't figure out what was happening here until I read the next
sentence. It appears that str.encode() actually exists; it tries to
convert the string to unicode (using the default encoding) so that it can
encode it.
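
In Python 3, `bytes` has no encode() method, so the behaviour cannot be triggered directly there, but the two steps that Python 2 performs can be written out explicitly. A sketch (the function name is made up for illustration):

```python
def py2_style_encode(data, target="cp1252", default="ascii"):
    """Mimic Python 2's str.encode(): implicit decode first, then encode."""
    text = data.decode(default)  # step 1: implicit decode with the default encoding
    return text.encode(target)   # step 2: the encode that was actually requested


if __name__ == "__main__":
    print(py2_style_encode(b"C:\\scripts"))
    try:
        py2_style_encode(b"C:\\caf\xe1")
    except UnicodeDecodeError as exc:
        # An encode call fails with a *decode* error, as in the traceback above.
        print(type(exc).__name__)
```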

> because in Popen class in
> [http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/core/gcmd.py#L161
> gcmd.py] some of the arguments are of type `str`, some are `unicode`. So
> if encode only the unicode ones, it starts to work.

That makes sense. But the encoding should ideally be done at a higher
level, at the point that wxGUI "knows" that it's dealing with a unicode
value.

This is the main reason why I dislike dynamically-typed languages for
large-scale projects (I'd never have suggested Python if I'd known
that wxGUI was going to turn into such a behemoth). In C/C++, you'd just
get a compile error if you passed a wchar_t*/std::wstring where a
char*/std::string was expected. In Python, you get something which appears
to work until it starts getting decent test coverage.

I'm wondering if sys.setdefaultencoding("EBCDIC-CP-BE") would work ...

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:16>
GRASS GIS <http://grass.osgeo.org>


Comment(by annakrat):

Replying to [comment:16 glynn]:
> Replying to [comment:15 annakrat]:
>
> > No, when I print the string I get xml, seems to be valid:
> >
> {{{
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE task SYSTEM "C:\Users\akratoc\Programs\GRASS GIS
7.1.svn\gui\xml\grass-interface.dtd">
> <task name="test_workshopá.py">
> }}}
> > I don't understand what's wrong with it.
>
> The name= attribute will fail to decode due to not being valid UTF-8.
The "á" will be encoded in cp1252 (i.e. '\xe1'); attempting to decode that
as UTF-8 will fail (non-ASCII characters are encoded as multi-byte
sequences; an isolated byte >= 128 can never occur in UTF-8).

I take it that we are supporting only ascii characters in the script name.
>
> > > In any case, the GUI should be encoding the arguments which it
passes to Popen(); it shouldn't be passing unicode values.
> >
> > Should the encoding be moved to `get_interface_description` in
> > task.py?
>
> No. The GUI shouldn't be passing unicode values to the grass.script
library; it should be converting them to strings itself.

Ok.
>
> > The `EncodeString` function is in gui, not in python scripting
library.
>
> grass.script.core has encode() and decode().
>
> > If I try to run the script (this time the script name is only ascii,
but the path has some non-ascii characters which are in cp1252), I get the
gui dialog and when I run it, I get an error:
>
> Ugh. I couldn't figure out what was happening here until I read the next
sentence. It appears that str.encode() actually exists; it tries to
convert the string to unicode (using the default encoding) so that it can
encode it.
>
> > because in Popen class in
[http://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/core/gcmd.py#L161
gcmd.py] some of the arguments are of type `str`, some are `unicode`. So
if encode only the unicode ones, it starts to work.
>
> That makes sense. But the encoding should ideally be done at a higher
level, at the point that wxGUI "knows" that it's dealing with a unicode
value.

I am not sure where the higher level is and why str and unicode are mixed
in this case.
>
>
> I'm wondering if sys.setdefaultencoding("EBCDIC-CP-BE") would work ...

Why would it? Is it easy to test?

Anyway, I think that whatever we do shouldn't go into the current release. I
already fixed the important part (it works with ASCII-only paths) and I don't
want to make things worse.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:17>
GRASS GIS <http://grass.osgeo.org>


Comment(by glynn):

Replying to [comment:17 annakrat]:

> > That makes sense. But the encoding should ideally be done at a higher
> > level, at the point that wxGUI "knows" that it's dealing with a unicode
> > value.
>
> I am not sure where the higher level is and why str and unicode are
> mixed in this case.

Unicode values typically come from wxWidgets, e.g. any text retrieved from
a text field will be a unicode object.

> > I'm wondering if sys.setdefaultencoding("EBCDIC-CP-BE") would work ...
>
> Why would it? Is it easy to test?

Sorry, that was really just thinking out loud. It wouldn't fix anything,
it would just highlight any remaining implicit conversions.

EBCDIC (used on IBM mainframes) is one of the few encodings which
*isn't* compatible (or even mostly-compatible) with ASCII. Setting
the default encoding to EBCDIC would make it obvious when implicit
str<->unicode conversions were being performed, because the results would
be completely wrong (e.g. even A-Z/a-z don't have the same codepoints as
ASCII).
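
The incompatibility is easy to see with one of the EBCDIC codecs that ships with Python. The sketch below assumes `cp500` as a representative EBCDIC code page; the helper name is made up for illustration.

```python
def differs_from_ascii(text, encoding="cp500"):
    """True if the encoded bytes differ from the plain ASCII bytes."""
    return text.encode(encoding) != text.encode("ascii")


if __name__ == "__main__":
    print(differs_from_ascii("A"))            # True
    print("A".encode("cp500") == b"\xc1")     # True: 'A' is 0xC1 in EBCDIC, 0x41 in ASCII
    print(differs_from_ascii("A", "cp1252"))  # False: cp1252 agrees with ASCII here
```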

The default encoding can only be set in site.py; site.py deletes the
setdefaultencoding() function from the sys module to prevent the default
encoding from being changed after start-up.
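
Whether the function is still available can be checked from a running interpreter. A minimal sketch (the function name is made up; Python 3 has no `sys.setdefaultencoding()` at all, so the check reports the encoding as locked there too):

```python
import sys


def default_encoding_is_locked():
    """True if sys.setdefaultencoding() is not available in this interpreter."""
    return not hasattr(sys, "setdefaultencoding")


if __name__ == "__main__":
    print(default_encoding_is_locked(), sys.getdefaultencoding())
```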

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2532#comment:18>
GRASS GIS <http://grass.osgeo.org>