[GRASS-dev] man pages in UTF-8

hi,

for Fedora and other distros UTF-8 encoding of manual pages is required.
How about changing all HTML files to UTF-8 (I can do that)?
Any side effects to be expected?

Markus

Markus Neteler wrote:

> for Fedora and other distros UTF-8 encoding of manual pages is required.
> How about changing all HTML files to UTF-8 (I can do that)?
> Any side effects to be expected?

It would be better to change the HTML source files to use entities
rather than any particular encoding. Currently, they use a mix of
ISO-8859-1 and UTF-8 (those which use UTF-8 won't show correctly,
because the files are treated as being in ISO-8859-1). In 7.0, the
only <module>.html files containing non-ASCII characters are:

  i.evapo.pt
  i.landsat.acca
  r.external.out
  r.sun
  r.walk

No modules appear to use non-ASCII characters in their
--html-description output (at least, not for the "C" locale).
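
A minimal sketch of the entity conversion suggested above, assuming each
source file is either valid UTF-8 or ISO-8859-1. The function names are
hypothetical and not part of the GRASS build tools. A short ISO-8859-1
file could in principle also be valid UTF-8, so the results are worth
checking by hand for the five files listed.

```python
import html.entities


def decode_html_bytes(data: bytes) -> str:
    """Decode as UTF-8 when valid, otherwise fall back to ISO-8859-1."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        return data.decode("iso-8859-1")


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with a named or numeric entity."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Prefer an HTML 4 named entity; otherwise use a decimal reference.
            name = html.entities.codepoint2name.get(ord(ch))
            out.append(f"&{name};" if name else f"&#{ord(ch)};")
    return "".join(out)


def convert(data: bytes) -> bytes:
    """Return a pure-ASCII version of an HTML file's contents."""
    return to_entities(decode_html_bytes(data)).encode("ascii")
```

After this step the files are plain ASCII, so the charset declared in
the generated page no longer affects how they are displayed.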

mkhtml.py just copies the bytes verbatim, but it adds a "meta" tag to
the output indicating that the data is in ISO-8859-1:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
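
To illustrate why the UTF-8 files don't show correctly under that
declaration, here is a small sketch (the sample string is made up) of
what a browser does when it treats UTF-8 bytes as ISO-8859-1:

```python
# U+00E9 (e with acute accent) occupies two bytes in UTF-8.
text = "caf\u00e9"
utf8_bytes = text.encode("utf-8")

# A reader that trusts the iso-8859-1 declaration decodes each byte
# as a separate character.
shown = utf8_bytes.decode("iso-8859-1")

print(len(text), len(shown))
print(shown == "caf\u00c3\u00a9")
```

The single accented character comes out as two unrelated Latin-1
characters, which is the garbling described above.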

g.html2man will need an option to select the output encoding (UTF-8 is
a GNU groff extension), and will need to convert the output to that
encoding; for UTF-8, it needs to add a byte order mark to the file so that
preconv recognises it as UTF-8.
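
A rough sketch of that output step, assuming the manual page text is
already held as a Python string. encode_man_page and its default are
hypothetical; this is not g.html2man's actual interface.

```python
import codecs


def encode_man_page(text: str, encoding: str = "iso-8859-1") -> bytes:
    """Encode manual page text, prefixing a BOM when the target is UTF-8."""
    # Raises UnicodeEncodeError if the text does not fit the encoding.
    data = text.encode(encoding)
    # codecs.lookup() normalises aliases such as "UTF8" or "utf_8".
    if codecs.lookup(encoding).name == "utf-8":
        data = codecs.BOM_UTF8 + data
    return data
```

Python's "utf-8-sig" codec would also write the byte order mark; the
explicit form just makes the step visible.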

The changes will be simpler if the input is in ASCII or ISO-8859-1.
They will be more complex if HTML files are allowed to use characters
outside of the Latin-1 repertoire (currently, this only affects
i.atcorr, which uses "&lambda;", which ends up as "&#955;" in the
manual page).
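
A small sketch of how one might check whether an HTML fragment stays
within the Latin-1 repertoire once its entities are resolved. fits_latin1
is a hypothetical helper, not an existing function.

```python
import html


def fits_latin1(html_text: str) -> bool:
    """True if the text, with entities resolved, is encodable in ISO-8859-1."""
    resolved = html.unescape(html_text)
    try:
        resolved.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


# "&lambda;" resolves to U+03BB, code point 955, which is outside Latin-1.
print(ord(html.unescape("&lambda;")))
print(fits_latin1("&lambda;"), fits_latin1("&eacute;"))
```

Files for which this returns False are the ones that would need the
more complex handling.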

--
Glynn Clements <glynn@gclements.plus.com>

Hi,

2012/3/7 Glynn Clements <glynn@gclements.plus.com>:

> The changes will be simpler if the input is in ASCII or ISO-8859-1.

Well, what about supporting a multilingual help system in G7, with
manual pages in different languages? Then ASCII probably wouldn't be
enough ;-)

Martin

--
Martin Landa <landa.martin gmail.com> * http://geo.fsv.cvut.cz/~landa