[Geoserver-devel] Translations [was] Re: i18n : File encoding for Java properties files

Hey, so at OpenGeo we’ve been having some good experiences with Transifex, using it for our GeoNode project.

We’re still in a pretty sad state of translations relative to 1.x (I think we had 7 or 8 by the time we closed out 1.7.x, and we’re at 4 right now in 2.x), and I think the type of tooling offered by Transifex could greatly improve our coverage.

I was about to dig into setting it up when I noticed from this thread that Frank has apparently already done so. See https://www.transifex.com/projects/p/geoserver_22x/

Frank, could you add me as an admin for the geoserver project? I’m https://www.transifex.com/accounts/profile/cholmes/

I think the most important step is to update the documentation - http://docs.geoserver.org/stable/en/developer/translation.html - to tell users to just use transifex. And then we need to explain how to go from transifex resources to pull requests (Jeff says the transifex command-line tools can help with this a lot, making git pull requests for you).

Then a nice blog post calling for translators. I think with that we could get a lot of people translating if we set this up right.

On Sun, Jan 22, 2012 at 1:57 PM, Andrea Aime <andrea.aime@anonymised.com> wrote:

On Sun, Jan 22, 2012 at 8:32 PM, Frank Gasdorf <fgdrf@anonymised.com> wrote:

Hello List,

I’d like to discuss how to handle properties files for GeoServer.

In the past, Christian
(http://www.mail-archive.com/geoserver-devel@lists.sourceforge.net/msg12071.html)
already started a discussion about UTF-8 vs. some other encodings.
While working on German translations over the last weeks and
months, I have been in close dialogue with translation specialists
from transifex.net.

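Summarizing this communication:
- The standard expected encoding for Java properties file readers and
writers is ISO 8859-1 (http://en.wikipedia.org/wiki/ISO/IEC_8859-1);
see the javadoc for 1.4
(http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Properties.html)
and 6
(http://docs.oracle.com/javase/6/docs/api/java/util/Properties.html).
- “Characters that cannot be directly represented in this encoding can
be written using Unicode escapes” (\uXXXX).

I suggest switching over to the ISO standard encoding for properties
files. I guess Eclipse also uses ISO 8859-1 by default when the
developer uses the “Externalize Strings” action. IMHO it would be
easier to work with third-party tools like transifex, which explicitly
work to the Java standards.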

BTW, if all characters are encoded by \uXXXX sequences, everything
would be fine in the future.
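
To illustrate the point about \uXXXX sequences, here is a minimal Python sketch of the kind of conversion that turns a translated string into pure ASCII. The function name `escape_for_properties` is hypothetical and not part of any GeoServer or Transifex tooling. It only handles non-ASCII characters and does not escape the other characters that are special in properties files (backslash, `=`, `:`).

```python
def escape_for_properties(text: str) -> str:
    """Return text with every non-ASCII character written as \\uXXXX."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Java escapes are UTF-16 code units, so characters outside the
            # Basic Multilingual Plane become a surrogate pair.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


print(escape_for_properties("Descripción"))
```

A file written this way contains only ASCII bytes, so it reads the same whether a tool assumes ISO 8859-1, UTF-8, or plain ASCII.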

As far as I know, writing the property files for some languages (Japanese,
Chinese) in ISO 8859-1 with escape codes can be really hard,
but afaik in those cases the XML format for property files is the
preferred one.

I’m not up to speed with what is going on with translations, is
everyone doing translations using this transifex.net site?

As for coding, the coders only add strings to the main (English)
property file afaik (well, maybe Christian adds to the German one
as well, not sure), and we do so manually. The Eclipse code to externalize
the strings is of little help: it does not work with our customized Wicket
i18n subclasses (nor would it work with the standard ones afaik).

Anyways, I don’t mind about the encoding, both work for me, provided
that we reach an agreement that works for all translators.

Speaking of which… who’s doing translations and how?
As far as I can see the devs are doing the initial work in English,
then there is you and Ives that cover German and French respectively?
Anyone else?

Cheers
Andrea

Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf





Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Ok, Frank’s given me permissions and am playing around with this. Am quickly hitting the limits of my translation / localization knowledge. I have a question, which may in fact be what Frank was originally getting at in this thread.

I am using transifex and getting the files from there, and when I do a diff on what it produces I get a lot of changes, most of them more or less like:

-AbstractCoverageStorePage.description = Descripci\u00F3n
+AbstractCoverageStorePage.description = Descripción

I think this is because the transifex output is ISO-8859-1 with the raw character, instead of plain ASCII with a \uXXXX escape.

So my questions

  • Will this style work in GeoServer admin?

If yes,

  • How would we feel about switching over to this style?

If no,

  • Frank, did you figure out a way for transifex to get it in to GeoServer’s style?

On switching over, the argument in favor of this is that doing so would let even users contribute to GeoServer translations, with no knowledge of code. And then it’d just take a few minutes from a developer to create a git pull request, and then even less time for a committer to review and pull it in.


Hi,

Good news, transifex will help a lot to keep up with translations.

With a time-boxed release schedule, a time window for translations could
be planned, after feature freeze and before final release.

We need a systematic way to map files in github with resources in transifex.

I prepared a script that uses the tx CLI to generate the mapping automatically.
Resource names on transifex are calculated from the properties file
locations in the source repo:

https://gist.github.com/3918745

Resulting transifex repo looks like this (26 translation files
detected; 5 languages):
https://www.transifex.com/projects/p/geoserver_test/
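
As a rough illustration of deriving a resource name from a file location, here is a short Python sketch. It does not reproduce the gist: the function `resource_slug`, the directory names it filters out, and the example path are all assumptions made for illustration only.

```python
from pathlib import PurePosixPath

# Directory names assumed to carry no module information (hypothetical choice).
_IGNORED = ("src", "main", "java", "resources")


def resource_slug(properties_path: str) -> str:
    """Build a hypothetical resource slug from a repo-relative path."""
    path = PurePosixPath(properties_path)
    # Drop the file name and keep only the directories that identify the module.
    parts = [p for p in path.parent.parts if p not in _IGNORED]
    return "-".join(parts).lower()


# Hypothetical path, shown only to make the mapping concrete.
print(resource_slug("web/core/src/main/resources/GeoServerApplication.properties"))
```

Deriving the name from the path keeps the mapping deterministic, so re-running such a script on a fresh checkout would point at the same resources.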

Regarding character encoding:

I am using transifex and getting the files from there, and when I do a diff
on what it produces I get a lot of changes, most of them more or less like: [...]
I _think_ this is because it is ISO-8859 instead of ASCII English.

The official encoding for Java properties files is ISO-8859-1.
Characters not supported by ISO-8859-1 are represented as escaped Unicode (\uXXXX).
Non-ASCII Latin characters can be represented both ways.
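
The "both ways" point can be checked with a small Python sketch that mimics the two decoding steps described above: read the bytes as ISO-8859-1, then expand \uXXXX escapes. The helper `decode_properties_value` is hypothetical and is not a full properties parser (it ignores escaped backslashes, line continuations, and surrogate pairs).

```python
import re

_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_properties_value(raw: bytes) -> str:
    """Decode bytes as ISO-8859-1, then expand \\uXXXX escapes."""
    text = raw.decode("iso-8859-1")
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


escaped = b"Descripci\\u00F3n"   # pure ASCII, using an escape sequence
direct = b"Descripci\xf3n"       # single ISO-8859-1 byte 0xF3

print(decode_properties_value(escaped) == decode_properties_value(direct))
```

Under these assumptions the two lines from the diff carry the same value, so the change is one of representation rather than of content.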

I just built GeoServer with the transifex-modified files, and the Spanish
translation looks good.

Oscar.


On Mon, Oct 22, 2012 at 6:32 AM, Oscar Fonts <oscar.fonts@anonymised.com> wrote:

Hi,

Good news, transifex will help a lot to keep up with translations.

With a time-boxed release schedule, a time window for translations could
be planned, after feature freeze and before final release.

We need a systematic way to map files in github with resources in transifex.

I prepared a script that uses the tx CLI to generate the mapping automatically.
Resource names on transifex are calculated from the properties file
locations in the source repo:

https://gist.github.com/3918745

Resulting transifex repo looks like this (26 translation files
detected; 5 languages):
https://www.transifex.com/projects/p/geoserver_test/

Ah cool. I started on this but didn’t get as far.

Do you think I’ll be able to use the same script to set up my tx cli to properly map to the geoserver_test project? I’ll try it out, but I think that’s the goal. I was trying to do it with the projects Frank set up, but it seemed like the commands were different for setting up a new project versus just mapping an existing code repo to an existing code project.

Should I be able to just run that script on a geoserver checkout and then have it wired to the geoserver_test project? And be able to pull down updates from the transifex server? I’ll try it out, but just want to be sure that’s the intention.

Also, what are your thoughts on bringing over some of the translations from the geoserver_22x transifex? There’s some progress there on Russian, Hungarian and Norwegian. I suppose we could just create those in the new repo and move the files over.

Regarding character encoding:

I am using transifex and getting the files from there, and when I do a diff
on what it produces I get a lot of changes, most of them more or less like: […]
I think this is because it is ISO-8859 instead of ASCII English.

The official encoding for Java properties files is ISO-8859-1.
Characters not supported by ISO-8859-1 are represented as escaped Unicode (\uXXXX).
Non-ASCII Latin characters can be represented both ways.

I just built GeoServer with the transifex-modified files, and the Spanish
translation looks good.

Really? Gabriel and I didn’t successfully do this, and it looked like there might be some problems with how GeoServer reads them in. But perhaps we were doing something else wrong. https://github.com/cholmes/geoserver/commit/182849f4d5216460ad55c264a44e4232f56232f2 is the one that I made with transifex and we were trying to test.

We’d probably want to do one commit that changes all language files over to the transifex defaults, no? That is, a single commit that switches all of them to ISO-8859-1?


2012/10/22 Chris Holmes <cholmes@anonymised.com>:

Should I be able to just run that script on a geoserver checkout and then
have it wired to the geoserver_test project? And be able to pull down
updates from the transifex server? I'll try it out, but just want to be sure
that's the intention.

Yes, that's the intention.

Also what are your thoughts on bringing over some of the translations from
the geoserver_22x transifex? I suppose we could just create those in the new
repo and move the files over.

Sure. Gave you access to geoserver_test if you want to play.

I understand that the definitive place for translations will still be
github, periodically pulling from transifex, right?

Regarding character encoding:

Really? Gabriel and I didn't successfully do this, and it looked like there
might be some problems with how GeoServer reads them in.

You are right, it works when running jetty from inside Eclipse, but
the web-core translation file is corrupted when building with maven.

After a bit of investigation, the problem is with the
maven-antrun-plugin in web/core/pom.xml.
You have to declare the resource files encoding as ISO-8859-1. See commit:
https://github.com/oscarfonts/geoserver/commit/b172716e86536ddfad8485a55eb479be8d74e58e
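
One way this kind of corruption can arise is sketched below in Python. This assumes the build step re-reads the ISO-8859-1 file with a different charset (UTF-8 here); the thread does not say which charset the plugin actually used, so treat this as an illustration and not a diagnosis.

```python
# Bytes as they would be stored in an ISO-8859-1 properties file.
original = "Descripción".encode("iso-8859-1")

# Reading with the declared encoding recovers the text.
print(original.decode("iso-8859-1"))

# Reading the same bytes as UTF-8 cannot interpret byte 0xF3, so the
# accented character is replaced and the value is damaged.
print(original.decode("utf-8", errors="replace"))
```

Declaring the encoding explicitly removes the dependence on whatever default the build machine happens to use.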

--
Oscar.

On Tue, Oct 23, 2012 at 6:56 AM, Oscar Fonts <oscar.fonts@anonymised.com> wrote:

2012/10/22 Chris Holmes <cholmes@anonymised.com>:

Should I be able to just run that script on a geoserver checkout and then
have it wired to the geoserver_test project? And be able to pull down
updates from the transifex server? I’ll try it out, but just want to be sure
that’s the intention.

Yes, that’s the intention.

Awesome, I’ll try it out this week.

Also what are your thoughts on bringing over some of the translations from
the geoserver_22x transifex? I suppose we could just create those in the new
repo and move the files over.

Sure. Gave you access to geoserver_test if you want to play.

I understand that the definitive place for translations will still be
github, periodically pulling from transifex, right?

Yup. And I’d like us to improve the GS developer docs to make pulling from transifex as easy as possible. Perhaps try to recruit a ‘localization lead’ who can pull in periodically and commit. And also let people who do translations notify that person that they finished up a language.

Regarding character encoding:

Really? Gabriel and I didn’t successfully do this, and it looked like there
might be some problems with how GeoServer reads them in.

You are right, it works when running jetty from inside Eclipse, but
the web-core translation file is corrupted when building with maven.

After a bit of investigation, the problem is with the
maven-antrun-plugin in web/core/pom.xml.
You have to declare the resource files encoding as ISO-8859-1. See commit:
https://github.com/oscarfonts/geoserver/commit/b172716e86536ddfad8485a55eb479be8d74e58e

Oh awesome. I’ll try to test with Gabriel to confirm.

