[Geoserver-users] Using weblate in replacement of transifex - First feedbacks

Hi all,

I am studying whether we could migrate the translation tooling from Transifex to Weblate. I started this because, with the current setup, Transifex changes many translations whenever I upload an updated translation source, which makes it difficult to keep GitHub and Transifex synchronized.

Weblate is copyleft libre software, and OSGeo hosts its own instance, which is already used by several OSGeo projects (at least PostGIS, pgRouting and GRASS GIS).

Thanks to Regina Obe, I have set up a GeoServer project on the OSGeo instance to study how Weblate works and whether anything would prevent us from using it.

I already have two points to share with you to get some feedback:

  • First, when you configure a component in Weblate, you cannot have two files for the same language, even if they use different encodings. As a consequence, I cannot directly integrate most of the core components, since they contain two files for the Chinese language. Is this something that can be changed? Which one is used by GeoServer?
  • Second, when you change the translation of a text in Weblate, it automatically replaces special characters with their Unicode escape, even if the character exists in the ISO-8859-1 encoding. For example:

org.geoserver.security.GeoServerAuthenticationKeyFilter.name=Clé d'authentification
is replaced by
org.geoserver.security.GeoServerAuthenticationKeyFilter.name=Cl\u00E9 d'authentification

(my own change in the translation was to add a space at the end of the string, to match the original layout of the source string)

From a technical point of view, it does not break anything, but it would make it harder to work on a translation without using Weblate.
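To illustrate why the rewrite is harmless at load time, here is a minimal Python sketch (not GeoServer or Weblate code). It decodes both forms of the value the way a .properties reader is expected to: the bytes are read as ISO-8859-1, then \uXXXX escapes are expanded. The helper name decode_value is hypothetical, and it deliberately ignores the other escapes of the .properties format.

```python
import re

def decode_value(raw: bytes) -> str:
    """Decode a .properties value: ISO-8859-1 bytes, then \\uXXXX escapes."""
    text = raw.decode("iso-8859-1")
    # Expand each \uXXXX escape into the character it names.
    return re.sub(r"\\u([0-9a-fA-F]{4})",
                  lambda m: chr(int(m.group(1), 16)), text)

literal = "Clé d'authentification".encode("iso-8859-1")  # stored with the raw character
escaped = b"Cl\\u00E9 d'authentification"                  # stored with the \u escape

# Both byte sequences decode to the same string.
print(decode_value(literal) == decode_value(escaped))
```

The two files differ byte for byte, which is what makes diffs and manual editing harder, but the loaded strings are identical.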

I will continue testing the integration with GitHub and will let you know the results.

Thank you for your feedback!

Alexandre Gacon

Thanks for the experiment :)

You may want to chat about this on the geoserver-devel list as it is about care and feeding of the codebase.

I did not understand “have two different items for the same language”. Do you have an example you can link to?

Jody

Jody Garnett

You might be interested to know that JDK 8 and lower ship a tool, native2ascii, that converts in both directions between \u Unicode escapes and actual ISO-8859-1 characters.

In JDK 9+, the tool has been removed because JDK 9 first tries to read properties files as UTF-8, then falls back to ISO-8859-1 if that fails. So \u escapes are guaranteed to work in both JDK 8 and lower and JDK 9+. Raw accented characters will probably work, but this is not guaranteed (they might accidentally form a valid UTF-8 sequence).
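The "accidentally valid UTF-8" case can be shown with a small Python sketch. It is a simplification of the behaviour described above, not the JDK implementation, and decode_properties_bytes is a hypothetical helper name.

```python
def decode_properties_bytes(raw: bytes) -> str:
    """Try UTF-8 first, then fall back to ISO-8859-1 (hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")

# "Clé" saved as ISO-8859-1: the lone 0xE9 byte is invalid UTF-8, so the fallback applies.
print(decode_properties_bytes("Clé".encode("iso-8859-1")))

# "Ã©" saved as ISO-8859-1 is the byte pair C3 A9, which is also valid UTF-8 for "é",
# so the UTF-8 attempt succeeds and the text is silently read as a different string.
print(decode_properties_bytes("Ã©".encode("iso-8859-1")))
```

A file that contains only ASCII bytes, including \u escapes, decodes identically under both encodings, which is why the escaped form is the safe common denominator.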

My personal experience: the Maven resources plugin defaults to the platform encoding (e.g. when filtering is enabled), so if you don’t use \u escapes, there is a chance that your build randomly breaks on hosts with a UTF-8 native encoding.

See

https://docs.oracle.com/en/java/javase/17/migrate/removed-tools-and-components.html#GUID-B49A964D-A2EF-4DAF-8A71-A64EF3E77C00 (bottom)

https://docs.oracle.com/en/java/javase/17/intl/internationalization-enhancements1.html#GUID-9DCDB41C-A989-4220-8140-DBFB844A0FCA (bottom)

Hans

Geoserver-users mailing list

Please make sure you read the following two resources before posting to this list:

If you want to request a feature or an improvement, also see this: https://github.com/geoserver/geoserver/wiki/Successfully-requesting-and-integrating-new-features-and-improvements-in-GeoServer

Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

Hi Jody,

I will subscribe to the devel list to extend the discussion there.

Regarding the two different items for the same language, Weblate complains about Chinese: for example, in https://github.com/geoserver/geoserver/tree/main/src/web/core/src/main/resources there is one file in ISO encoding and one file in UTF-8 encoding, both for ZH (the UTF-8 one is for ZH_CN, but CN is the default country code for the ZH language).
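To find which components are affected, a quick Python sketch could list the languages that have more than one file. The file names below are hypothetical, modelled on the layout described above. Note that the sketch treats every country variant as a clash, which is stricter than Weblate; Weblate's actual language-matching rules are not reproduced here.

```python
import re
from collections import defaultdict

# Matches e.g. "GeoServerApplication_zh_CN.properties": base, language, optional country.
PATTERN = re.compile(
    r"^(?P<base>.+?)_(?P<lang>[a-z]{2})(?:_(?P<country>[A-Z]{2}))?\.properties$"
)

def files_per_language(filenames):
    """Group translation files by (base name, language code), ignoring the country."""
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            groups[(match["base"], match["lang"])].append(name)
    return groups

def conflicts(filenames):
    """Return only the groups where one language maps to several files."""
    return {key: names
            for key, names in files_per_language(filenames).items()
            if len(names) > 1}

# Hypothetical file list for one component.
example = [
    "GeoServerApplication.properties",
    "GeoServerApplication_fr.properties",
    "GeoServerApplication_zh.properties",
    "GeoServerApplication_zh_CN.properties",
]
print(conflicts(example))
```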

Regards
Alexandre

Alexandre Gacon

Thank you, Hans, for pointing me to the native2ascii tool, but I would like to have something more straightforward between GitHub and the translation solution.

As for characters that are not \u-escaped, I understand your point, but many of the current translations (the oldest and the most complete ones) still contain them.

Regards
Alexandre

Alexandre Gacon

Hello,

My proposal would be:

  • Do a bulk conversion with native2ascii, so that all .properties files are ASCII.
    On Linux this would be something along the lines of the following (I did not test this, and the rm may be destructive, so I presume you can revert through source control):

find . -name "*.properties" -exec mv {} {}.old ";" -exec native2ascii {}.old {} ";" -exec rm {}.old ";"

  • If it is not already set, you can now set <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  • Use a Maven plugin to enforce that *.properties files are ASCII-only, e.g. the Maven Enforcer plugin with the third-party requireEncoding rule [1]

This is a one-shot migration to the same standard that Weblate supports, while remaining compatible with both JDK 8 and lower and JDK 9+. Maven will enforce that this standard is followed. As it is a mechanical conversion, the chances of breakage are minimal.
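For readers without a JDK 8 installation, the two mechanical steps above (escape everything non-ASCII, then check that the result is ASCII-only) can be sketched in Python. This is an illustration under stated assumptions, not a replacement for native2ascii or the enforcer rule: the function names are hypothetical, and the caller must decode each file with its real encoding before escaping.

```python
def to_ascii_escapes(text: str) -> str:
    """Replace every non-ASCII character with Java-style \\uXXXX escapes."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Encode as UTF-16 so characters outside the BMP become surrogate pairs,
            # which is how Java represents them in \uXXXX form.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)

def is_ascii_only(data: bytes) -> bool:
    """True when the bytes are pure ASCII, the property the build would enforce."""
    return all(b < 128 for b in data)

line = "org.geoserver.security.GeoServerAuthenticationKeyFilter.name=Clé d'authentification"
converted = to_ascii_escapes(line)
print(converted)
print(is_ascii_only(converted.encode("ascii")))
```

Existing \u escapes are plain ASCII, so the function leaves them untouched and running it twice gives the same result.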

Result for your concerns:

  • More straightforward → you only have to use the tool once
  • Mixed standard → not anymore after the bulk conversion and enforcement
  • \u escapes are less readable → Correct, unfortunately, but raw characters are only more readable if your IDE guesses the encoding right, and only for Western European languages where ISO-8859-1 is the ‘correct’ local encoding. My personal experience with Dutch/French/German is that you quickly learn what each relevant \u escape is

[1] https://www.mojohaus.org/extra-enforcer-rules/requireEncoding.html

Hans

Thank you, Hans, for your clear explanation. We have to check with the developers whether such changes are OK for them.

Alexandre

Alexandre Gacon