I am studying whether we could migrate the translation tooling from Transifex to Weblate. I started this because, with the current setup, Transifex changes a lot of translations whenever I upload updates of the translation source, which makes it difficult to synchronize GitHub and Transifex.
Weblate is copyleft libre software, and OSGeo hosts its own instance, already used by several OSGeo projects (PostGIS, pgRouting and GRASS GIS at least).
Thanks to Regina Obe, I have set up a GeoServer project on the OSGeo instance to study how Weblate works and whether anything would prevent us from using it.
I already have two points to share with you to get some feedback:
First, when you configure a component in Weblate, you cannot have two items for the same language, even if they use different encodings. As a consequence, I cannot directly integrate most of the core components, since they contain two files for the Chinese language. Is this something that can be changed? Which one is used by GeoServer?
Second, when you change the translation of a text in Weblate, it automatically replaces special characters with their Unicode (\u) escapes, even if the character exists in the ISO-8859-1 encoding. For example:
You might be interested to know that JDK8 and lower have a tool, native2ascii, that does bidirectional conversion between \u Unicode escapes and actual ISO-8859-1 characters.
For JDK9+, this tool is deprecated/removed because JDK9 first tries to read the file as UTF-8, then falls back to ISO-8859-1 if that fails. So \u is guaranteed to work in both JDK8- and JDK9+. Accented characters will probably work, but it is not guaranteed (they might accidentally form a valid UTF-8 sequence).
My personal experience: the Maven resources plugin defaults to the platform encoding, for example when filtering is enabled, so if you don't use \u there is a chance that your build randomly breaks on hosts with a UTF-8 native encoding.
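To make the decoding point concrete, here is a minimal Python sketch of the "try UTF-8, then fall back to ISO-8859-1" strategy as described above. It is not the JDK's code; `decode_properties_bytes` and the sample byte strings are hypothetical names chosen for illustration only.

```python
def decode_properties_bytes(raw: bytes) -> str:
    """Sketch of the strategy described above: UTF-8 first, ISO-8859-1 as fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")


# A lone e-acute stored as ISO-8859-1 is the single byte 0xE9.
# That is not valid UTF-8, so the fallback applies and the text survives.
lone_accent = "\u00e9".encode("iso-8859-1")

# The two ISO-8859-1 characters U+00C3 and U+00A9 are stored as 0xC3 0xA9,
# which happens to be the valid UTF-8 encoding of e-acute: the text is misread.
ambiguous_pair = "\u00c3\u00a9".encode("iso-8859-1")

# A \u escape is plain ASCII, so both encodings read the same six characters.
ascii_escape = b"\\u00e9"

for label, raw in [("lone accent", lone_accent),
                   ("ambiguous pair", ambiguous_pair),
                   ("ascii escape", ascii_escape)]:
    print(label, raw, repr(decode_properties_bytes(raw)))
```

The middle case is the "accidentally valid UTF-8" situation: the bytes decode without error, but to a different string than the one the translator wrote.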
From: Jody Garnett <jody.garnett@…84…> Sent: Tuesday 9 August 2022 15:53 To: Alexandre Gacon <alexandre.gacon@…84…> Cc: geoserver-users geoserver-users@lists.sourceforge.net Subject: Re: [Geoserver-users] Using weblate in replacement of transifex - First feedbacks
Thanks for the experiment
You may want to chat about this on the geoserver-devel list as it is about care and feeding of the codebase.
I did not understand “have two different items for the same language”, do you have an example you can link to?
I will subscribe to the devel list to extend the discussion there.
Regarding the two different items for the same language, Weblate is complaining about the Chinese language: for example, in https://github.com/geoserver/geoserver/tree/main/src/web/core/src/main/resources, you have one file in ISO encoding and one file in UTF-8 encoding, both for ZH (the UTF-8 one is for ZH_CN, but CN is the default country code for the ZH language).
Thank you Hans for pointing me to the native2ascii tool, but I would like to have something more straightforward between GitHub and the translation solution.
Regarding characters that are not \u-encoded, I understand your point, but currently a lot of the translations (the oldest ones and the most complete ones) still contain them.
Do a bulk conversion with native2ascii, so that all .properties files are ASCII.
On Linux this would be something along the lines of (presuming you can revert in source control, since the rm might be destructive and I did not test this):
If it is not already set, you can now set <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
Use a Maven plugin to enforce that *.properties is ASCII-only, e.g. the Maven Enforcer Plugin with the third-party plugin for required encoding [1]
This is a one-shot migration to the same standard that Weblate supports, while remaining JDK8- and JDK9+ compliant. Maven will enforce that this standard is followed. As it is a mechanical conversion, chances of breakage are minimal.
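The steps above can be sketched in Python for readers without a JDK8 install at hand. This is a stand-in for illustration, not native2ascii itself and not the command from the original message. The function names are hypothetical, and it assumes each file is ISO-8859-1 unless another `source_encoding` is passed (the UTF-8 Chinese file would need `source_encoding="utf-8"`).

```python
from pathlib import Path


def to_ascii_escapes(text: str) -> str:
    """Replace every non-ASCII character with Java-style \\uXXXX escapes."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                      # ASCII passes through untouched
        elif code <= 0xFFFF:
            out.append(f"\\u{code:04x}")
        else:
            # Java escapes are UTF-16 code units, so characters beyond the
            # Basic Multilingual Plane are written as a surrogate pair.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            out.append(f"\\u{high:04x}\\u{low:04x}")
    return "".join(out)


def convert_file(path: str, source_encoding: str = "iso-8859-1") -> None:
    """Rewrite one file in place as ASCII. Only run on files under source control."""
    target = Path(path)
    # Bytes in, bytes out: line endings are preserved exactly as they were.
    text = target.read_bytes().decode(source_encoding)
    target.write_bytes(to_ascii_escapes(text).encode("ascii"))


def is_ascii_only(data: bytes) -> bool:
    """The property the enforcement step is meant to guarantee."""
    return all(byte < 128 for byte in data)
```

Existing \u escapes are already ASCII, so running the conversion twice leaves a file unchanged. That makes it reasonably safe to apply to a tree where some files were converted earlier.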
Result for your concerns:
More straightforward → you only have to use the tool once
Mixed standard → After the bulk conversion and enforcement, not anymore
\u escapes are less readable → Correct, unfortunately, but only if your IDE guesses the encoding right, and only for western European languages where ISO-8859-1 is the 'correct' local encoding. My personal experience with Dutch/French/German is that you quickly learn what each relevant \u escape stands for.
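For reviewing an escaped file, the reverse direction can be sketched as well. This is a hypothetical Python helper, not part of any tool mentioned in this thread. As a known limitation, it does not treat a doubled backslash before `u` as a literal backslash, so it is meant for reading translations, not for writing files back.

```python
import re

# Matches one Java-style escape: a backslash, "u", and four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def from_ascii_escapes(text: str) -> str:
    """Turn \\uXXXX escapes back into readable characters."""
    decoded = _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), text)
    # Escaped surrogate pairs arrive as two separate code units; a UTF-16
    # round trip joins each pair into the single character it stands for.
    # An unpaired surrogate escape raises UnicodeDecodeError here.
    return decoded.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


print(from_ascii_escapes("title=Donn\\u00e9es"))
```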
From: Alexandre Gacon <alexandre.gacon@…84…> Sent: Thursday 11 August 2022 8:04 To: Hans Yperman <hans.yperman@…1715…> Cc: Jody Garnett <jody.garnett@…84…>; geoserver-users geoserver-users@lists.sourceforge.net Subject: Re: [Geoserver-users] Using weblate in replacement of transifex - First feedbacks