Hello,
How do you interpret Transifex's reply?
Regards
Alexandre
---------- Forwarded message ---------
From: Ryan Bernstein <support@anonymised.com>
Date: Wed, Jan 5, 2022 at 15:33
Subject: Re: GitHub integration - Encoding issue
To: Alexandre Gacon <alexandre.gacon@anonymised.com>
Hello Alexandre,
I hope you are doing well!
It seems that you are very technical, so I’m just going to copy/paste the comments as-is from our developers who looked into this issue quite extensively…
In order to understand the following explanation, keep in mind that:
- UTF-8 is the encoding that properly preserves all non-ASCII, non-latin1 characters.
- ISO-8859-1 (aka latin1) is an ASCII-based encoding that contains all the ASCII characters plus some additional ones used in Latin-alphabet languages (e.g. é, è).
- us-ascii is the standard encoding for electronic communication and, as already mentioned, a subset of the latin1 encoding.
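As a small illustration of the three encodings described above, here is a minimal Python sketch. The helper names `is_ascii` and `fits_latin1` are hypothetical and are not part of Transifex or of any library mentioned in this thread.

```python
# Illustrative only: how the same character is represented in each encoding.
text = "é"  # present in latin1 (ISO-8859-1) but not in us-ascii

latin1_bytes = text.encode("iso-8859-1")  # b'\xe9' (one byte)
utf8_bytes = text.encode("utf-8")         # b'\xc3\xa9' (two bytes)


def is_ascii(s: str) -> bool:
    """True if every character is in the us-ascii range (0-127)."""
    return all(ord(ch) < 128 for ch in s)


def fits_latin1(s: str) -> bool:
    """True if the string can be encoded as ISO-8859-1 without loss."""
    try:
        s.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Pure ASCII text yields identical bytes in all three encodings,
# which is why us-ascii is a subset of both latin1 and UTF-8.
ascii_sample = "key=value"
same_bytes = (
    ascii_sample.encode("ascii")
    == ascii_sample.encode("iso-8859-1")
    == ascii_sample.encode("utf-8")
)
```

The same `é` therefore produces different bytes in latin1 and in UTF-8, which is why a file cannot be read correctly with the wrong one of the two.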
After new tests on whether the encoding of the file given in the ticket is retained, we noticed the following:
- If a non-latin1, non-ASCII character exists in the translation, then the final translation file will contain the corresponding Unicode escape sequence (e.g. \u0420 corresponds to the Cyrillic letter Р).
- In our case, the latin1 character wasn’t part of the translated strings but part of the structure of the file, i.e. the file's template. This means that we don’t want to change it to the escaped character.
- On the other hand, the library that we use to integrate GitHub with Transifex does not support latin1, only UTF-8, so when a non-ASCII character appears it converts the whole file to the best encoding that can represent that character. In our case that is UTF-8.
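The escaping behaviour described in the first point can be sketched as follows. This is only an approximation of what the developers describe, not Transifex's actual implementation; the function name `escape_non_latin1` is hypothetical.

```python
def escape_non_latin1(text: str) -> str:
    """Replace characters outside latin1 with Java-style \\uXXXX escapes.

    Characters up to U+00FF are kept as-is, mirroring the description
    that latin1 characters in the template are not escaped.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFF:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            # Java properties escapes use UTF-16 code units,
            # so characters beyond the BMP become a surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


escaped = escape_non_latin1("é Р")  # 'é \\u0420'
# The escaped result now fits in ISO-8859-1.
latin1_ready = escaped.encode("iso-8859-1")
```

Once every character outside latin1 is escaped this way, the whole file can be stored as ISO-8859-1, which is the encoding the Java Properties format expects here.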
In order to preserve the us-ascii encoding (not latin1) in GitHub, one must make sure that the source keys and the comments of the file do not contain any non-ASCII characters.
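A quick way to check a source file against this condition could look like the sketch below. It is a simplified check under stated assumptions: the function name `non_ascii_in_keys_and_comments` is hypothetical, and the parsing ignores escaped separators and continuation lines that the real Java Properties format allows.

```python
from typing import List


def non_ascii_in_keys_and_comments(raw: bytes) -> List[int]:
    """Return 1-based line numbers whose comment or key has non-ASCII bytes.

    Works on raw bytes, so it does not matter whether the file is
    latin1 or UTF-8: any byte >= 0x80 is non-ASCII in both.
    """
    flagged = []
    for lineno, line in enumerate(raw.splitlines(), start=1):
        stripped = line.lstrip()
        if not stripped:
            continue
        if stripped[:1] in (b"#", b"!"):
            part = stripped  # whole comment line
        else:
            # Simplified: the key ends at the first '=' or ':'.
            cut = len(stripped)
            for sep in (b"=", b":"):
                idx = stripped.find(sep)
                if idx != -1:
                    cut = min(cut, idx)
            part = stripped[:cut]
        if any(byte >= 0x80 for byte in part):
            flagged.append(lineno)
    return flagged


sample = "# Créé par\nkey=valeur é\nclé=x\nplain=ok\n".encode("iso-8859-1")
problem_lines = non_ascii_in_keys_and_comments(sample)  # [1, 3]
```

Line 2 of the sample is not flagged because its non-ASCII character is in the value, not in the key or a comment.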
In case something wasn’t clear, what this means is that because the source file had a latin1 character (é) even though the translations for the strings did not, this character was kept as-is (not escaped) as part of the “template”. Therefore, the translation files sent back to GitHub are being encoded with UTF-8 by the library being used. We do not think we can do anything about this, unfortunately. So, the translation files for the Java Properties file format must be retrieved from Transifex directly instead of using the GitHub integration.
Is this all clear? Do you have any other questions?
Kind regards,
Ryan
–
Ryan Bernstein
Customer Support Engineer | Transifex
On Fri, Dec 24, 2021 at 5:54 AM PST, Alexandre Gacon <alexandre.gacon@anonymised.com> wrote:
Yes Ryan.
It is clear now. It is a good start, since it will keep Transifex up to date on what remains to be translated. A two-way synchronisation would indeed be better, but for Christmas it is a nice present from you.
Have a good holiday and enjoy the coming events too!
Regards
Alexandre
On Fri, Dec 24, 2021 at 14:50, Ryan Bernstein <support@anonymised.com0…> wrote:
On Fri, Dec 24, 2021 at 5:49 AM PST, Ryan Bernstein <support@anonymised.com..> wrote:
Hi,
Source files will now be kept in the correct ISO-8859-1 encoding when using the GitHub integration.
Any translation files pushed back to GitHub via the integration will be in UTF-8 instead of the correct ISO-8859-1 encoding. Our developers will continue looking into this, but they haven’t found a solution so far, unfortunately. So, please use the UI (or the newer API/CLI for automation) to download these translation files.
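For illustration only, a UTF-8 translation file could in principle be re-encoded locally along the lines of the sketch below. This is not something suggested in this thread and is untested against the integration's actual output; the function name `utf8_to_latin1_properties` is hypothetical.

```python
def utf8_to_latin1_properties(raw_utf8: bytes) -> bytes:
    """Re-encode UTF-8 properties content as ISO-8859-1.

    Characters that latin1 cannot represent are written as Java-style
    \\uXXXX escapes, one escape per UTF-16 code unit.
    """
    text = raw_utf8.decode("utf-8")
    pieces = []
    for ch in text:
        if ord(ch) <= 0xFF:
            pieces.append(ch)
        else:
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                code_unit = int.from_bytes(units[i:i + 2], "big")
                pieces.append(f"\\u{code_unit:04X}")
    return "".join(pieces).encode("iso-8859-1")


utf8_content = "# Créé\nhello=Привет\n".encode("utf-8")
latin1_content = utf8_to_latin1_properties(utf8_content)
```

Downloading the file from the Transifex UI, as recommended above, remains the supported route.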
Does this help clarify things?
Best,
Ryan
On Fri, Dec 24, 2021 at 5:39 AM PST, Alexandre Gacon <alexandre.gacon@anonymised.com> wrote:
Hi Ryan,
Just to be sure I understand: the encoding of the source properties file will be kept. If a translation file is provided from GitHub in the ISO encoding, it will be kept too, but if a new language is added through Transifex, the encoding will be wrong.
Or will all the translations be in UTF-8 when pushed to GitHub, so that the only way to get them in the correct encoding is to use the Transifex UI?
Regards
Alexandre
On Fri, Dec 24, 2021 at 14:15, Ryan Bernstein <support@anonymised.com0…> wrote:
On Fri, Dec 24, 2021 at 5:15 AM PST, Ryan Bernstein <support@anonymised.com..> wrote:
Hi Alexandre,
OK, we have provided a fix for creating a resource file in Java Properties format.
This means that whenever you create a new resource file, the source file will keep the ISO-8859-1 encoding instead of UTF-8.
Now, in your project, where a UTF-8 source file already exists, there are 2 ways to change its encoding:
- Remove the resource and recreate it by using the GitHub integration.
- Update the content in the remote repository (e.g. by adding a comment line such as "# properties file"); the next sync will then update the source file as well.
As for the translated file, we couldn’t find a solution because a lot happens in external libraries outside our control. For the time being, though, you can get the translated file directly from the TX UI in ISO-8859-1 encoding.
Does this make sense? We hope that this at least provides a way for you to continue using these files with Transifex for now.
Best,
Ryan
P.S. Happy Holidays!
On Mon, Dec 20, 2021 at 10:05 AM PST, Alexandre Gacon <alexandre.gacon@anonymised.com> wrote:
Ok, thanks for the update!
On Mon, Dec 20, 2021 at 14:24, Ryan Bernstein <support@anonymised.com0…> wrote:
On Mon, Dec 20, 2021 at 5:23 AM PST, Ryan Bernstein <support@anonymised.com..> wrote:
Hello Alexandre,
I hope you are well!
I just wanted to send a quick update that our developers are still looking into this. They do not have a resolution yet, but are trying to determine what can be done. We will keep you posted on any updates…
Kind regards,
Ryan
On Wed, Dec 15, 2021 at 10:35 AM PST, Alexandre Gacon <alexandre.gacon@anonymised.com> wrote:
Hi Ryan,
Happy to hear that you are going to try to solve this! I find that Transifex is a wonderful solution and I would be very happy if we manage to use it more and more for open source projects!
Regards
Alexandre
On Wed, Dec 15, 2021 at 19:27, Ryan Bernstein <support@anonymised.com0…> wrote:
On Wed, Dec 15, 2021 at 10:26 AM PST, Ryan Bernstein <support@anonymised.com…> wrote:
Hello Alexandre,
I hope you are well! Please allow me to jump in here in place of Cesar.
We have verified what you said, and are looking into it. We will probably need to create a ticket to get this resolved by our developers.
Further, I tried converting the translation file to the correct ISO-8859-1 encoding (using Sublime Text), but that didn’t work, unfortunately…
So, we understand that this issue needs to be addressed, and we will definitely keep you updated on our progress!
Our apologies for this issue.
Kind regards,
Ryan
On Wed, Dec 15, 2021 at 9:11 AM PST, Alexandre Gacon <alexandre.gacon@anonymised.com> wrote:
Hello,
We made another attempt with the attached files. As you can see, both are encoded in ISO-8859-1.
After completing the translation and validating it, the GitHub pull request we received contained a file encoded as UTF-8.
Do you have any suggestions on how to solve this?
Regards
Alexandre Gacon
On Mon, Dec 13, 2021 at 23:06, Cesar Garcia <support@anonymised.com..> wrote:
