Daniel Egger wrote:
> IF we need to add another dependency it has to be worth it. Solving
> problems by using XML for everything seems only clever to me. It does
> not make sense to use XML for tip files while plug-ins still keep
> in beeing broken (in the localisation context).
>
> > Someone mentioned how well Dia seemed to be doing in that respect.
> > Well, Dia puts the text strings for a sheet in a different file per
> > sheet. Even with only 8 supported languages, this already looks
> > totally cluttered to me.
>
> Really? Everything is were it belongs to and nothing is used within
> wrong context and, last but not least, its extensible and that even
> easily.

Dia uses intltool/xml-i18n-tools for sheet files.

> > The tips file is 9 kB now. With 15 supported languages (how many on
> > the way?), that would become 135 kB.
>
> In contrary to po files untranslated messages are simply nonexistant.
> And you forget one thing: All .po files together are by definition
> bigger since the original text is repeated within every single file.

And that is for a good reason (see below)...

> > You cannot expect translators to wade through 30 lines of other
> > languages to be able to add his/her own translation (30 lines per
> > string to be translated, that is), so that translators do need to work
> > on separate files.
>
> Why not?

Because one of the fundamentals of easy translation is to have the
original text handy, so that you can easily compare the original with
the translation and ensure that the translation is correct. I have to
visually compare the strings many times while translating a single
message, and at least twice: first to interpret the message I'm about
to translate, and finally to compare what I wrote with the original, so
that nothing got lost or added and no meaning changed in the
translation.
This means that the original string and the translation should be as
close to each other as possible, and this is why the po format lays out
messages this way: first the original, and immediately below it the
space for the translation. If you add a large number of translations to
a single file and expect me to edit it, I have to skip a large amount
of irrelevant "garbage" (since I'm usually not at all interested in the
other translations) just to compare the original and my translation.
This makes visually verifying translations harder.

Another, more dangerous thing is encodings. Multiple encodings in a
single file don't mix well. I've been bitten too many times by other
translators accidentally saving the whole file in their encoding and
thus ruining my and many others' translations. This was actually one of
the most important reasons why we moved away from editing .desktop
files directly in GNOME: with hundreds of translators, the danger of
someone accidentally doing this became very real (it happened quite
frequently), and it became a pain to ensure that translations weren't
broken by simple "accidents" like this. It also became a mess to
"clean up", since effectively all translators had to be contacted to
verify that their translations were still correct after such an
accident.

While enforcing the use of UTF-8 solves the encodings problem, it is
not feasible for many other reasons. One is simply the lack of support
in many editors and other text processing tools (and terminals and so
on). Effectively enforcing a particular editor hasn't worked yet and
probably never will, and it will probably take more time until all
editors natively support UTF-8.

Also, many translators use "translation memories", that is, large po
format databases of existing translations, created and managed by
special translation memory tools.
I use such a memory with all my existing translations (it's 6.4 MB of
text) to automatically generate a skeleton for every new po format
translation, with messages similar to existing translations already
translated. Aside from the fact that this won't work if you don't use
po format, this points out the encodings problem again: If you force me
to use UTF-8, I have to maintain several translation memories instead
of a single one, one for each encoding.

So while storing all translations in UTF-8 solves its share of
problems, it creates new ones for translators. This is why intltool
lets translators use their own encoding when translating, and converts
it to UTF-8 when needed.

> And where do you get the 30 from? If you have 15 languages then
> you'll have at maximum 15 times the original text to skip.

And that is still a problem, as explained above. 15 lines of irrelevant
text in between every single message and its translation into my
language make verifying translations unnecessarily difficult.

> Beeing a translator myself (and in fact also one of the one of the DIA sheets)
> I can tell that this is not as evil as it might look.

Dia uses intltool now, so it seems they have recognized the problems
the translators had.

> > so I expect you have got a tool for the translators in mind?
>
> If necessary I can hack something up but it should not be necessary.
> I really don't see the big difference to hacking a .po file.

It is necessary. The po format and gettext have many important features
that translators depend upon, as, in my experience, almost every
translator knows. If you do an alternative "hack", it had better
support most of the features that gettext has and translators need.
More on this in another letter.

> > gettext is also a standard.
>
> Great. Show me the specs... I'm not talking about de-facto or so
> called "industry-standards".
> gettext is such a crap that I really
> doubt there was a standarisation process which led to a proper
> specification.

gettext has evolved. It has many of the features that translators need.
And, as you admit, it's an industry standard. If you want to replace
it, you'd better write a better and fully compatible alternative (since
a lot of tools across many platforms are designed to work with this
industry standard), while keeping all existing features. I believe this
is where people use the phrase "show me the code".

Christian
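
P.S. To make the encoding "accident" described earlier concrete, here is
a minimal Python sketch. It is only an illustration under assumptions:
the two sample strings, the choice of ISO-8859-1 and KOI8-R, and the
helper names are all hypothetical, and it does not reproduce what any
particular editor or GNOME tool actually did. It simulates a file where
each translator saved their own line in their own encoding, and then a
single editor re-saving the whole file under one assumed encoding.

```python
# Hypothetical sample entries; the strings and encodings are assumptions.
SWEDISH = "Name[sv]=Bildredigerare för foton"
RUSSIAN = "Name[ru]=Редактор изображений"


def build_mixed_file() -> bytes:
    """Each translator wrote only their own line, in their own encoding."""
    return (
        SWEDISH.encode("iso-8859-1") + b"\n"
        + RUSSIAN.encode("koi8-r") + b"\n"
    )


def resave_whole_file(raw: bytes, assumed: str, target: str) -> bytes:
    """Simulate an editor that decodes the WHOLE file with one encoding
    and writes it back in another."""
    return raw.decode(assumed).encode(target)


def read_line(raw: bytes, index: int, encoding: str) -> str:
    """Decode a single line of the file with the given encoding."""
    return raw.split(b"\n")[index].decode(encoding, errors="replace")


mixed = build_mixed_file()

# Before the accident, each line is readable in its own encoding.
print(read_line(mixed, 0, "iso-8859-1"))
print(read_line(mixed, 1, "koi8-r"))

# The Swedish translator's editor assumes ISO-8859-1 for everything
# and saves the result as UTF-8.
resaved = resave_whole_file(mixed, assumed="iso-8859-1", target="utf-8")

# The Swedish line survives; the Russian line no longer matches,
# whichever of the two encodings you try to read it with.
print(read_line(resaved, 0, "utf-8") == SWEDISH)
print(read_line(resaved, 1, "utf-8") == RUSSIAN)
print(read_line(resaved, 1, "koi8-r") == RUSSIAN)
```

The point of the sketch is that the person causing the damage sees
nothing wrong, because their own line still looks fine; only the other
translators can tell that theirs was ruined.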