Daniel Egger wrote:
> > Whatever the solution regarding GIMP tips turns out to be, translators
> > want to be able to translate them from within po files. I hope everyone
> > has agreed on that :)
>
> not really.

Okay, but that really makes you an exception among translators. This discussion isn't new; it has been repeated for ages and comes up every time a developer does not understand why po format should be used and instead wants his own "brilliant" hack that reinvents the wheel, without understanding why po format is essential to the majority of translators.

The short answer is "the tools". gettext is the industry standard, and there is a huge number of tools for creating, maintaining and reusing translations in this format. Even the few tools included in GNU gettext itself have many important features. As far as I know, no translator has ever objected to the need for po format for these reasons, and we have discussed this extensively.

The problem of people inventing more and more different formats to keep their software messages in (.oaf, .sheet, various xml formats, .desktop, .soundlist, .directory, etc.) in GNOME was a major pain to translators, and that eventually resulted in the development of xml-i18n-tools as a middle layer, allowing developers to use their formats (with the advantages that gives) while at the same time allowing translators to use their format (with the advantages that gives). Currently it's used for the majority of GNOME modules and the plan is to use it for all of them. There's no disagreement about that, at least not that I know of.

> > If you go for XML, I'd recommend using intltool. It's a set of tools
> > designed exactly for this purpose. Since gettext itself doesn't have a
> > clue about XML, intltool works as a middle layer that extracts strings
> > marked for translation in the XML and adds them to header files, so that
> > xgettext can extract them and put them in a pot.
> > The reverse process is usually done at build time, and all the
> > translations merged back into the original XML file.
> >
> > You can find intltool in the xml-i18n-tools module in CVS.
>
> Okay, so why would one want this heavy conversion action? If the only
> purpose is to have only one editable catalog instead of several files
> and people really need that then okay...

I have already mentioned the disadvantages of a single translation file in my previous mail, but there are many more advantages to po format than that. Basically it amounts to the fact that there's much more to translation than just creating a translation. In many cases, creating the initial translation is the easiest part time-wise: maintaining the translation as the software evolves (often for many years), and updating and adding translations of individual messages as they get added to the source, usually takes more effort over a much longer period of time. This is the single largest weakness of your proposal: it says nothing about how this is to be solved, while gettext already has features for it.

For the initial creation of a translation, the technique of using a translation memory is becoming more common. A translation memory is a single large collection of all existing translations in po format, which a special tool re-uses when creating the new translation. My memory is currently more than 6 MB of text, and gives 25% to 30% exact matches (depending on the pot file) in a new translation. That means 25% to 30% less work for me when creating the translation, which usually amounts to many hours of saved work.
Also, even if the number of exact matches is smaller, the number of close matches ("fuzzy" matches) is usually large. These close matches usually save a lot of time when translating (I don't have to translate the message from scratch but usually only have to make smaller adjustments), and they also help improve consistency in translations, so that identical terminology and wording get the same translation.

Translation memories can also be used for maintaining translations - as new messages are added, you can re-run the translation against the translation memory and match them against existing translations this way. I myself use them solely for new translations, but I know other people who also use them for maintenance.

In any case, these translation memory tools use the po format, since this is what is used across free software translations, and if you decide upon another format, you have to deal with making the existing translation memory tools usable with it. Anything else is a step backwards.

However, those were only the problems of creating translations, and as I mentioned, maintenance is the main work. Among other things, the gettext tools themselves help with the following issues related to updating translations:

* Fuzzymarking of changed messages.
  This is a really important feature. If and when an original message is changed, translators need to be notified about it in an easy way, so that they can update their translation accordingly. This is handled automatically by gettext, and messages that have changed are marked "fuzzy" until the translator updates the translation.

* Fuzzymarking of new messages.
  In a similar manner, new messages that are added to the sources are matched against existing messages with translations. If they are similar, the most closely matching translation is automatically picked and marked fuzzy, so that the translator only has to make the appropriate changes instead of re-translating the message from scratch. This feature matters most when translating longer messages.

There are more features, but the above are the essential ones in this case. They are unfortunately not trivial to reimplement, and at the same time essential to effective translation.

Even if all this were reimplemented and the wheel reinvented, the issue of compatibility with all existing tools remains. I have already mentioned translation memory tools and other translation tools, but there's a lot more that depends on, and is designed for, the po format. One such thing is simply translation statistics. Translation statistics are important to translators because they are an essential tool when deciding where to devote work at the moment (have a look at the http://developer.gnome.org/projects/gtp/status/ pages). These statistics are all based on the use of po format, where statistics for individual translations are easily available by querying msgfmt, and a change in translation format would require changes to these statistics tools as well before they could report translation status again.

> > I as a translator also prefer po format... I doubt there is any
> > translator that wouldn't.
>
> I don't. I don't care which format the translations have to be in.
> XML is about as easy as .po...

Only if you disregard everything other than just the method of inputting text, and even that has its problems with an XML file with all translations thrown together.
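
To make the fuzzy-marking of new messages and the statistics described above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names `merge_catalog` and `catalog_stats` are hypothetical, catalogs are modelled as plain dictionaries rather than real po files, and the similarity measure from Python's `difflib` is a stand-in, not the matching algorithm that the gettext tools actually use.

```python
import difflib


def merge_catalog(old, new_msgids, cutoff=0.6):
    """Carry translations from an old catalog over to a new list of msgids.

    `old` maps msgid -> translation. Returns msgid -> (translation, is_fuzzy).
    Hypothetical helper; difflib similarity stands in for gettext's matching.
    """
    merged = {}
    for msgid in new_msgids:
        if msgid in old:
            # Unchanged message: keep the existing translation as-is.
            merged[msgid] = (old[msgid], False)
            continue
        # New or changed message: look for the closest existing msgid.
        close = difflib.get_close_matches(msgid, list(old), n=1, cutoff=cutoff)
        if close:
            # Reuse the closest translation, flagged fuzzy for review.
            merged[msgid] = (old[close[0]], True)
        else:
            # Nothing similar enough: leave the message untranslated.
            merged[msgid] = ("", False)
    return merged


def catalog_stats(merged):
    """Return (translated, fuzzy, untranslated) counts for a merged catalog."""
    translated = sum(1 for text, fuzzy in merged.values() if text and not fuzzy)
    fuzzy_count = sum(1 for _, fuzzy in merged.values() if fuzzy)
    untranslated = sum(1 for text, _ in merged.values() if not text)
    return translated, fuzzy_count, untranslated


# Example data (made up for illustration): an old Swedish catalog and the
# msgids found in a newer version of the sources.
old_catalog = {"Save file": "Spara fil", "Print file": "Skriv ut fil"}
merged = merge_catalog(old_catalog, ["Save file", "Save files", "Quit"])
print(merged)
print(catalog_stats(merged))
```

In this sketch "Save files" picks up the translation of "Save file" flagged as fuzzy, while "Quit" has no close match and stays untranslated. A replacement format would need its own equivalent of both steps before it could match what po files already give translators.
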
For all the reasons given in this thread, I cannot see any reasonable alternative to the po format, at least not without a significant amount of supporting code to keep it from being a step backwards for translators.

I hope we can agree on the solution using intltool that Sven proposed, and that we can finish this thread.

Christian