Re: I18N question

Tomasz Kłoczko <kloczek@xxxxxxxxxxxxxxxxxx> · Sun, 14 Jan 2007 20:07:25 +0100 (CET)

On Fri, 12 Jan 2007, Jeff Johnson wrote:

On Jan 12, 2007, at 4:30 PM, Jeszenszky Peter wrote:

Hello,

I'm working on a Java application that extracts package metadata from RPM
packages and transforms it to RDF, making RPM metadata available to
Semantic Web applications.

My question is about the RPMTAG_HEADERI18NTABLE header tag. The value of
this tag is a string array that has only one element ("C") in each RPM
package that I have seen. Is this I18N feature really supported in RPM-
aware software? May I assume that every string in the header is encoded
in plain old US-ASCII?

RPMTAG_HEADERI18NTABLE stores the keys (i.e. locales) for an associative
array of strings.

No Red Hat package since RHL 6.2 has used the mechanism. What is done
instead is a 2 level lookup of msgids and msgstrs using dcgettext(). There
are still distros that use locales, however, PLD being the first that comes
to mind.
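
As a rough illustration of the 2 level lookup: the string stored in the
header serves as the msgid, and a gettext catalog supplies the msgstr for
the current locale. The sketch below uses Python's gettext.dgettext(); the
domain name "example-rpm-strings" is hypothetical and is not taken from any
real distribution.

```python
import gettext

def localized_header_string(domain: str, header_string: str) -> str:
    """Look up a header string (used as the msgid) in a gettext domain.

    If no catalog for the domain is installed for the current locale,
    gettext returns the msgid unchanged, i.e. the untranslated "C" string.
    """
    return gettext.dgettext(domain, header_string)

# "example-rpm-strings" is a made-up domain name for illustration only.
summary = localized_header_string("example-rpm-strings", "A text editor")
```

The translations live outside the package in this scheme, so they can be
updated without rebuilding any .rpm file.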

There are only 3 tags that ever used the associative array lookup:
  RPMTAG_SUMMARY
  RPMTAG_DESCRIPTION
  RPMTAG_GROUP
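
A minimal sketch of that associative lookup, assuming the i-th element of
an I18N string array belongs to the i-th locale in the table. The function
names and the fallback order (full locale, then without codeset, then
language only, then "C") are my own assumptions, not rpm's documented
behaviour.

```python
from typing import List, Optional, Sequence

def locale_candidates(locale: str) -> List[str]:
    """Return lookup keys from most to least specific, ending with "C"."""
    no_codeset = locale.split(".", 1)[0]        # "pl_PL.UTF-8" -> "pl_PL"
    no_modifier = no_codeset.split("@", 1)[0]   # "sr_RS@latin" -> "sr_RS"
    language = no_modifier.split("_", 1)[0]     # "pl_PL"       -> "pl"
    ordered: List[str] = []
    for key in (locale, no_codeset, no_modifier, language, "C"):
        if key and key not in ordered:
            ordered.append(key)
    return ordered

def lookup_i18n(table: Sequence[str], values: Sequence[str],
                locale: str) -> Optional[str]:
    """Pick the entry of `values` whose table key best matches `locale`."""
    index = {key: i for i, key in enumerate(table)}
    for key in locale_candidates(locale):
        i = index.get(key)
        if i is not None and i < len(values):
            return values[i]
    # No key matched at all: fall back to the first entry, if any.
    return values[0] if values else None
```

With a table of only "C", every locale resolves to the single stored
string, which matches what the original question describes.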

Meanwhile the much harder problem to solve is that strings in *.rpm
metadata have no well defined encoding. One cannot assume US-ASCII,
nor UTF-8, nor LATIN1, nor anything else. Only '\0' termination may be
assumed.
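
For a tool that must turn these bytes into text anyway, one defensive
approach is to try strict decodings first and fall back to a decoding that
cannot fail. This is only a heuristic guess at the encoding, not anything
rpm itself does; the function name and the order of encodings tried are
assumptions.

```python
from typing import Tuple

def decode_header_string(raw: bytes) -> Tuple[str, str]:
    """Decode a header string of unknown encoding.

    Returns (text, encoding_used). Only '\\0' termination is relied upon;
    the encoding is guessed, so the result may be wrong for other 8-bit
    encodings such as ISO-8859-2.
    """
    raw = raw.split(b"\0", 1)[0]  # keep only the bytes before the first NUL
    for encoding in ("ascii", "utf-8"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this never raises.
    return raw.decode("latin-1"), "latin-1"
```

Returning the encoding that was used lets the caller record that the text
is a guess instead of silently treating it as authoritative.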

Storing all translations in the .spec file is very hard to maintain,
especially when you want to separate the work of the actual package
developers from that of the translators. Changes to the reference
Summary/%description/Group strings, spread across all the .spec files, are
also hard for translators to track. And putting all translated phrases
directly into the .spec file, in a case like RH/FC where translations are
maintained for ~60 different languages, would turn every current .spec
file into a VERY HUGE file in which the build specification is only a
small portion. Jeff, these are the REAL reasons why most distros use the 2
level lookup.

I see a possible solution for the above, so let me sketch some ideas for a
developer framework for it:

- it would be best to keep all the generated raw data for translators in
  classic gettext .po files, which translators know well, stored in the:

  %define	_podir	%{_topdir}/po

  directory. It would also be good to have a %{_topdir}/po/LINGUAS file
  listing all maintained languages. This would allow all the necessary
  files to be stored in a separate module in the version control
  repository.

- rpmbuild could provide an option that generates a small .po file
  extracted from the .spec file and msgmerges it into
  %{_topdir}/po/PACKAGES.pot for every .spec file given as a parameter
  ("rpmbuild -bo <foo>.spec" ?). To keep this merge coherent, all previous
  entries for <foo>.spec (marked in comments) must be removed before the
  merge.

- when handling the -b{a|b|c} options, rpmbuild would pull in all
  translated phrases for all languages listed in %{_topdir}/po/LINGUAS,
  and wherever a translation differs from the reference "C"
  Summary/%description/Group it could be stored in the RPMTAG_*I18N* fields
  of the binary packages just generated.
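
The last step could look roughly like the sketch below. It is only an
illustration of the idea, not existing rpmbuild behaviour: the function
name is hypothetical, the catalogs are plain dicts standing in for parsed
.po files, and I assume one locale table shared by all three tags, so a
language with a partial translation falls back to the "C" string for the
remaining tags.

```python
from typing import Dict, List, Tuple

def build_i18n_arrays(
    reference: Dict[str, str],            # tag name -> reference "C" string
    linguas: List[str],                   # languages listed in po/LINGUAS
    catalogs: Dict[str, Dict[str, str]],  # language -> {msgid: msgstr}
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Build the locale table and the per-tag string arrays."""
    table = ["C"]
    arrays = {tag: [text] for tag, text in reference.items()}
    for lang in linguas:
        catalog = catalogs.get(lang, {})
        # A missing or empty msgstr falls back to the reference string.
        row = {tag: catalog.get(text) or text
               for tag, text in reference.items()}
        if row == reference:
            continue  # nothing differs from "C", so store nothing
        table.append(lang)
        for tag, text in row.items():
            arrays[tag].append(text)
    return table, arrays

reference = {"Summary": "A text editor", "Group": "Applications/Editors"}
catalogs = {"pl": {"A text editor": "Edytor tekstu"}, "de": {}}
table, arrays = build_i18n_arrays(reference, ["pl", "de"], catalogs)
```

Here "de" is listed in LINGUAS but has no translations yet, so it adds
nothing to the package, while "pl" contributes a translated Summary and
reuses the reference Group.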

This approach would probably keep all translations more up to date, and it
would not disturb package developers, because adding translations to the
just generated .rpm files would be completely silent/transparent.

kloczek
--
-----------------------------------------------------------
*People don't have problems, they just create them for themselves*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@xxxxxxxxxxxxxxxxxx*