On Sun, May 30, 2010 at 01:46, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> Hi Ævar,

Hi, and thanks for taking the time to review this.

> Ævar Arnfjörð Bjarmason wrote:
>
>> I made three strings in git-pull.sh translatable as a proof of
>> concept. One problem that I ran into is that xgettext(1) seems
>> very particular when picking up translation strings. It accepts
>> this:
>>
>>     gettext "hello world"; echo
>
> Does ‘gettext -s "hello world"’ work, too? (Just curious.)

No, that just makes "-s" translatable. Options that gettext itself
accepts don't work either: you have to use eval_gettext "\$foo" instead
of gettext -e "\$foo". The xgettext program is quite naïve like that.

>> but not this:
> [...]
>>
>>     gettext <<"END";
>>     hello world
>>     END
>>
>> Maybe there's a way to make it play nice. But I just used a large
>> multiline string as a workaround.
>
> Not so nice, but it seems that gettext expects a message id as
> an argument (i.e., it will only replace echo and not cat).

Yes. I mailed the maintainer about this. gettext would need to accept
text on STDIN and xgettext would need to find the messages for it to
work.

In the meantime we could just use multiline strings. It works for the
test suite.

>> I don't know what to do about
>> 'die gettext' other than define a 'die_gettext' wrapper function
>> and use `xgettext --keyword=die_gettext'.
>
> Sounds sensible.
>
>> One thing I haven't done is to try to go ahead and make massive
>> changes to the Git source code to make everything translatable.
>
> I am vaguely worried about performance. Suppose a function does
>
>     for (i = 0; i < 1000000; i++)
>         printf(_("Some interesting label: %s\n"), foo(i));
>
> Will this compile to the equivalent of
>
>     const char *s = _("Some interesting label: %s\n");
>     for (i = 0; i < 1000000; i++)
>         printf(s, foo(i));
>
> Suppose someone decides to make that change by hand (maybe the
> loop is too large for the compiler to notice the potential
> winnings).
> Then presumably gcc would not be able to type-check the
> format any more. Is there some way around this that avoids
> both speed regressions and loss of type-safety?

Any level of indirection is of course going to be slower; there's no
way around that. I made two test programs to test this out:

test-in-loop.c:

    #include <stdio.h>
    #include <stdlib.h>
    #include <locale.h>
    #include <libintl.h>

    #define _(s) gettext(s)

    int foo(long int x) {
        return x * x;
    }

    int main(void) {
        const char *podir = "/usr/local/share/locale";
        if(!podir)
            puts("zomg error");
        char *ret = bindtextdomain("git", podir);
        ret = setlocale(LC_MESSAGES, "");
        ret = setlocale(LC_CTYPE, "");
        ret = textdomain("git");
        for (long int i = 0; i < 10000000; i++) {
            printf(_("Some interesting label: %ld\n"), foo(i));
        }
        return 0;
    }

test-outside-loop.c:

    #include <stdio.h>
    #include <stdlib.h>
    #include <locale.h>
    #include <libintl.h>

    #define _(s) gettext(s)

    int foo(long int x) {
        return x * x;
    }

    int main(void) {
        const char *podir = "/usr/local/share/locale";
        if(!podir)
            puts("zomg error");
        char *ret = bindtextdomain("git", podir);
        ret = setlocale(LC_MESSAGES, "");
        ret = setlocale(LC_CTYPE, "");
        ret = textdomain("git");
        const char *s = _("Some interesting label: %ld\n");
        for (long int i = 0; i < 10000000; i++)
            printf(s, foo(i));
        return 0;
    }

Note that I use 10 million iterations, not 1 million like in your
example. Here's how they compile:

    $ gcc -std=c99 -o test-in-loop test-in-loop.c ; gcc -std=c99 -o test-outside-loop test-outside-loop.c
    test-in-loop.c: In function ‘main’:
    test-in-loop.c:21: warning: format ‘%ld’ expects type ‘long int’, but argument 2 has type ‘int’

I.e. your concerns are valid. GCC warns about the mismatched format
specifier in test-in-loop.c, where the literal is passed through _()
at the printf call, but it stays silent for test-outside-loop.c, which
has the same mismatch hidden behind the variable s.
And even though gettext tries to make cases like these fast
(http://www.gnu.org/software/hello/manual/gettext/Optimized-gettext.html)
it's still a lot slower than hardcoded English:

    perl -MBenchmark=:all -MData::Dump=dump -E 'cmpthese(10, {
        outside => sub { system "./test-outside-loop >/dev/null" },
        inside  => sub { system "./test-in-loop >/dev/null" },
    });'
            s/iter  inside outside
    inside    13.4      --    -83%
    outside   2.26    495%      --

> Apologies if this was already answered in the earlier discussion.

What you can do (and this was covered) is to use msgfmt to check that
no translations use different format specifiers.

But hopefully cases where you have messages like these in tight loops
and the message lookup itself is a significant contributor to the
program time will be so rare as to not be an issue.
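
If such a loop does turn up, one possible way to hoist the lookup
without giving up the compile-time format check is to keep the
untranslated literal in a branch that never runs, the same idea as
writing "if (0) printf(...)". This is only a sketch and not part of the
patch series: lookup() is a hypothetical stand-in for gettext(), and
CHECKED_SNPRINTF, square and format_label are names made up for
illustration. It assumes the compiler applies its format checks to
unreachable code as well.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for gettext(); it returns the msgid unchanged here. */
static const char *lookup(const char *msgid)
{
    return msgid;
}

/*
 * Format with the already-translated string `fmt`. The untranslated
 * `literal` sits in a branch that is never executed, so the arguments are
 * evaluated only once, but the compiler still sees a literal format next
 * to the real arguments and can warn about a mismatch.
 */
#define CHECKED_SNPRINTF(buf, size, fmt, literal, ...) \
    ((void)(0 ? snprintf((buf), (size), literal, __VA_ARGS__) : 0), \
     snprintf((buf), (size), (fmt), __VA_ARGS__))

static long square(long x)
{
    return x * x;
}

/* Looks the message up once, then reuses the cached pointer on later calls. */
static int format_label(char *buf, size_t size, long i)
{
    static const char *fmt;

    if (!fmt)
        fmt = lookup("Some interesting label: %ld\n");
    return CHECKED_SNPRINTF(buf, size, fmt,
                            "Some interesting label: %ld\n", square(i));
}
```

Changing %ld to %d in the literal should bring the format warning back
even though the string actually printed comes from the cached pointer.
The cached pointer goes stale if the locale changes at run time, and
the translated string itself is still only covered by the msgfmt check.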