Re: [PATCH/RFC 0/5] Add internationalization support to Git

Ævar Arnfjörð Bjarmason wrote:

> And even though gettext tries to make cases like these fast
> (http://www.gnu.org/software/hello/manual/gettext/Optimized-gettext.html)
> it's still a lot slower than hardcoded English:
> 
>     perl -MBenchmark=:all -MData::Dump=dump -E 'cmpthese(10, {
>          outside => sub { system "./test-outside-loop >/dev/null" },
>          inside =>  sub { system "./test-in-loop >/dev/null" },
>     });'
> 
>             s/iter  inside outside
>     inside    13.4      --    -83%
>     outside   2.26    495%      --

Given:

-- 8< --
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#include <libintl.h>
#include "gettext.h"

long int foo(long int x) {
	return x * x;
}

int main(void) {
	const char *podir = "/usr/local/share/locale";
	long int i;

	bindtextdomain("git", podir);
	setlocale(LC_MESSAGES, "");
	setlocale(LC_CTYPE, "");
	textdomain("git");

	for (i = 0; i < 1000000; i++)
		printf(_("Some interesting label: %ld\n"), foo(i));

	return 0;
}
-- >8 --

No message catalog is installed here, and I compile with gcc-4.5 -Wall -W -O2.
The results are similar to yours.

A: the standard way.  gettext.h contains "#define _(s) gettext(s)" or
| static inline __attribute__((__format_arg__(1))) char *_(const char *s)
| {
|	return gettext(s);
| }

 6.74user 0.02system 0:06.78elapsed 99%CPU (0avgtext+0avgdata 2304maxresident)k
 0inputs+0outputs (0major+182minor)pagefaults 0swaps

 (about 7 seconds.)

B: noop.  gettext.h contains "#define _(s) s"

 1.35user 0.01system 0:01.37elapsed 99%CPU (0avgtext+0avgdata 2192maxresident)k
 0inputs+0outputs (0major+172minor)pagefaults 0swaps

 (about 1.4 seconds.)
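
For anyone repeating the comparison, a minimal sketch of a gettext.h that
switches between A and B at compile time.  The macro name
GETTEXT_NOOP_SKETCH and the include guard are made up for illustration;
they are not part of the patch series.

```c
/* gettext.h (sketch) */
#ifndef GETTEXT_H_SKETCH
#define GETTEXT_H_SKETCH

#ifdef GETTEXT_NOOP_SKETCH	/* hypothetical switch: cc -DGETTEXT_NOOP_SKETCH ... */
/* B: noop, the string literal is passed through untouched */
#define _(s) (s)
#else
#include <libintl.h>
/* A: look the string up in the message catalog on every call */
#define _(s) gettext(s)
#endif

#endif /* GETTEXT_H_SKETCH */
```

Without a message catalog installed, both variants should hand back the
original English string, so the two programs print the same thing and only
the timing differs.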

It would seem that __attribute__((__pure__)) should let the compiler give
us the best of both worlds, by hoisting the repeated gettext() call out of
the loop, but no luck.  Even __attribute__((__const__)) is ignored; the
compiler inlines the body of _() before it has a chance to notice.

We can fool the compiler into paying attention by making it not
inlinable: if gettext.h contains

| extern char *_(const char *s) __attribute__((__format_arg__(1), __const__));

and a separate gettext.c contains

| #include <libintl.h>
| #include "gettext.h"
| char *_(const char *s) { return gettext(s); }

we get the performance of B again:

 1.36user 0.01system 0:01.38elapsed 98%CPU (0avgtext+0avgdata 2304maxresident)k
 0inputs+0outputs (0major+180minor)pagefaults 0swaps

This amounts to lying to the compiler, since it is possible for the string
pointed to by a single address s to differ between calls to _.  The __pure__
attribute would be more honest, but for reasons I don’t understand it
suppresses the optimization.

Moral of the story: at least in simple cases, we can keep the performance
and the typechecking.  Phew.
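
For comparison, the same effect can be had by hand, without any attribute
tricks, by doing the lookup once before the loop.  This is a minimal
sketch; the helper names square() and print_labels() are hypothetical and
only mirror the test program above.

```c
#include <libintl.h>
#include <stdio.h>

#define _(s) gettext(s)

/* hypothetical helper: same computation as foo() in the test program */
static long square(long x)
{
	return x * x;
}

/*
 * Print n labels, looking the format string up once instead of once per
 * iteration.  Returns the number of lines written successfully.
 */
static long print_labels(FILE *out, long n)
{
	const char *fmt = _("Some interesting label: %ld\n");	/* single lookup */
	long i, lines = 0;

	for (i = 0; i < n; i++)
		if (fprintf(out, fmt, square(i)) > 0)
			lines++;
	return lines;
}
```

The cost of doing it this way is that fmt is an ordinary variable by the
time it reaches fprintf(), so the compiler most likely cannot check the
format against its arguments there.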

HTH,
Jonathan