Sometimes tests run with GETTEXT_POISON fail for a reason other than a
translated string that should not have been translated.  In such a
case an error message from a git command in the test's verbose output
is usually, well, less than helpful:

  error: # GETTEXT POISON #

or

  fatal: # GETTEXT POISON #: No such file or directory

It's especially annoying on those rare occasions when a heisenbug
decides it's a good time to suddenly reveal its presence during a
GETTEXT_POISON test run, and all we get is an error message like these
(yes, I did actually see both of the above error messages, and only
once).

Make builtin commands' GETTEXT_POISON-ed error messages more useful
for debugging failures by introducing a new mode of poisoning: if
$GIT_GETTEXT_POISON is set to 'scrambled', then include the original
untranslated message after the "# GETTEXT POISON #" string in a
scrambled form, inserting a '.' after each character.  This way the
messages remain just as much gibberish for machine consumption as they
were before, but at the same time they are relatively easily legible
for humans.  Take extra care to preserve printf() format conversion
specifiers unaltered when inserting those dots.

Leave 'git-sh-i18n.sh' unchanged, because translatable messages in
scripts often include shell variables, and they could (though
currently they don't) include printf format specifiers, parameter
expansions, command substitutions and whatnot, too.  Dealing with
those in a shell script would be more hassle than it's worth.

There is an additional benefit: as this change considerably increases
the size of translated messages, it could detect cases when we try to
format a translated string into a buffer that is too small.  E.g. this
change applied on old versions causes test failures because of the bug
that was fixed in 2cfa83574c (bisect_next_all: convert xsnprintf to
xstrfmt, 2017-02-16).

[TODO: Fallout?
  A 'printf(_("foo: %s"), var);' call includes the contents of 'var'
  unscrambled in the output.  Could that hide the translation of a
  string that should not have been translated?  I'm afraid yes: to
  check the output of that printf() a sloppy test could do:

    git plumbing-cmd >out && grep "var's content" out

  which would fail in a regular GETTEXT_POISON test run, but would
  succeed in a scrambled test run.  Does this matter in practice, do
  we care at all?

  Does gettext_scramble() need a FORMAT_PRESERVING annotation?  Seems
  to work fine without it so far...]

Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
---
 gettext.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 gettext.h | 11 +++++++++--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/gettext.c b/gettext.c
index c50d1e0377..8ba7fd0bea 100644
--- a/gettext.c
+++ b/gettext.c
@@ -52,13 +52,61 @@ enum poison_mode use_gettext_poison(void)
 	static enum poison_mode poison_mode = poison_mode_uninitialized;
 	if (poison_mode == poison_mode_uninitialized) {
 		const char *v = getenv("GIT_GETTEXT_POISON");
-		if (v && *v)
-			poison_mode = poison_mode_default;
-		else
+		if (v && *v) {
+			if (!strcmp(v, "scrambled"))
+				poison_mode = poison_mode_scrambled;
+			else
+				poison_mode = poison_mode_default;
+		} else
 			poison_mode = poison_mode_none;
 	}
 	return poison_mode;
 }
+
+static int conversion_specifier_len(const char *s)
+{
+	const char printf_conversion_specifiers[] = "diouxXeEfFgGaAcsCSpnm%";
+	const char *format_end;
+
+	if (*s != '%')
+		return 0;
+
+	format_end = strpbrk(s + 1, printf_conversion_specifiers);
+	if (format_end)
+		return format_end - s;
+	else
+		return 0;
+}
+
+const char *gettext_scramble(const char *msg)
+{
+	struct strbuf sb;
+
+	strbuf_init(&sb,
+		    /* "# GETTEXT_POISON #" + ' ' + "m.e.s.s.a.g.e." + '\0' */
+		    strlen(GETTEXT_POISON_MAGIC) + 1 + 2 * strlen(msg) + 1);
+
+	strbuf_addch(&sb, ' ');
+	while (*msg) {
+		if (*msg == '\n') {
+			strbuf_addch(&sb, *(msg++));
+			continue;
+		} else if (*msg == '%') {
+			int spec_len = conversion_specifier_len(msg);
+			if (spec_len) {
+				strbuf_add(&sb, msg, spec_len);
+				msg += spec_len;
+				continue;
+			}
+		}
+
+		strbuf_addch(&sb, *(msg++));
+		strbuf_addch(&sb, '.');
+	}
+
+	/* This will be leaked... */
+	return strbuf_detach(&sb, NULL);
+}
 #endif
 
 #ifndef NO_GETTEXT
diff --git a/gettext.h b/gettext.h
index fcb6bfaa2c..d21346d9fa 100644
--- a/gettext.h
+++ b/gettext.h
@@ -45,10 +45,12 @@ static inline int gettext_width(const char *s)
 enum poison_mode {
 	poison_mode_uninitialized = -1,
 	poison_mode_none = 0,
-	poison_mode_default
+	poison_mode_default,
+	poison_mode_scrambled
 };
 
 extern enum poison_mode use_gettext_poison(void);
+extern const char *gettext_scramble(const char *msg);
 #define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
 
@@ -60,6 +62,8 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
+	else if (use_gettext_poison() == poison_mode_scrambled)
+		return gettext_scramble(gettext(msgid));
 #endif
 	return gettext(msgid);
 }
@@ -67,11 +71,14 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
+	const char *msg = ngettext(msgid, plu, n);
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
+	else if (use_gettext_poison() == poison_mode_scrambled)
+		return gettext_scramble(msg);
 #endif
-	return ngettext(msgid, plu, n);
+	return msg;
 }
 
 /* Mark msgid for translation but do not translate it. */
-- 
2.19.1.681.g6bd79da3f5