SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:

> On Mon, Oct 22, 2018 at 05:36:33PM +0200, Nguyễn Thái Ngọc Duy wrote:
>>
>> The current gettext() function just replaces all strings with
>> '# GETTEXT POISON #' including format strings and hides the things
>> that we should be allowed to grep (like branch names, or some other
>> codes) even when gettext is poisoned.
>>
>> This patch implements the poisoned _() with a universal and totally
>> legit language called Ook [1]. We could actually grep stuff even in
>> with this because format strings are preserved.
>
> Once upon a time a GETTEXT_POISON build job failed on me, and the
> error message:
>
>   error: # GETTEXT POISON #
>
> was not particularly useful. Ook wouldn't help with that...
>
> So I came up with the following couple of patches that implement a
> "scrambled" format that makes the poisoned output legible for humans
> but still gibberish for machine consumption (i.e. grep-ing the text
> part would still fail):
>
>   error: U.n.a.b.l.e. .t.o. .c.r.e.a.t.e. .'./home/szeder/src/git/t/trash directory.t1404-update-ref-errors/.git/packed-refs...l.o.c.k.'.:. .File exists...
>
> I have been running GETTEXT_POISON builds with this series for some
> months now, but haven't submitted it yet, because I haven't decided
> yet whether including strings (paths, refs, etc.) in the output as
> they are is a feature or a flaw. And because it embarrassingly leaks
> every single translated string... :)

There is a similar technique called "pseudolocalization", meant for
testing the i18n aspects of software. In one of its most common forms,
the string

  Edit program settings

would be translated to

  [!!! εÐiţ Þr0ģЯãm səTτıИğ§ !!!]

(possibly using a mirrored locale, i.e. right-to-left order). The
brackets [!!! ... !!!] are used as a "poison", to detect translatable
text and to spot issues with truncation; they also help with finding
"lego" translations.

It would also stress-test Unicode handling...

Regards,
--
Jakub Narębski
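
P.S. To make the idea concrete, here is a minimal sketch of such a
pseudolocalizing transform. It is only an illustration, not what git or
any of the patches above do: the function name pseudolocalize(), the
LOOKALIKES table, and the FORMAT_SPEC regex are all made up for this
example. It assumes printf-style messages, leaves the conversion
specifiers alone so that format strings still work, and wraps the
result in the [!!! ... !!!] markers.

```python
import re

# Hypothetical lookalike table: any mapping from ASCII letters to
# non-ASCII letters would serve the same purpose.
LOOKALIKES = str.maketrans({
    "a": "ã", "e": "ε", "i": "ı", "o": "ö", "u": "ü",
    "d": "Ð", "g": "ģ", "n": "И", "r": "Я", "s": "§", "t": "ţ",
})

# Rough approximation of printf-style conversions (%s, %d, %-10s, %lu, %%).
# This is an assumption for the sketch, not a complete printf grammar.
FORMAT_SPEC = re.compile(
    r"%(?:%|[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|z|j|t|L)?[a-zA-Z])"
)


def pseudolocalize(msgid: str) -> str:
    """Return msgid with letters swapped for lookalikes, format
    specifiers kept intact, wrapped in [!!! ... !!!] markers."""
    out = []
    pos = 0
    for match in FORMAT_SPEC.finditer(msgid):
        # Scramble the literal text before the specifier...
        out.append(msgid[pos:match.start()].translate(LOOKALIKES))
        # ...and copy the specifier itself unchanged.
        out.append(match.group(0))
        pos = match.end()
    # Scramble whatever literal text follows the last specifier.
    out.append(msgid[pos:].translate(LOOKALIKES))
    return "[!!! " + "".join(out) + " !!!]"


if __name__ == "__main__":
    print(pseudolocalize("Unable to create '%s.lock': %s"))
```

With something like this the message stays readable for a human, the
interpolated values (paths, refs, etc.) still show up as they are, and
grep-ing for the original English text fails, which is the same
trade-off as in the "scrambled" format above.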