Re: [PATCH] Poison gettext with the Ook language

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
> On Mon, Oct 22, 2018 at 05:36:33PM +0200, Nguyễn Thái Ngọc Duy wrote:
>>
>> The current gettext() function just replaces all strings with
>> '# GETTEXT POISON #' including format strings and hides the things
>> that we should be allowed to grep (like branch names, or some other
>> codes) even when gettext is poisoned.
>> 
>> This patch implements the poisoned _() with a universal and totally
>> legit language called Ook [1]. We could actually grep stuff even in
>> with this because format strings are preserved.
>
> Once upon a time a GETTEXT_POISON build job failed on me, and the
> error message:
>
>   error: # GETTEXT POISON #
>
> was not particularly useful.  Ook wouldn't help with that...
>
> So I came up with the following couple of patches that implement a
> "scrambled" format that makes the poisoned output legible for humans
> but still gibberish for machine consumption (i.e. grep-ing the text
> part would still fail):
>
>   error:  U.n.a.b.l.e. .t.o. .c.r.e.a.t.e. .'./home/szeder/src/git/t/trash directory.t1404-update-ref-errors/.git/packed-refs...l.o.c.k.'.:. .File exists...
>
> I have been running GETTEXT_POISON builds with this series for some
> months now, but haven't submitted it yet, because I haven't decided
> yet whether including strings (paths, refs, etc.) in the output as
> they are is a feature or a flaw.  And because it embarrassingly leaks
> every single translated string... :)

There is similar technique called "pseudolocalization", meant for
testing i18n aspect of software.

In one of most common forms, the string

  Edit program settings

woukd be translated to

  [!!! εÐiţ Þr0ģЯãm səTτıИğ§ !!!]

(possibly using mirrored locale, i.e. right-to-left order).

The brackets [!!! ... !!!] are used as a "poison", to detect
translatable text, and to spot issues with truncation; it also helps
with finding "lego" translation.

It would also stress-test Unicode handling...

Regards,
-- 
Jakub Narębski





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux