Re: [PATCH] Poison gettext with the Ook language

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 24 Oct 2018 19:54:03 +0200

On Wed, Oct 24 2018, Duy Nguyen wrote:

> On Tue, Oct 23, 2018 at 6:45 PM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
>> >> The effect of what I'm suggesting here, and which my WIP patch in
>> >> <875zxtd59e.fsf@xxxxxxxxxxxxxxxxxxx> implements is that we'd do a
>> >> one-time getenv() for each process that prints a _() message that we
>> >> aren't doing now, and for each message call a function that would check
>> >> a boolean "are we in poison mode" static global variable.
>> >
>> > Just don't do the getenv() inside _() even if you just do it one time.
>> > getenv() may invalidate whatever value from the previous call. I would
>> > not want to find out my code broke because of an getenv() inside some
>> > innocent _()...
>>
>> How would any code break? We have various getenv("GIT_TEST_*...")
>> already that work the same way. Unless you set that in the environment
>> the function is idempotent, and I don't see how anyone would do that
>> accidentally.
>
> I didn't check those GIT_TEST_ but I think the difference is in
> complexity. When you write
>
>  var = getenv("foo");
>  complexFunction();
>
> you probably start to feel scary and wrap getenv() with a strdup()
> because you usually don't know exactly what complexFunction() can call
> (and even if you do, you can't be sure what it may call in the
> future).
>
> The person who writes
>
>  printf(_("%s"), getenv("foo"));
>
> may not go through the same thought process as with complexFunction().
> If _() calls getenv(), because you the order of parameter evaluation
> is unspecified, you cannot be sure if getenv("foo") will be called
> before or after the one inside _(). One of them may screw up the
> other.

Ah, you mean because getenv() is not reentrant such a construct may
cause us to return something else entirely for getenv("bar") (e.g. in
this case the value for getenv("bar")).

Is that something you or anyone else has seen in practice? My intuition
is that while POSIX doesn't make that promise it isn't likely that
there's any implementation that would mutate the env purely on a
getenv(), but setenv() (munging some static "env" area in-place) might
be another matter.

But we could always call use_gettext_poison() really early from
git_setup_gettext() (called from our main()) if we're worried about
this, it would then set the static "poison_requested" variable and we
wouldn't call getenv() again, e.g. if we're later calling it in the
middle of multithreaded code, or within the same same sequence point.

>> > And we should still respect NO_GETTEXT, not doing any extra work when
>> > it's defined.
>>
>> This is not how any of our NO_* defines work. *Sometimes* defining them
>> guarantees you do no extra work, but in other cases we've decided that
>> bending over backwards with #ifdef in various places isn't worth it.
>
> I guess it boils down to "worth it". For me #ifdef NO_GETTEXT would be
> limited to gettext.h and it's not that much work. But you do the
> actual work. You decide.

Yeah the ifdef is pretty small and not really worth worrynig about, the
main benefit is being able to run tests in this mode without
recompiling.

I hadn't been running with GETTEXT_POISON when I build git because I
hadn't been bothered to build twice just for it, but am now running it
alongside other GIT_TEST_* modes.