Re: [RFC/PATCH 1/5] gettext: fix bug in git-sh-i18n's eval_gettext() by using envsubst(1)

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Tue, 9 Nov 2010 10:35:26 +0100

On Tue, Nov 9, 2010 at 08:33, Johannes Sixt <j.sixt@xxxxxxxxxxxxx> wrote:
> On 11/8/2010 23:39, Ævar Arnfjörð Bjarmason wrote:
>>     eval_gettext () {
>>         gettext "$1" | (export PATH `envsubst --variables "$1"`; envsubst "$1")
>>     }
>
> So, for every message printed, you have at least 3 fork()s (usually even
> more)! I'm not happy about that. You *must* avoid this at least for
> NO_GETTEXT builds, but if you can reduce them even for no-NO_GETTEXT
> builds, it would be great.

Why is that a "*must*"? Here are timings for the GNU gettext versions
(ours will be faster, if anything):

    $ time (for i in {1..1000}; do gettext "foobar"; done) >/dev/null
    real    0m3.219s
    user    0m0.253s
    sys     0m2.570s

    $ time (for i in {1..1000}; do eval_gettext "foobar"; done) >/dev/null
    real    0m12.615s
    user    0m1.264s
    sys     0m12.384s

So that's around 0.003 seconds and 0.013 seconds per message for
gettext() and eval_gettext() respectively.
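
For anyone who wants to redo that division, here is a minimal sketch. It
assumes only the two "real" totals from the runs above; per_message is a
hypothetical helper name, not something in git.

```shell
# Wall-clock ("real") totals in seconds, copied from the two runs above.
gettext_total=3.219
eval_gettext_total=12.615
runs=1000

# per_message is a hypothetical helper for this sketch only.
# POSIX shell arithmetic is integer-only, so the division is handed to awk.
per_message () {
	awk -v total="$1" -v n="$2" 'BEGIN { printf "%.4f\n", total / n }'
}

per_message "$gettext_total" "$runs"       # prints 0.0032
per_message "$eval_gettext_total" "$runs"  # prints 0.0126
```

The difference between the two, roughly 0.009 seconds, is what each
message pays for the extra pipeline and the two envsubst invocations.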

I'm not indifferent to that slight cost, but (almost?) all of the
eval_gettext messages we have are just printing out an error message
before we die. None of them are inside a tight loop. This is the
typical use case:

    git-am.sh:    eval_gettext "When you have resolved this problem run \"\$cmdline --resolved\".
    git-am.sh:                      clean_abort "$(eval_gettext "Patch format \$patch_format is not supported.")"
    git-am.sh:      die "$(eval_gettext "previous rebase directory \$dotest still exists but mbox given.")"
    git-am.sh:              die "$(eval_gettext "Dirty index: cannot apply patches (dirty: \$files)")"
    git-am.sh:                      eval_gettext "Patch is empty.  Was it split wrong?
    git-am.sh:      say "$(eval_gettext "Applying: \$FIRSTLINE")"
    git-am.sh:              eval_gettext 'Patch failed at $msgnum $FIRSTLINE'; echo
    git-bisect.sh:                  die "$(eval_gettext "'\$arg' does not appear to be a valid revision")"
    git-bisect.sh:          *)              die "$(eval_gettext "Bad bisect_write argument: \$state")" ;;
    git-bisect.sh:                revs=$(git rev-list "$arg") || die "$(eval_gettext "Bad rev input: \$arg")" ;;
    git-bisect.sh:                          die "$(eval_gettext "Bad rev input: \$rev")"
    git-bisect.sh:         die "$(eval_gettext "'\$invalid' is not a valid commit")"
    git-bisect.sh:  test -r "$file" || die "$(eval_gettext "cannot read \$file for replaying")"
    git-bisect.sh:      eval_gettext "running \$command"; echo
    git-bisect.sh:    echo >&2 "$(eval_gettext "bisect run failed:
    git-bisect.sh:    echo >&2 "$(eval_gettext "bisect run failed:

The cost for that is going to be much less than the time we spend on
forking out to sed, grep and other similar utilities inside our shell
scripts. If eval_gettext() is slowing things down noticeably that's
probably a sign that we need to rewrite the script in C, not
micro-optimize the eval_gettext() implementation.

But maybe you have reason to think otherwise? I haven't seen any
noticeable slowdown from doing it this way, but maybe I've been
looking at the wrong thing.