Re: [PATCH] i18n: Not add stripped contents for translation

Junio C Hamano <gitster@xxxxxxxxx> · Sun, 04 Mar 2012 19:42:28 -0800

Jiang Xin <worldhello.net@xxxxxxxxx> writes:

> Original source code:
>
> 244   case WT_STATUS_CHANGED:
> 245     if (d->new_submodule_commits || d->dirty_submodule) {
> 246       strbuf_addstr(&extra, " (");
> 247       if (d->new_submodule_commits)
> 248         strbuf_addf(&extra, _("new commits, "));
> 249       if (d->dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
> 250         strbuf_addf(&extra, _("modified content, "));
> 251       if (d->dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
> 252         strbuf_addf(&extra, _("untracked content, "));
> 253       strbuf_setlen(&extra, extra.len - 2);
> 254       strbuf_addch(&extra, ')');
> 255     }
>
> The bad thing is strbuf_setlen() at line 253. We cannot assume that the
> translation of ", " is exactly 2 characters long.

It sounds like you are merely working around a poor style in the original,
which should have been structured more like this in the first place, no?

        /* a helper function elsewhere, possibly inlined */
        static void add_iwsep_as_needed(struct strbuf *buf, int origlen)
        {
                if (buf->len != origlen)
                        strbuf_addstr(buf, _(", "));
        }

        ...
        int origlen;

        strbuf_addstr(&extra, " (");
        origlen = extra.len;
        if (a)
                strbuf_addstr(&extra, _("msg a"));
        if (b) {
                add_iwsep_as_needed(&extra, origlen);
                strbuf_addstr(&extra, _("msg b"));
        }
        if (c) {
                add_iwsep_as_needed(&extra, origlen);
                strbuf_addstr(&extra, _("msg c"));
        }
        strbuf_addstr(&extra, ")");
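
For readers who want to try the idea outside the git tree, here is a minimal
standalone sketch of the same "add the separator only when needed" pattern.
It is not git code: buf_add(), add_sep_as_needed(), describe() and the no-op
_() macro are all hypothetical stand-ins (a fixed char buffer replaces
struct strbuf), and the message strings are copied from the quoted
wt-status code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for gettext's _(); returns the string as-is. */
#define _(s) (s)

/* Append src to the NUL-terminated string in buf, truncating at cap. */
static void buf_add(char *buf, size_t cap, const char *src)
{
	size_t len = strlen(buf);

	assert(len < cap);
	snprintf(buf + len, cap - len, "%s", src);
}

/* Append the translated separator only if an item was already added. */
static void add_sep_as_needed(char *buf, size_t cap, size_t origlen)
{
	if (strlen(buf) != origlen)
		buf_add(buf, cap, _(", "));
}

/* Hypothetical helper: build " (msg, msg)" from three flags. */
static void describe(char *buf, size_t cap, int a, int b, int c)
{
	size_t origlen;

	assert(cap > 0);
	buf[0] = '\0';
	if (!a && !b && !c)
		return;

	buf_add(buf, cap, " (");
	origlen = strlen(buf);
	if (a)
		buf_add(buf, cap, _("new commits"));
	if (b) {
		add_sep_as_needed(buf, cap, origlen);
		buf_add(buf, cap, _("modified content"));
	}
	if (c) {
		add_sep_as_needed(buf, cap, origlen);
		buf_add(buf, cap, _("untracked content"));
	}
	buf_add(buf, cap, ")");
}
```

Calling describe(buf, sizeof(buf), 1, 0, 1) leaves
" (new commits, untracked content)" in buf. No code ever has to know how
long the translated separator is, because nothing is removed after the fact.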

Cc'ing Jens whose 9297f77 (git status: Show detailed dirty status of
submodules in long format, 2010-03-08) introduced the "two-byte backstep".

This is a tangent, and I am just showing my ignorance out loud, but I wonder
if there is a reasonably generic and "best current practice" way to
structure code to show an enumeration in human languages, for example,

	A, B, C and D.

in an easier-to-translate way.

I suspect that it would be sufficiently generic if we allowed the first and
the last inter-word separators, and the token after all the items, to
differ from the other inter-word separators.

E.g. in English, the first one and all the "other" are ", ", the last
inter-word token is " and ", and the token at the very end is ".". In
Japanese some translators may want to say "AやBとCとD。", meaning the
first one is "や", "。" is used at the very end, and all the others may be
"と".

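The four-token idea above could be sketched roughly as follows. This is only
an illustration, not an established practice or an existing git API:
struct enum_tokens, join_items() and buf_add() are hypothetical names, and
the rule that the "last" token wins when there are only two items is an
assumption that may not suit every language.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-language token set; each field would be translated. */
struct enum_tokens {
	const char *first;  /* between item 1 and item 2 */
	const char *middle; /* between the other inner items */
	const char *last;   /* between the final two items */
	const char *end;    /* after the final item */
};

/* Append src to the NUL-terminated string in buf, truncating at cap. */
static void buf_add(char *buf, size_t cap, const char *src)
{
	size_t len = strlen(buf);

	assert(len < cap);
	snprintf(buf + len, cap - len, "%s", src);
}

/*
 * Join n items into buf using the token set t.
 * Assumption: with exactly two items, the "last" token is used.
 */
static void join_items(char *buf, size_t cap, const char **items, size_t n,
		       const struct enum_tokens *t)
{
	size_t i;

	assert(cap > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if (i > 0) {
			const char *sep = t->middle;

			if (i == n - 1)
				sep = t->last;
			else if (i == 1)
				sep = t->first;
			buf_add(buf, cap, sep);
		}
		buf_add(buf, cap, items[i]);
	}
	if (n > 0)
		buf_add(buf, cap, t->end);
}
```

With the English tokens { ", ", ", ", " and ", "." } and the items A, B, C
and D this produces "A, B, C and D."; with { "や", "と", "と", "。" } it
produces "AやBとCとD。".
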