Re: [PATCH v3 3/9] Makefile: have "make pot" not "reset --hard"

Jiang Xin <worldhello.net@xxxxxxxxx> · Mon, 23 May 2022 17:37:02 +0800

On Mon, May 23, 2022 at 4:15 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Jiang Xin <worldhello.net@xxxxxxxxx> writes:
>
> > From: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
> >
> > Before commit fc0fd5b23b (Makefile: help gettext tools to cope with our
> > custom PRItime format, 2017-07-20) we'd consider source files as-is
>
> ", we'd consider".
>
> > with gettext, but because we need to understand PRItime in the same way
> > that gettext itself understands PRIuMAX we'd first check if we had a
>
> "PRIuMAX, we'd first"
>
> > clean checkout, then munge all of the processed files in-place with
> > "sed", generate "po/git.pot", and then finally "reset --hard" to undo
> > our changes.
> >
> > By generating "pot" snippets in ".build/pot/po" for each source file
> > and rewriting certain source files with PRItime macros to temporary
> > files in ".build/pot/po", we can avoid running "make pot" by altering
> > files in place and doing a "reset --hard" afterwards.
>
> Good.
>
> > This speed of "make pot" is slower than before on an initial run,
> > because we run "xgettext" many times (once per source file), but it
> > can be boosted by parallelization. It is *much* faster for incremental
> > runs, and will allow us to implement related targets in subsequent
> > commits.
>
> This is to show my ignorance, but is there any downside, other than
> increased overhead coming from running many instances of the program,
> in the "one file at a time" approach?  I was wondering if two or
> more identical translatable strings appear in multiple source files,
> where they are coalesced into a single entry in the resulting .pot
> file, and if xgettext having visibility into all these files would
> somehow help the process, but presumably we'd use msgcat to unify
> them into one entry so there wouldn't be such a downside there.  But
> are there others?

Both xgettext and msgcat can unify identical translatable strings and
list all the filenames and locations in the reference comment ("#:")
of the merged entry. In the following example, we can see that the
message "could not write index" comes from several C source files. We
can also find this message in the corresponding intermediate po files,
such as ".build/pot/po/add-interactive.c.po" and
".build/pot/po/reset.c.po".

    #: add-interactive.c:709 add-interactive.c:898 reset.c:160 ...
    msgid "could not write index"
    msgstr ""
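
As a rough illustration of that coalescing (a toy Python model, not
how xgettext or msgcat are actually implemented; the function name
"merge_references" and the data layout are made up for this sketch),
entries sharing a msgid can be folded into a single entry whose "#:"
line lists every location:

```python
from collections import OrderedDict


def merge_references(per_file_entries):
    """Fold (references, msgid) pairs into one entry per msgid.

    Hypothetical helper: references are concatenated in input order,
    mimicking how a merged entry lists every source location.
    """
    merged = OrderedDict()
    for refs, msgid in per_file_entries:
        merged.setdefault(msgid, []).extend(refs)
    return merged


# One tuple per intermediate po file that contains the message.
entries = [
    (["add-interactive.c:709", "add-interactive.c:898"], "could not write index"),
    (["reset.c:160"], "could not write index"),
]

merged = merge_references(entries)
for msgid, refs in merged.items():
    print("#: " + " ".join(refs))
    print('msgid "%s"' % msgid)
    print('msgstr ""')
```

The point is only that per-file extraction loses nothing here: the
merge step can rebuild the combined reference list.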

If there are different contexts (i.e. notes for translators in the
source code) for the identical translatable string in different files,
"msgcat" will add extra "#. #-#-#-#-# ... #-#-#-#-#" lines for
disambiguation, like below:

    +#. #-#-#-#-#  git-add--interactive.perl.po  #-#-#-#-#
     #. TRANSLATORS: 'it' refers to the patch mentioned in the previous messages.
     #: add-patch.c:1106 git-add--interactive.perl:1129
     msgid ""
     "If it does not apply cleanly, you will be given an opportunity to\n"
     "edit again.  If all lines of the hunk are removed, then the edit is\n"
     "aborted and the hunk is left unchanged.\n"
     msgstr ""
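
To make the idea concrete, here is a simplified Python model of that
disambiguation. It is an assumption-laden sketch: the exact rules
msgcat applies are not reproduced, and "merge_comments" as well as the
"note A"/"note B" comment texts are hypothetical. It keeps a shared
translator comment once, and otherwise prefixes each comment with a
marker line naming the po file it came from:

```python
def merge_comments(comments_by_file):
    """Merge translator comments for one msgid.

    comments_by_file: list of (po_filename, comment) pairs.
    Simplified model only, not msgcat's real algorithm.
    """
    distinct = {comment for _, comment in comments_by_file}
    if len(distinct) <= 1:
        # Every file agrees (or there is nothing): no marker needed.
        return ["#. " + comment for comment in distinct]
    lines = []
    for po_file, comment in comments_by_file:
        # The marker carries the input file name, ".po" extension included.
        lines.append("#. #-#-#-#-#  %s  #-#-#-#-#" % po_file)
        lines.append("#. " + comment)
    return lines


same = merge_comments([
    ("add-patch.c.po", "note A"),
    ("git-add--interactive.perl.po", "note A"),
])
different = merge_comments([
    ("add-patch.c.po", "note A"),
    ("git-add--interactive.perl.po", "note B"),
])
print("\n".join(same))
print("\n".join(different))
```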

The filename in the disambiguation comment line ends with the ".po"
extension. Ævar tries to fix this issue either by introducing more
intermediate files (which I will try to understand with some
experiments) or by removing the ".po" extension from the intermediate
po files, but I think it is tolerable as is.