Re: [PATCH v3 0/9] Incremental po/git.pot update and new l10n workflow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 23, 2022 at 10:56 PM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
>
> On Mon, May 23 2022, Jiang Xin wrote:
>
> > On Mon, May 23, 2022 at 4:19 PM Ævar Arnfjörð Bjarmason
> >>  $(LOCALIZED_SH_GEN_PO): .build/pot/po/%.po: %
> >>         $(call mkdir_p_parent_template)
> >> @@ -2786,11 +2780,24 @@ sed -e 's|charset=CHARSET|charset=UTF-8|' \
> >>  echo '"Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\\n"' >>$@
> >>  endef
> >>
> >> -.build/pot/git.header: $(LOCALIZED_ALL_GEN_PO)
> >> +.build/pot/git.header:
> >
> > No. We should rebuild the pot header if any po file need to be update,
> > because we want to refresh the timestamp in the "POT-Creation-Date:"
> > filed of the pot header.
>
> Okey, I did leave a question about this in an earlier E-Mail though,
> i.e. does anything actually rely on this, or the header at all, or is
> this just cargo-culting?
>
> I haven't found anything in our toolchain that cares about the header at
> all (for the *.pot, not *.po!) let alone the update timestamp.

When creating a new po/XX.po manually using msginit from POT file with
or without a header, the new generated po/XX.po has different header.

  $ msginit -i po/git.pot -o po/XX-with-header.po \
      --locale=ja --no-translator
  $ msginit -i po/git-headless.pot -o po/XX-without-header.po \
      --locale=ja --no-translator
  $ diff po/XX-with-header.po po/XX-without-header.po
  1,5d0
  < # Japanese translations for Git package.
  < # Copyright (C) 2022 THE Git'S COPYRIGHT HOLDER
  < # This file is distributed under the same license as the Git package.
  < # Automatically generated, 2022.
  < #
  8,11c3
  < "Project-Id-Version: Git\n"
  < "Report-Msgid-Bugs-To: Git Mailing List <git@xxxxxxxxxxxxxxx>\n"
  < "POT-Creation-Date: 2022-05-23 23:27+0800\n"
  < "PO-Revision-Date: 2022-05-23 23:27+0800\n"
  ---
  > "Project-Id-Version: git 2.36.0.7.g31429651cf.dirty\n"
  16c8
  < "Content-Type: text/plain; charset=UTF-8\n"
  ---
  > "Content-Type: text/plain; charset=ASCII\n"

Should we ignore this change?

>
> >>         $(call mkdir_p_parent_template)
> >>         $(QUIET_GEN)$(gen_pot_header)
> >>
> >> -po/git.pot: .build/pot/git.header $(LOCALIZED_ALL_GEN_PO)
> >> +# We go through this dance of having a prepared
> >> +# e.g. .build/pot/po/grep.c.po and copying it to
> >> +# .build/pot/to-cat/grep.c only because some IDEs (e.g. VSCode) pick
> >> +# up on the "real" extension for the purposes of auto-completion, even
> >> +# if the .build directiory is in .gitignore.
> >> +LOCALIZED_ALL_GEN_TO_CAT = $(LOCALIZED_ALL_GEN_PO:.build/pot/po/%.po=.build/pot/to-cat/%)
> >> +ifdef AGGRESSIVE_INTERMEDIATE
> >> +.INTERMEDIATE: $(LOCALIZED_ALL_GEN_TO_CAT)
> >> +endif
> >> +$(LOCALIZED_ALL_GEN_TO_CAT): .build/pot/to-cat/%: .build/pot/po/%.po
> >> +       $(call mkdir_p_parent_template)
> >> +       $(QUIET_GEN)cat $< >$@
> >
> > Copy each po file in ".build/pot/po/" to another location
> > ".build/pot/to-cat/", but without the ".po" extension.
> >
> > Let's take "date.c" as an example:
> >
> > 1. Copy "date.c" to an intermediate C source file
> > ".build/pot/po-munged/date.c" and replace PRItime with PRIuMAX in it.
> >
> > 2. Call xgettext to create  ".build/pot/po/date.c.po" from the
> > intermediate C source file ".build/pot/po-munged/date.c".
> >
> > 3. Optionally remove intermediate C source files like
> > ".build/pot/po-munged/date.c". To have two identical C source files in
> > the same worktree is not good, some software may break. So I choose to
> > remove them.
> >
> > 4. Copy the po file (".build/pot/po/date.c.po") created in step 2 to
> > an intermediate fake C source file ".build/pot/to-cat/date.c" which is
> > a file without the ".po" extension. Please note this intermediate fake
> > C source file ".build/pot/to-cat/date.c" is not a valid C file, but a
> > PO file.
> >
> > 5. Call msgcat to create "po/git.pot" from all the intermediate fake C
> > source files including  ".build/pot/to-cat/date.c".
> >
> > 6. Optionally remove all the intermediate fake C source files in
> > ".build/pot/to-cat/". I choose to remove them, because leave lots of
> > invalid C source files in worktree is not good.
> >
> > For example, ".build/pot/po/date.c.po" was created from
> >> +
> >> +po/git.pot: .build/pot/git.header $(LOCALIZED_ALL_GEN_TO_CAT)
> >>         $(QUIET_GEN)$(MSGCAT) $(MSGCAT_FLAGS) $^ >$@
> >
> > 7. "po/git.pot" depends on the intermediate fake C source files. If
> > any single C source file has been changed, will run step 6 to copy all
> > po files in ".build/pot/po" to corresponding fake C source files in
> > ".build/pot/to-cat/", if we choose to remove these intermediate fake C
> > source files.
> >
> > This implementation is too heavy to solve a trivial issue. I think we
> > can push forward this patch series and leave these comments in
> > "po/git.pot":
>
> If you find it too "heavy" & are trying to optimize it for some reason
> then that whole extra special-dance can be made conditional on
> MAKE_AVOID_REAL_EXTENSIONS_IN_GITIGNORED_FILES.
>
> But really, it's 15MB of .build/pot in my local HEAD with this fix-up,
> it's 1.4MB without it, but this whole thing just seems like premature
> optimization. Especially given:
>
>     $ git hyperfine -r 3 -L rev origin/master,HEAD~,HEAD,avar/Makefile-incremental-po-git-pot-rule~,avar/Makefile-incremental-po-git-pot-rule -p 'git clean -dxf; git reset --hard' 'make pot' --warmup 1
>     Benchmark 1: make pot' in 'origin/master
>       Time (mean ± σ):      1.970 s ±  0.014 s    [User: 1.683 s, System: 0.353 s]
>       Range (min … max):    1.955 s …  1.982 s    3 runs
>
>     Benchmark 2: make pot' in 'HEAD~
>       Time (mean ± σ):     931.3 ms ±   4.7 ms    [User: 3358.5 ms, System: 1088.7 ms]
>       Range (min … max):   927.0 ms … 936.3 ms    3 runs
>
>     Benchmark 3: make pot' in 'HEAD
>       Time (mean ± σ):      1.506 s ±  0.389 s    [User: 4.655 s, System: 1.363 s]
>       Range (min … max):    1.257 s …  1.955 s    3 runs
>
>     Benchmark 4: make pot' in 'avar/Makefile-incremental-po-git-pot-rule~
>       Time (mean ± σ):      1.015 s ±  0.002 s    [User: 3.615 s, System: 1.224 s]
>       Range (min … max):    1.013 s …  1.017 s    3 runs
>
>     Benchmark 5: make pot' in 'avar/Makefile-incremental-po-git-pot-rule
>       Time (mean ± σ):      1.014 s ±  0.008 s    [User: 3.540 s, System: 1.068 s]
>       Range (min … max):    1.007 s …  1.023 s    3 runs
>
>     Summary
>       'make pot' in 'HEAD~' ran
>         1.09 ± 0.01 times faster than 'make pot' in 'avar/Makefile-incremental-po-git-pot-rule'
>         1.09 ± 0.01 times faster than 'make pot' in 'avar/Makefile-incremental-po-git-pot-rule~'
>         1.62 ± 0.42 times faster than 'make pot' in 'HEAD'
>         2.12 ± 0.02 times faster than 'make pot' in 'origin/master'
>
> I.e. all of this is much faster than what we have on "master" now. My
> 22434ef36ae (Makefile: avoid "sed" on C files that don't need it,
> 2022-04-08) (avar/Makefile-incremental-po-git-pot-rule) is then just 10%
> slower than the "grep or xgettext", its "~" is the corresponding
> unoptimized.
>
> The HEAD here is with my fix-up, and HEAD~ is your series here.
>
> Anyway, if you really feel strongly about it let's go with your way of
> doing it.
>
> It just sounded like you weren't actually trying top optimize anything,
> but to work around your editor. So if we had a method to do that....

But for users who prefer to delete all the intermediate files in
".build/pot/to-cat" and ".build/pot/po-munged/", then they will get
performance penalty.

> ...except it seems you also care about making it much faster than
> "master" (or care about <20MB of disk space), which to be blunt seems a
> bit crazy to me :) Last I checked "make test" ended up creating ~1GB of
> data (not all at once, but in parallel testing a lot more than 10MB is
> often in play at once).
>
> As this was a pretty obscure target that I only expect CI, you,
> translators & me to run in practice a small difference in the initial
> run didn't seem to matter, especially as it's all an improvement over
> "master".
>
> Anyway, you do whatever you think is best with that :)
>
> >>         $ grep '#-#' po/git.pot
> >>         #. #-#-#-#-#  git-add--interactive.perl.po  #-#-#-#-#
> >>         #. #-#-#-#-#  add-patch.c.po  #-#-#-#-#
> >>         #. #-#-#-#-#  git-add--interactive.perl.po  #-#-#-#-#
> >>         #. #-#-#-#-#  branch.c.po  #-#-#-#-#
> >>         #. #-#-#-#-#  object-name.c.po  #-#-#-#-#
> >>         #. #-#-#-#-#  grep.c.po  #-#-#-#-#
>




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux