Re: [TDF infra announce/request] reduce downstream traffic by 50× on git clones with `git config protocol.version 2`

On Sat, 04 Jul 2020 at 06:26:05 +0200, Guilhem Moulin wrote:
> TL;DR: run `git config protocol.version 2` in your local clones of the
> core and online repositories.  That should reduce downstream traffic by
> over 50x (from 9MiB to 150kiB) in no-op `git fetch` commands.
> Client-side this only applies to git ≥2.18.0.

Quick follow up to summarize some IRC discussions:

 * `git push` commands have been reported to be much slower and occasionally
   yield an upload of several MiB (even for single patches) with version 2 of
   the wire protocol.  AFAICT this is not a regression: the new wire protocol
   has little to no effect on pushes.  Please make sure to consistently call
   `git fetch $REMOTE` immediately before `git push $REMOTE` (on the same
   remote).

   To avoid having to decrypt your SSH key twice, you can set up the OpenSSH
   authentication agent, or simply use an anonymous scheme for fetches.
   Assuming the remote name to use for pushing is ‘logerrit’, the following
   should do:

       git config remote.logerrit.pushurl ssh://logerrit/core
       git config remote.logerrit.url https://git.libreoffice.org/core

   Fetching from a different remote name does *not* help (even when it has the
   same URL).  The problem here is that the client uses git-merge-base(1) to
   guess which objects are unknown to the server and thus need to be uploaded.
   If someone has pushed to the target branch since you last fetched, and if you
   don't run `git fetch` again to update ‘refs/remotes/$REMOTE/’, then git won't
   see that only a few objects need to be uploaded.  It will instead upload a
   large pack (potentially several MiB) containing everything since the last
   matching reference (likely the last tag or branch-point).

   This can indeed be worse with the new wire protocol: with the old one
   git-merge-base(1) had 300k+ references at hand in order to find a common
   ancestor, so in practice the pack could stop before the last tag or
   branch-point.  But that was pure luck, hence from my perspective this is
   not a regression: please always call `git fetch $REMOTE` before `git push
   $REMOTE`.
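
   The fetch-before-push habit can be wrapped in a small shell function.  This
   is only a sketch: `fetch_then_push` is a made-up name, not a git command,
   and any refspec you pass to it is your own choice.

```shell
# Hypothetical helper (not part of git): refresh refs/remotes/$remote/ first,
# and push only if that fetch succeeded.
fetch_then_push() {
    remote=$1
    shift
    # Update the remote-tracking refs so the client knows what the server has.
    git fetch "$remote" || return 1
    # Remaining arguments (refspecs, options) are handed to `git push` as-is.
    git push "$remote" "$@"
}
```

   With the ‘logerrit’ remote from above, the call would look like
   `fetch_then_push logerrit <refspec>`.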


 * Changing protocol.version on a given repository (i.e., `git config
   protocol.version 2` without the --global flag) doesn't appear to affect
   submodules.  This can be verified by running

       GIT_TRACE_PACKET=1 git submodule update --remote $SUBMODULE

   and checking that the handshake carries no protocol version number.  To
   change the wire protocol version in submodules, you'll need to run

       git config -f .git/modules/$SUBMODULE/config protocol.version 2
   
   or simply change it globally:

       git config --global protocol.version 2
  
   That being said, the new wire protocol really shines for remotes with a huge
   number of references, which our submodules don't have, so for these the
   improvement will be marginal (it won't hurt though).
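
   For those who prefer not to touch the global configuration, the
   per-submodule setting can be applied in one go.  A minimal sketch, assuming
   the submodules are already initialized; `enable_protocol_v2` is a
   hypothetical helper name:

```shell
# Hypothetical helper: enable wire protocol v2 in a superproject and in every
# initialized submodule below it.  Takes the repository path (default: ".").
enable_protocol_v2() {
    repo=${1:-.}
    # Per-repository setting for the superproject itself.
    git -C "$repo" config protocol.version 2 || return 1
    # `git submodule foreach` runs the command inside each initialized
    # submodule, so `git config` writes to that submodule's own config file.
    git -C "$repo" submodule foreach --recursive 'git config protocol.version 2'
}
```

   Submodules that are initialized later are not covered and need the same
   treatment again.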


 * git upstream had an attempt at bumping the default version of the wire
   protocol.  It was bumped to v2 in git 2.26, and demoted back to v0 in git
   2.27 as it “turned out to have some remaining rough edges.”

       https://git.kernel.org/pub/scm/git/git.git/commit/?id=684ceae32dae726c6a5c693b257b156926aba8b7
       https://git.kernel.org/pub/scm/git/git.git/commit/?id=11c7f2a30b9dadcccc7bde66a34e0cb0cb5cf52c

   It seems to work fine (and has done so for 2.5 years now) at Google for
   Chromium and Gerrit though.  I guess git upstream will bump the default
   version again at some point, but given the huge performance gain I see no
   reason *not* to call `git config protocol.version 2` in core and online.
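
   Since the default has gone back and forth upstream, and the original
   announcement notes that the setting only applies client-side to git ≥2.18.0,
   a script that sets it may want to check the installed version first.  A
   rough sketch; both function names are made up for illustration, and the
   parsing assumes `git --version` prints the version as its third word:

```shell
# Hypothetical helper: succeed if dotted version $1 is >= dotted version $2.
# Only the first three numeric components are compared.
version_at_least() {
    # Pad both versions so that fields 1..3 always exist.
    have="$1.0.0"
    want="$2.0.0"
    for i in 1 2 3; do
        h=$(printf '%s\n' "$have" | cut -d. -f"$i")
        w=$(printf '%s\n' "$want" | cut -d. -f"$i")
        # Drop any non-numeric suffix and treat an empty component as 0.
        h=${h%%[!0-9]*}; h=${h:-0}
        w=${w%%[!0-9]*}; w=${w:-0}
        if [ "$h" -gt "$w" ]; then return 0; fi
        if [ "$h" -lt "$w" ]; then return 1; fi
    done
    return 0
}

# Hypothetical helper: succeed if the installed git is at least 2.18.0.
git_supports_protocol_v2() {
    v=$(git --version | awk '{print $3}')
    version_at_least "$v" 2.18.0
}
```

   One possible use: `git_supports_protocol_v2 && git config protocol.version 2`.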

-- 
Guilhem.

PS: Please preserve the recipient list in replying.

_______________________________________________
LibreOffice mailing list
LibreOffice@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/libreoffice
