When working directories get big, checkout times start to suffer. Even with
GVFS virtualization (which limits git to updating only those files that have
been changed locally) we're seeing a P50 checkout time of 31 seconds and a
P80 time of 43 seconds. Here is a checkout command with tracing turned on to
demonstrate where the time is spent. Note, this is somewhat of a "best case"
as I'm simply checking out the current commit:

benpeart@gvfs-perf MINGW64 /f/os/src (official/rs_es_debug_dev)
$ /usr/src/git/git.exe checkout
12:31:50.419016 read-cache.c:2006       performance: 1.180966800 s: read cache .git/index
12:31:51.184636 name-hash.c:605         performance: 0.664575200 s: initialize name hash
12:31:51.200280 preload-index.c:111     performance: 0.019811600 s: preload index
12:31:51.294012 read-cache.c:1543       performance: 0.094515600 s: refresh index
12:32:29.731344 unpack-trees.c:1358     performance: 33.889840200 s: traverse_trees
12:32:37.512555 read-cache.c:2541       performance: 1.564438300 s: write index, changed mask = 28
12:32:44.918730 unpack-trees.c:1358     performance: 7.243155600 s: traverse_trees
12:32:44.965611 diff-lib.c:527          performance: 7.374729200 s: diff-index
Waiting for GVFS to parse index and update placeholder files...Succeeded
12:32:46.824986 trace.c:420             performance: 57.715656000 s: git command: 'C:\git-sdk-64\usr\src\git\git.exe' checkout

Clearly, most of the time (41 seconds) is spent in the traverse_trees() code,
so the question is: how can we significantly speed up that portion of the
command? I investigated a few options with limited success:

ODB cache
=========
Since traverse_trees() hits the ODB for each tree object (of which there are
over 500K in this repo), I wrote and tested an in-memory ODB cache that
caches all tree objects. This resulted in a >50% hit ratio (largely because
we traverse the tree twice during checkout) but saved only about 1.3 seconds.

Tree Graph File
===============
I also considered storing the commit's tree in an alternate structure that is
faster to load/parse (a la the commit-graph), but the cache results above,
along with the negligible impact of running checkout back to back (thus
ensuring the objects were already in my file system cache), made me believe
this would not result in much savings. MIDX has already helped out here,
given we end up with a lot of pack files of commits and trees.

Sparse tree traversal
=====================
We've sped up other parts of git by taking advantage of the existing
sparse-checkout/excludes logic to limit the files git has to consider to
those that have been modified by the user locally. I haven't been able to
think of a way to take advantage of that in unpack_trees(), because when you
are merging n commits, a change/conflict can occur in any tree object, so
they must all be traversed. If I'm missing something here and there _is_ a
way to entirely skip large parts of the tree, please let me know!

Please note that we're already limiting the files git needs to update in the
working directory via sparse-checkout/excludes, but the other/merge logic
still executes for the entire tree whether there are files to update or not.

Multi-threading unpack_trees()
==============================
The current model of unpack_trees() is that a single thread recursively
traverses each tree object as it comes across it. One thought I had was to
multi-thread the traversal so that each tree object could be processed in
parallel.
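To make the shape of that idea concrete, here is a minimal, self-contained
sketch of a mutex-based MPMC work queue driving a pool of worker threads.
The mpmcq_* names and the tree_job struct are hypothetical stand-ins for
illustration only, not the API in the attached patches:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-tree traversal state. */
struct tree_job {
	struct tree_job *next;
	int depth;
};

/* A minimal mutex-based MPMC queue; not the mpmcqueue.h from the patches. */
struct mpmcq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct tree_job *head;
	int pending;	/* jobs queued or still being processed */
};

static void mpmcq_push(struct mpmcq *q, struct tree_job *job)
{
	pthread_mutex_lock(&q->lock);
	job->next = q->head;
	q->head = job;
	q->pending++;
	pthread_cond_signal(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

/* Block until a job is available; return NULL once all work is done. */
static struct tree_job *mpmcq_pop(struct mpmcq *q)
{
	struct tree_job *job;

	pthread_mutex_lock(&q->lock);
	while (!q->head && q->pending)
		pthread_cond_wait(&q->cond, &q->lock);
	if (!q->head) {
		pthread_mutex_unlock(&q->lock);
		return NULL;	/* pending == 0: traversal is complete */
	}
	job = q->head;
	q->head = job->next;
	pthread_mutex_unlock(&q->lock);
	return job;
}

/* Mark one job finished; wake idle workers once everything is done. */
static void mpmcq_done(struct mpmcq *q)
{
	pthread_mutex_lock(&q->lock);
	if (!--q->pending)
		pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

static void *worker(void *arg)
{
	struct mpmcq *q = arg;
	struct tree_job *job;

	while ((job = mpmcq_pop(q))) {
		/*
		 * The real implementation would parse the tree object here
		 * and push a new job for each subtree it finds. This demo
		 * just fakes a tree that is two levels deep.
		 */
		printf("processing tree at depth %d\n", job->depth);
		if (job->depth < 2) {
			struct tree_job *sub = malloc(sizeof(*sub));
			sub->depth = job->depth + 1;
			mpmcq_push(q, sub);
		}
		free(job);
		mpmcq_done(q);	/* must come after pushing subtrees */
	}
	return NULL;
}

int main(void)
{
	struct mpmcq q;
	pthread_t threads[4];
	struct tree_job *root = malloc(sizeof(*root));
	int i;

	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.cond, NULL);
	q.head = NULL;
	q.pending = 0;
	root->depth = 0;
	mpmcq_push(&q, root);
	for (i = 0; i < 4; i++)
		pthread_create(&threads[i], NULL, worker, &q);
	for (i = 0; i < 4; i++)
		pthread_join(threads[i], NULL);
	return 0;
}

The pending counter tracks jobs that are queued or in flight, which is what
lets an idle worker distinguish "queue momentarily empty" from "traversal
complete" and exit cleanly.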
To test this idea out, I wrote an unbounded Multi-Producer-Multi-Consumer
(MPMC) queue and then wrote a traverse_trees_parallel() function that adds
any new tree objects into the queue, where they can be processed by a pool
of worker threads. Each thread wakes up when there is work in the queue,
removes a tree object, and processes it, adding any additional tree objects
it finds.

Multi-threading anything in git is fraught with challenges, as much of the
code base is not thread safe. To make progress, I wrapped mutexes around the
code paths that were not thread safe. The end result is that I won't
initially get much parallelization (due to mutexes around all the expensive
work), but at least I can test out the idea and resolve any other issues
with switching from a serial to a parallel implementation. If this works
out, I can update more of the code paths to be thread safe and/or move to
more fine-grained mutexes around those paths that are difficult to make
thread safe.

Final thoughts
==============
The attached set of patches don't work! They succeed for some commands, but
I'm including them only to make explicit what I'm currently investigating.
I'd be very interested in design feedback, but formatting/spelling/
whitespace errors are less useful at this early stage in the investigation.

When I brought up this idea with some other git contributors, they mentioned
that multi-threading unpack_trees() had been discussed a few years ago on
the list but that the idea was discarded. They couldn't remember exactly why
it was discarded, and none of us have been able to find the email threads
from that earlier discussion. As a result, I decided to write up this RFC
and see if the greater git community has ideas, suggestions, or more
background/history on whether this is a reasonable path to pursue, or if
there are other/better ideas on how to speed up checkout, especially on
large repos.

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/a022a91ceb
Checkout: git fetch https://github.com/benpeart/git unpacktrees-v1 && git checkout a022a91ceb

Ben Peart (3):
  add unbounded Multi-Producer-Multi-Consumer queue
  add performance tracing around traverse_trees() in unpack_trees()
  Add initial parallel version of unpack_trees()

 Makefile       |   1 +
 cache.h        |   1 +
 config.c       |   5 +
 environment.c  |   1 +
 mpmcqueue.c    |  49 ++++++++
 mpmcqueue.h    |  80 +++++++++++++
 unpack-trees.c | 314 ++++++++++++++++++++++++++++++++++++++++++++++++-
 unpack-trees.h |  30 +++++
 8 files changed, 480 insertions(+), 1 deletion(-)
 create mode 100644 mpmcqueue.c
 create mode 100644 mpmcqueue.h

base-commit: e3331758f12da22f4103eec7efe1b5304a9be5e9
--
2.17.0.gvfs.1.123.g449c066