Re: [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc)

On Mon, Jul 23, 2018 at 5:50 PM Ben Peart <peartben@xxxxxxxxx> wrote:
> > Anyway, on to the actual discussion:
> >
> >> Here is a checkout command with tracing turned on to demonstrate where the
> >> time is spent.  Note, this is somewhat of a "best case" as I'm simply
> >> checking out the current commit:
> >>
> >> benpeart@gvfs-perf MINGW64 /f/os/src (official/rs_es_debug_dev)
> >> $ /usr/src/git/git.exe checkout
> >> 12:31:50.419016 read-cache.c:2006       performance: 1.180966800 s: read cache .git/index
> >> 12:31:51.184636 name-hash.c:605         performance: 0.664575200 s: initialize name hash
> >> 12:31:51.200280 preload-index.c:111     performance: 0.019811600 s: preload index
> >> 12:31:51.294012 read-cache.c:1543       performance: 0.094515600 s: refresh index
> >> 12:32:29.731344 unpack-trees.c:1358     performance: 33.889840200 s: traverse_trees
> >> 12:32:37.512555 read-cache.c:2541       performance: 1.564438300 s: write index, changed mask = 28
> >> 12:32:44.918730 unpack-trees.c:1358     performance: 7.243155600 s: traverse_trees
> >> 12:32:44.965611 diff-lib.c:527          performance: 7.374729200 s: diff-index
> >> Waiting for GVFS to parse index and update placeholder files...Succeeded
> >> 12:32:46.824986 trace.c:420             performance: 57.715656000 s: git command: 'C:\git-sdk-64\usr\src\git\git.exe' checkout
> >
> > What's the current state of the index before this checkout?
>
> This was after running "git checkout" multiple times so there was really
> nothing for git to do.

Hmm.. this means the cache-tree is fully valid, unless you have changes
in the index. We've been quite aggressive about repairing the cache-tree
since aecf567cbf (cache-tree: create/update cache-tree on checkout -
2014-07-05). If we have very good cache-tree records and still spend
33s in traverse_trees, maybe something else is going on.

> >> ODB cache
> >> =========
> >> Since traverse_trees() hits the ODB for each tree object (of which there are
> >> over 500K in this repo) I wrote and tested having an in-memory ODB cache
> >> that cached all tree objects.  This resulted in a > 50% hit ratio (largely
> >> due to the fact we traverse the tree twice during checkout) but resulted in
> >> only a minimal savings (1.3 seconds).
> >
> > In my experience, one major cost of object access is decompression, both
> > delta and zlib. Trees in particular tend to delta very well across
> > versions. We have a cache to try to reuse intermediate delta results,
> > but the default size is probably woefully undersized for your repository
> > (I know from past tests it's undersized a bit even for the linux
> > kernel).
> >
> > Try bumping core.deltaBaseCacheLimit to see if that has any impact. It's
> > 96MB by default.
> >
> > There may also be some possible work in making it more aggressive about
> > storing the intermediate results. I seem to recall from past
> > explorations that it doesn't keep everything, and I don't know if its
> > heuristics have ever been proven sane.
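
For anyone wanting to try the suggestion above, a minimal sketch of
setting the limit follows. The scratch repository and the 1g value are
assumptions for illustration only; in practice the `git config` line
would be run inside the affected working tree.

```shell
# Scratch repository so the snippet is safe to try anywhere (assumption:
# in real use, run the config command inside the slow repository instead).
repo=$(mktemp -d)
git init -q "$repo"

# 1g is an arbitrary trial value, not a figure recommended in this thread.
git -C "$repo" config core.deltaBaseCacheLimit 1g

# Read the setting back to confirm it was stored.
limit=$(git -C "$repo" config --get core.deltaBaseCacheLimit)
echo "core.deltaBaseCacheLimit=$limit"
```

Re-running the checkout with GIT_TRACE_PERFORMANCE=1 set should then
show whether the traverse_trees timings move at all.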

Could we be a bit more flexible about the cache size? Say, if we know
there's 8 GB of memory still available, we should be able to use at
least 1 GB (and that would be done automatically, without tinkering
with config).
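
As a rough illustration of that kind of heuristic (git does not do this
today), the sketch below derives a limit from available memory. The
helper name pick_delta_cache_limit, the one-eighth ratio, and the 96 MiB
floor are all assumptions chosen to match the 8 GB -> 1 GB example.

```shell
# Hypothetical helper, not part of git: choose a delta base cache limit
# from the amount of memory currently available.
#   $1 = available memory in KiB
# Prints a byte count that could be passed to core.deltaBaseCacheLimit.
pick_delta_cache_limit() {
    avail_kib=$1
    floor=$((96 * 1024 * 1024))          # do not go below the 96 MiB default
    limit=$((avail_kib * 1024 / 8))      # assumed ratio: 1/8 of available memory
    if [ "$limit" -lt "$floor" ]; then
        limit=$floor
    fi
    echo "$limit"
}

# Possible use on Linux (assumes /proc/meminfo exposes MemAvailable):
#   avail=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
#   git config core.deltaBaseCacheLimit "$(pick_delta_cache_limit "$avail")"
```

With 8 GB available (8388608 KiB) the helper yields 1073741824 bytes,
i.e. 1 GB; on a nearly full machine it falls back to the 96 MiB floor.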
-- 
Duy



