On Wed, Aug 29, 2018 at 5:25 PM Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
>
> This patch helps address the CPU cost of loading the index by loading
> the cache extensions on a worker thread in parallel with loading the cache
> entries.
>
> This is possible because the current extensions don't access the cache
> entries in the index_state structure so are OK that they don't all exist
> yet.
>
> The CACHE_EXT_TREE, CACHE_EXT_RESOLVE_UNDO, and CACHE_EXT_UNTRACKED
> extensions don't even get a pointer to the index so don't have access to the
> cache entries.
>
> CACHE_EXT_LINK only uses the index_state to initialize the split index.
> CACHE_EXT_FSMONITOR only uses the index_state to save the fsmonitor last
> update and dirty flags.
>
> I used p0002-read-cache.sh to generate some performance data on the
> cumulative impact:
>
> 100,000 entries
>
> Test                                HEAD~3           HEAD~2
> ---------------------------------------------------------------------------
> read_cache/discard_cache 1000 times 14.08(0.01+0.10) 9.72(0.03+0.06) -31.0%

This is misleading (if I read it correctly). 1/3 already drops
execution time down to 9.81, so this patch alone only has about 6%
saving.

Have you measured how much time is spent on loading extensions in
single threaded mode? I'm just curious if we could hide that
completely (provided that we have enough cores) while we load the
index.
-- 
Duy
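
For readers following the thread, the approach in the quoted patch
description can be sketched roughly as below. This is a toy illustration
under stated assumptions, not the code from the patch: struct toy_index,
load_extensions(), load_entries() and load_index() are hypothetical
names, and the extension count is a placeholder value. It relies on the
property the quoted text states, namely that the extension loader never
reads the cache entries.

```c
/* Sketch only: toy stand-ins, not git's read-cache.c. */
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the index; field names are illustrative. */
struct toy_index {
	int nr_entries;     /* written only by the main thread */
	int nr_extensions;  /* written only by the worker thread */
};

/* Hypothetical extension loader: touches extension state only. */
static void *load_extensions(void *arg)
{
	struct toy_index *idx = arg;

	idx->nr_extensions = 5; /* placeholder for real parsing work */
	return NULL;
}

/* Hypothetical entry loader: touches cache-entry state only. */
static void load_entries(struct toy_index *idx, int n)
{
	idx->nr_entries = n; /* placeholder for real parsing work */
}

/* Returns 0 on success, or the error code from pthread_join(). */
static int load_index(struct toy_index *idx, int n)
{
	pthread_t worker;

	if (pthread_create(&worker, NULL, load_extensions, idx)) {
		/* No thread available: fall back to loading sequentially. */
		load_entries(idx, n);
		load_extensions(idx);
		return 0;
	}

	/* Entries load here while the worker handles the extensions. */
	load_entries(idx, n);

	/* Both halves must be complete before the index is used. */
	return pthread_join(worker, NULL);
}
```

The two loaders write to disjoint fields, so no locking is needed in
this sketch. Timing the sequential fallback path separately would be one
way to get the single-threaded extension cost asked about above.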