On Thu, Aug 23, 2018 at 8:45 AM Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote: > > This patch helps address the CPU cost of loading the index by creating > multiple threads to divide the work of loading and converting the cache > entries across all available CPU cores. > > It accomplishes this by having the primary thread loop across the index file > tracking the offset and (for V4 indexes) expanding the name. It creates a > thread to process each block of entries as it comes to them. Once the > threads are complete and the cache entries are loaded, the rest of the > extensions can be loaded and processed normally on the primary thread. > > Performance impact: > > read cache .git/index times on a synthetic repo with: > > 100,000 entries > FALSE TRUE Savings %Savings > 0.014798767 0.009580433 0.005218333 35.26% > > 1,000,000 entries > FALSE TRUE Savings %Savings > 0.240896533 0.1751243 0.065772233 27.30% > > read cache .git/index times on an actual repo with: > > ~3M entries > FALSE TRUE Savings %Savings > 0.59898098 0.4513169 0.14766408 24.65% > > Signed-off-by: Ben Peart <Ben.Peart@xxxxxxxxxxxxx> > --- > > Notes: > Base Ref: master > Web-Diff: https://github.com/benpeart/git/commit/67a700419b > Checkout: git fetch https://github.com/benpeart/git read-index-multithread-v1 && git checkout 67a700419b > > Documentation/config.txt | 8 ++ > config.c | 13 +++ > config.h | 1 + > read-cache.c | 218 ++++++++++++++++++++++++++++++++++----- > 4 files changed, 216 insertions(+), 24 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1c42364988..3344685cc4 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -899,6 +899,14 @@ relatively high IO latencies. When enabled, Git will do the > index comparison to the filesystem data in parallel, allowing > overlapping IO's. Defaults to true. > > +core.fastIndex:: > + Enable parallel index loading > ++ > +This can speed up operations like 'git diff' and 'git status' especially > +when the index is very large. When enabled, Git will do the index > +loading from the on disk format to the in-memory format in parallel. > +Defaults to true. "fast" is a non-descriptive word as we try to be fast in any operation? Maybe core.parallelIndexReading as that just describes what it turns on/off, without second guessing its effects? (Are there still computers with just a single CPU, where this would not make it faster? ;-)) > +int git_config_get_fast_index(void) > +{ > + int val; > + > + if (!git_config_get_maybe_bool("core.fastindex", &val)) > + return val; > + > + if (getenv("GIT_FASTINDEX_TEST")) > + return 1; We look at this env value just before calling this function, can be write it to only look at the evn variable once? > +++ b/config.h > @@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void); > extern int git_config_get_split_index(void); > extern int git_config_get_max_percent_split_change(void); > extern int git_config_get_fsmonitor(void); > +extern int git_config_get_fast_index(void); Oh. nd/no-extern did not cover config.h > > +#ifndef min > +#define min(a,b) (((a) < (b)) ? (a) : (b)) > +#endif We do not have a minimum function in the tree, except for xdiff/xmacros.h:29: XDL_MIN. I wonder what the rationale is for not having a MIN() definition, I think we discussed that on the list a couple times but the rationale escaped me. If we introduce a min/max macro, can we put it somewhere more prominent? (I would find it useful elsewhere) > +/* > +* Mostly randomly chosen maximum thread counts: we > +* cap the parallelism to online_cpus() threads, and we want > +* to have at least 7500 cache entries per thread for it to > +* be worth starting a thread. > +*/ > +#define THREAD_COST (7500) This reads very similar to preload-index.c THREAD_COST > + /* loop through index entries starting a thread for every thread_nr entries */ > + consumed = thread = 0; > + for (i = 0; ; i++) { > + struct ondisk_cache_entry *ondisk; > + const char *name; > + unsigned int flags; > + > + /* we've reached the begining of a block of cache entries, kick off a thread to process them */ beginning Thanks, Stefan