On 8/23/2018 1:31 PM, Stefan Beller wrote:
On Thu, Aug 23, 2018 at 8:45 AM Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
This patch helps address the CPU cost of loading the index by creating
multiple threads to divide the work of loading and converting the cache
entries across all available CPU cores.
It accomplishes this by having the primary thread loop across the index file,
tracking the offset and (for V4 indexes) expanding the name, and kicking off a
thread to process each block of entries as it reaches it. Once those threads
have completed and the cache entries are loaded, the extensions can be loaded
and processed normally on the primary thread.
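(To illustrate the shape of this scheme, here is a rough, self-contained
sketch; the struct layouts, block size, and helper names below are made up
for illustration and are not the patch's own. In the real patch, the primary
thread's walk is what computes each block's starting offset and expands the
V4 names before a worker is started.)

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_ENTRIES 100000
#define ENTRIES_PER_THREAD 7500
#define NR_BLOCKS ((NR_ENTRIES + ENTRIES_PER_THREAD - 1) / ENTRIES_PER_THREAD)

struct ondisk_entry { int data; };      /* stand-in for the on-disk format */
struct cache_entry  { int converted; }; /* stand-in for the in-memory form */

struct load_block {
	const struct ondisk_entry *src;
	struct cache_entry *dst;
	int nr;
};

/* worker: convert one block of on-disk entries to cache entries */
static void *load_block_thread(void *arg)
{
	struct load_block *b = arg;
	int i;

	for (i = 0; i < b->nr; i++)
		b->dst[i].converted = b->src[i].data;
	return NULL;
}

int main(void)
{
	struct ondisk_entry *ondisk = calloc(NR_ENTRIES, sizeof(*ondisk));
	struct cache_entry *cache = calloc(NR_ENTRIES, sizeof(*cache));
	pthread_t threads[NR_BLOCKS];
	struct load_block blocks[NR_BLOCKS];
	int i, nr = 0;

	/*
	 * Primary thread: walk the entries and kick off a worker for
	 * each block.  (In the patch this walk is also what finds each
	 * entry's offset and expands V4 prefix-compressed names.)
	 */
	for (i = 0; i < NR_ENTRIES; i += ENTRIES_PER_THREAD) {
		struct load_block *b = &blocks[nr];

		b->src = ondisk + i;
		b->dst = cache + i;
		b->nr = (NR_ENTRIES - i < ENTRIES_PER_THREAD) ?
			NR_ENTRIES - i : ENTRIES_PER_THREAD;
		pthread_create(&threads[nr], NULL, load_block_thread, b);
		nr++;
	}

	/* once all workers finish, extensions would be processed here */
	for (i = 0; i < nr; i++)
		pthread_join(threads[i], NULL);

	printf("loaded %d entries in %d blocks\n", NR_ENTRIES, nr);
	free(ondisk);
	free(cache);
	return 0;
}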
Performance impact:

read cache .git/index times on a synthetic repo
(FALSE = parallel loading disabled, TRUE = enabled):

100,000 entries
    FALSE         TRUE          Savings       %Savings
    0.014798767   0.009580433   0.005218333   35.26%

1,000,000 entries
    FALSE         TRUE          Savings       %Savings
    0.240896533   0.1751243     0.065772233   27.30%

read cache .git/index times on an actual repo with ~3M entries:

    FALSE         TRUE          Savings       %Savings
    0.59898098    0.4513169     0.14766408    24.65%
Signed-off-by: Ben Peart <Ben.Peart@xxxxxxxxxxxxx>
---
Notes:
Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/67a700419b
Checkout: git fetch https://github.com/benpeart/git read-index-multithread-v1 && git checkout 67a700419b
Documentation/config.txt | 8 ++
config.c | 13 +++
config.h | 1 +
read-cache.c | 218 ++++++++++++++++++++++++++++++++++-----
4 files changed, 216 insertions(+), 24 deletions(-)
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1c42364988..3344685cc4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -899,6 +899,14 @@ relatively high IO latencies. When enabled, Git will do the
index comparison to the filesystem data in parallel, allowing
overlapping IO's. Defaults to true.
+core.fastIndex::
+ Enable parallel index loading
++
+This can speed up operations like 'git diff' and 'git status' especially
+when the index is very large. When enabled, Git will do the index
+loading from the on disk format to the in-memory format in parallel.
+Defaults to true.
"fast" is a non-descriptive word as we try to be fast in any operation?
Maybe core.parallelIndexReading as that just describes what it
turns on/off, without second guessing its effects?
(Are there still computers with just a single CPU, where this would not
make it faster? ;-))
How about core.parallelReadIndex? Slightly shorter and matches the
function names better.
+int git_config_get_fast_index(void)
+{
+ int val;
+
+ if (!git_config_get_maybe_bool("core.fastindex", &val))
+ return val;
+
+ if (getenv("GIT_FASTINDEX_TEST"))
+ return 1;
We look at this env value just before calling this function;
can we rewrite it so that it only looks at the env variable once?
Sure, I didn't like the fact that it was called twice but didn't get
around to cleaning it up.
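For what it's worth, one possible shape for that cleanup (just a sketch; the
"-1 means unset" sentinel and the ordering are assumptions, not the patch's
final form) is to finish the accessor so it is the only place that consults
the environment:

int git_config_get_fast_index(void)
{
	int val;

	if (!git_config_get_maybe_bool("core.fastindex", &val))
		return val;

	if (getenv("GIT_FASTINDEX_TEST"))
		return 1;

	return -1;	/* unset: caller applies the documented default (true) */
}

The caller would then drop its own getenv("GIT_FASTINDEX_TEST") check and
rely solely on this accessor.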
+++ b/config.h
@@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void);
extern int git_config_get_split_index(void);
extern int git_config_get_max_percent_split_change(void);
extern int git_config_get_fsmonitor(void);
+extern int git_config_get_fast_index(void);
Oh. nd/no-extern did not cover config.h
+#ifndef min
+#define min(a,b) (((a) < (b)) ? (a) : (b))
+#endif
We do not have a minimum function in the tree,
except for xdiff/xmacros.h:29: XDL_MIN.
I wonder what the rationale is for not having a MIN()
definition; I think we discussed that on the list a couple
of times, but the rationale escaped me.
If we introduce a min/max macro, can we put it somewhere
more prominent? (I would find it useful elsewhere)
I'll avoid that particular rabbit hole and just remove the min macro
definition. ;-)
+/*
+ * Mostly randomly chosen maximum thread counts: we
+ * cap the parallelism to online_cpus() threads, and we want
+ * to have at least 7500 cache entries per thread for it to
+ * be worth starting a thread.
+ */
+#define THREAD_COST (7500)
This reads very similar to preload-index.c THREAD_COST
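(For readers comparing with preload-index.c: the way such a cost constant
typically gets applied is roughly the following; the function name and exact
fallback here are made up for illustration, not taken from the patch.
online_cpus() is the existing helper from thread-utils.h.)

/*
 * Sketch only: derive a thread count from the number of cache
 * entries using THREAD_COST, capped at online_cpus(), and fall
 * back to a single thread when there is too little work to
 * amortize the thread start-up cost.
 */
static int index_load_threads(unsigned int nr_entries)
{
	int nr_threads = nr_entries / THREAD_COST;

	if (nr_threads > online_cpus())
		nr_threads = online_cpus();
	if (nr_threads < 2)
		nr_threads = 1;	/* do the load on the primary thread */
	return nr_threads;
}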
+ /* loop through index entries starting a thread for every thread_nr entries */
+ consumed = thread = 0;
+ for (i = 0; ; i++) {
+ struct ondisk_cache_entry *ondisk;
+ const char *name;
+ unsigned int flags;
+
+ /* we've reached the begining of a block of cache entries, kick off a thread to process them */
beginning
Thanks
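One more aside on why the primary thread has to do this walk itself: in a V4
index the pathnames are prefix-compressed against the previous entry, so block
boundaries and names can only be discovered sequentially. Roughly (the helper
name below is made up; strbuf_remove()/strbuf_addstr() are the existing strbuf
API):

/*
 * Illustrative sketch: a V4 on-disk name is "strip N bytes from the
 * end of the previous entry's name, then append this NUL-terminated
 * suffix", so each name depends on the one before it.
 */
static void expand_v4_name(struct strbuf *previous_name,
			   size_t strip, const char *suffix)
{
	strbuf_remove(previous_name, previous_name->len - strip, strip);
	strbuf_addstr(previous_name, suffix);
}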
Thanks,
Stefan