git@xxxxxxxxxxxxxxxxx writes:

> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
>
> This patch series is a performance optimization for
> lazy_init_name_hash() in name-hash.c on very large
> repositories.
>
> This change allows lazy_init_name_hash() to optionally
> use multiple threads when building the the_index.dir_hash
> and the_index.name_hash hashmaps.  The original code path
> has been preserved and is used when the repo is small or
> the system does not have sufficient CPUs.
>
> A helper command (t/helper/test-lazy-init-name-hash) was
> created to demonstrate performance differences and validate
> output.  For example, use the '-p' option to compare both
> code paths on a large repo.
>
> During our testing on the Windows source tree (3.1M
> files, 500K folders, 450MB index), this change reduced
> the runtime of lazy_init_name_hash() from 1.4 to 0.27
> seconds.
>
> This patch series replaces my earlier
>     * jh/memihash-opt (2017-02-17) 5 commits
> patch series.

Ahh.  I was scratching my head trying to remember why some of these
look so familiar.  [PATCH v2 ...] would have helped.

Thank you for an update.

>
> Jeff Hostetler (6):
>   name-hash: specify initial size for istate.dir_hash table
>   hashmap: allow memihash computation to be continued
>   hashmap: Add disallow_rehash setting
>   name-hash: perf improvement for lazy_init_name_hash
>   name-hash: add test-lazy-init-name-hash
>   name-hash: add perf test for lazy_init_name_hash
>
>  Makefile                            |   1 +
>  cache.h                             |   1 +
>  hashmap.c                           |  29 ++-
>  hashmap.h                           |  25 ++
>  name-hash.c                         | 490 +++++++++++++++++++++++++++++++++++-
>  t/helper/test-lazy-init-name-hash.c | 264 +++++++++++++++++++
>  t/perf/p0004-lazy-init-name-hash.sh |  19 ++
>  7 files changed, 820 insertions(+), 9 deletions(-)
>  create mode 100644 t/helper/test-lazy-init-name-hash.c
>  create mode 100644 t/perf/p0004-lazy-init-name-hash.sh
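
For readers following along, the idea behind "allow memihash computation to be continued" can be sketched roughly as below. This is only an illustration, not the code from the series: it assumes a byte-at-a-time, FNV-1 style case-insensitive hash, and the names ihash(), ihash_cont() and ihash_selfcheck() are hypothetical. Because the hash state after a directory prefix can be saved and extended, the hash of each full path under that directory can be computed without rehashing the prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Standard 32-bit FNV offset basis and prime. */
#define FNV32_BASE  ((unsigned int)0x811c9dc5)
#define FNV32_PRIME ((unsigned int)0x01000193)

/*
 * Hypothetical helper: continue a case-insensitive FNV-1 style hash
 * from a previously computed state 'seed' over 'len' more bytes.
 */
static unsigned int ihash_cont(unsigned int seed, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	unsigned int hash = seed;

	while (len--) {
		unsigned int c = *p++;
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';	/* fold ASCII lower case to upper case */
		hash = (hash * FNV32_PRIME) ^ c;
	}
	return hash;
}

/* Hypothetical helper: hash a whole buffer starting from the FNV basis. */
static unsigned int ihash(const void *buf, size_t len)
{
	return ihash_cont(FNV32_BASE, buf, len);
}

/* Usage: hash the directory prefix once, then extend it per entry. */
static void ihash_selfcheck(void)
{
	unsigned int dir = ihash("Dir/", 4);

	assert(ihash_cont(dir, "File", 4) == ihash("dir/file", 8));
}
```

The design point is that the per-byte update depends only on the running state and the next byte, so splitting the input at any position and resuming gives the same result as hashing it in one pass.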