Jeff Hostetler <Jeff.Hostetler@xxxxxxxxxxxxx> writes:

> I looked at doing this, but I didn't think the complexity and overhead to
> forward search for peers at the current level didn't warrant the limited gains.

It seems that I wasn't clear what I meant.  I didn't mean anything
complex like what you said.  Just something simple, like this on top
of yours, that passes and compares with only the previous one.  I do
not know if that gives any gain, though ;-).

 cache.h         |  2 +-
 name-hash.c     | 11 +++++++++--
 preload-index.c |  4 +++-
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 390aa803df..bd2980f6e3 100644
--- a/cache.h
+++ b/cache.h
@@ -233,7 +233,7 @@ struct cache_entry {
 #error "CE_EXTENDED_FLAGS out of range"
 #endif
 
-void precompute_istate_hashes(struct cache_entry *ce);
+void precompute_istate_hashes(struct cache_entry *ce, struct cache_entry *prev);
 
 /* Forward structure decls */
 struct pathspec;
diff --git a/name-hash.c b/name-hash.c
index f95054f44c..5e09b79170 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -300,7 +300,7 @@ void free_name_hash(struct index_state *istate)
  * non-skip-worktree items (since status should not observe skipped items), but
  * because lazy_init_name_hash() hashes everything, we force it here.
  */
-void precompute_istate_hashes(struct cache_entry *ce)
+void precompute_istate_hashes(struct cache_entry *ce, struct cache_entry *prev)
 {
 	int namelen = ce_namelen(ce);
 
@@ -312,7 +312,14 @@ void precompute_istate_hashes(struct cache_entry *ce)
 		ce->precomputed_hash.root_entry = 1;
 	} else {
 		namelen--;
-		ce->precomputed_hash.dir = memihash(ce->name, namelen);
+
+		if (prev &&
+		    prev->precomputed_hash.initialized &&
+		    namelen <= ce_namelen(prev) &&
+		    !memcmp(ce->name, prev->name, namelen))
+			ce->precomputed_hash.dir = prev->precomputed_hash.dir;
+		else
+			ce->precomputed_hash.dir = memihash(ce->name, namelen);
 		ce->precomputed_hash.name = memihash_continue(
 			ce->precomputed_hash.dir, ce->name + namelen,
 			ce_namelen(ce) - namelen);
diff --git a/preload-index.c b/preload-index.c
index 602737f9d0..784378ffac 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -37,6 +37,7 @@ static void *preload_thread(void *_data)
 	struct thread_data *p = _data;
 	struct index_state *index = p->index;
 	struct cache_entry **cep = index->cache + p->offset;
+	struct cache_entry *previous = NULL;
 	struct cache_def cache = CACHE_DEF_INIT;
 
 	nr = p->nr;
@@ -47,7 +48,8 @@ static void *preload_thread(void *_data)
 		struct cache_entry *ce = *cep++;
 		struct stat st;
 
-		precompute_istate_hashes(ce);
+		precompute_istate_hashes(ce, previous);
+		previous = ce;
 		if (ce_stage(ce))
 			continue;

> (I was just looking at the complexity of clear_ce_flags_1() in unpack-trees.c
> and how hard it has to look to find the end of the current directory and the
> effect that that has on the recursion and it felt like too much work for the
> potential gain.)
>
> Whereas remembering the previous one was basically free.  Granted, it only
> helps us for adjacent files in the index, so it's not perfect, but gives us the
> best bang for the buck.
>
> Jeff
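
For readers who want to try the "compare with only the previous one" idea outside of git, here is a minimal standalone sketch.  Everything named toy_* is hypothetical and is not git's code: toy_ihash() and toy_ihash_continue() are simple case-insensitive FNV-style stand-ins for memihash() and memihash_continue(), and struct toy_entry only mimics the precomputed_hash fields discussed above.  The sketch also records the length of the directory prefix, which is an assumption of this example; a prefix match alone would not tell "a/b/c" followed by "a/d" apart from two files in the same directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FNV32_BASE  0x811c9dc5u
#define TOY_FNV32_PRIME 0x01000193u

/*
 * Hypothetical stand-in for memihash_continue(): fold len bytes of buf
 * into an existing hash value, ignoring ASCII case.
 */
unsigned int toy_ihash_continue(unsigned int hash, const char *buf, size_t len)
{
	while (len--) {
		unsigned int c = (unsigned char)*buf++;
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';
		hash = (hash * TOY_FNV32_PRIME) ^ c;
	}
	return hash;
}

/* Hypothetical stand-in for memihash(): hash a buffer from scratch. */
unsigned int toy_ihash(const char *buf, size_t len)
{
	return toy_ihash_continue(TOY_FNV32_BASE, buf, len);
}

/* Illustrative entry; the field names only loosely follow the patch. */
struct toy_entry {
	const char *name;
	size_t dir_len;          /* length of the directory part, no slash */
	unsigned int dir_hash;   /* hash of name[0..dir_len) */
	unsigned int name_hash;  /* hash of the full name */
	unsigned initialized : 1;
	unsigned root_entry : 1;
};

/*
 * Compute the hashes for ce, reusing the directory hash of the previous
 * entry when both live in the same directory.  Returns 1 if the
 * directory hash was reused, 0 if it had to be computed.
 */
int toy_precompute(struct toy_entry *ce, const struct toy_entry *prev)
{
	size_t len = strlen(ce->name);
	size_t dir_len = len;
	int reused = 0;

	/* Back up to just past the last '/', or to 0 if there is none. */
	while (dir_len > 0 && ce->name[dir_len - 1] != '/')
		dir_len--;

	if (dir_len == 0) {
		/* Top-level path: there is no directory part to hash. */
		ce->root_entry = 1;
		ce->dir_len = 0;
		ce->dir_hash = 0;
		ce->name_hash = toy_ihash(ce->name, len);
	} else {
		dir_len--; /* leave the slash out of the directory part */
		assert(dir_len < len);
		ce->root_entry = 0;
		ce->dir_len = dir_len;

		if (prev && prev->initialized && !prev->root_entry &&
		    prev->dir_len == dir_len &&
		    !memcmp(ce->name, prev->name, dir_len)) {
			ce->dir_hash = prev->dir_hash;
			reused = 1;
		} else {
			ce->dir_hash = toy_ihash(ce->name, dir_len);
		}
		/* Continue from the directory hash over "/basename". */
		ce->name_hash = toy_ihash_continue(ce->dir_hash,
						   ce->name + dir_len,
						   len - dir_len);
	}
	ce->initialized = 1;
	return reused;
}
```

Feeding it entries in index order, with the previous entry passed as the second argument, reuses the directory hash for "dir/a.c" followed by "dir/b.c", but recomputes it for "dir/sub/x.c" followed by "dir/z.c", because the directory lengths differ.  Whether skipping the rehash is a measurable win is exactly the open question above; this sketch makes no claim about that.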