From: Junio C Hamano [mailto:jch2355@xxxxxxxxx] On Behalf Of Junio C Hamano
> Jeff King <peff@xxxxxxxx> writes:
>> On Wed, Feb 15, 2017 at 09:27:53AM -0500, Jeff Hostetler wrote:
>>
>>> I have some informal numbers in a spreadsheet. I was seeing
>>> an 8-9% speed up on a status on my gigantic repo.
>>>
>>> I'll try to put together a before/after perf-test to better
>>> demonstrate this.
>>
>> Thanks. What I'm mostly curious about is how much each individual step
>> buys. Sometimes when doing a long optimization series, I find that some
>> of the optimizations make other ones somewhat redundant (e.g., if patch
>> 2 causes us to call the optimized code from patch 3 less often).
>
> I am curious too.
>
> To me 1/5 (reduction of redundant calls), 4/5 (correctly size the
> hash that would grow to a known size anyway) and 5/5 (take advantage
> of the fact that adjacent cache entries are often in the same
> directory) look like no brainers to take, regardless of the others
> (including themselves).

Agreed.

> It is not clear to me if 3/5 (preload-index uses available cores to
> compute hashes) is an unconditional win (an operation that is
> pathspec limited may need hashes for only a small fraction of the
> index---would it still be a win to compute the hash for all entries
> upon loading the index, even if we are using otherwise-idle cores?).

I'm not sure about pathspec cases. What I was seeing was that the
call to lazy_name_init_hash() was taking 30% of the time in
"git status" and 40% in "git add <one_file>". (Again, this was on my
giant repo with a 450MB index.)

> Of course 2/5 is a prerequisite step for 3/5 and 5/5, so if we want
> either of the latter two, we cannot avoid it.

jeff