On 2/14/2017 5:03 PM, Jeff King wrote:
On Tue, Feb 14, 2017 at 12:31:46PM +0100, Johannes Schindelin wrote:
On Windows, calls to memihash() and maintaining the istate.name_hash and
istate.dir_hash HashMaps take significant time on very large
repositories. This series of changes reduces the overall time taken for
various operations by reducing the number of calls to memihash(), moving
some of them into multi-threaded code, and so on.
Note: one commenter in https://github.com/git-for-windows/git/pull/964
pointed out that memihash() only handles ASCII correctly. That is true.
However, fixing this is outside the scope of this patch series.
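
To illustrate the ASCII-only limitation mentioned above, here is a minimal
sketch of a case-insensitive FNV-1 style hash. It is an assumption-laden
illustration, not the code from the series: the name sketch_memihash() and
the SKETCH_* macros are hypothetical, and only the bytes 'a'..'z' are
folded to upper case, so multi-byte UTF-8 letters that differ only in case
still hash differently.

```c
#include <stddef.h>

/* Standard 32-bit FNV offset basis and prime. */
#define SKETCH_FNV32_BASE  0x811c9dc5u
#define SKETCH_FNV32_PRIME 0x01000193u

/*
 * Hypothetical helper for illustration only.
 * Hashes len bytes of buf, folding ASCII 'a'..'z' to 'A'..'Z' first.
 * Bytes outside that range (including every byte of a multi-byte
 * UTF-8 sequence) are hashed unchanged.
 */
static unsigned int sketch_memihash(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	unsigned int hash = SKETCH_FNV32_BASE;

	while (len--) {
		unsigned int c = *p++;
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';	/* ASCII-only case folding */
		hash = (hash * SKETCH_FNV32_PRIME) ^ c;
	}
	return hash;
}
```

With this sketch, "README" and "readme" hash to the same value, while the
UTF-8 encodings of "Ä" (0xC3 0x84) and "ä" (0xC3 0xA4) do not, because
their second bytes are never folded.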
Out of curiosity, do you have numbers? Bonus points if the speedup can
be shown via a t/perf script.
We have a read-cache perf-test already, but I suspect you'd want
something more like "git status" or "ls-files -o" that calls into
read_directory().
I have some informal numbers in a spreadsheet. I was seeing
an 8-9% speedup in "git status" on my gigantic repo.
I'll try to put together a before/after perf-test to better demonstrate
this.
Jeff