On 10/30/2017 8:33 PM, Alex Vandiver wrote:
On Mon, 30 Oct 2017, Jeff King wrote:
On Mon, Oct 30, 2017 at 08:48:48AM -0400, Ben Peart wrote:
Any updates or thoughts on this one? While the patch has become quite
trivial, it does results in a savings of 5%-15% in index load time.
I like the general direction of avoiding the check during each read.
Same -- the savings here are well worth it, IMHO.
I thought the compromise of having this test only run when DEBUG is defined
should limit it to developer builds (hopefully everyone developing on git is
running DEBUG builds :)). Since the test is trying to detect buggy code
when writing the index, I thought that was the right time to test/catch any
issues.
I certainly don't build with DEBUG. It traditionally hasn't done
anything useful. But I'm also not convinced that this is a likely way to
find bugs in the first place, so I'm OK missing out on it.
I also don't compile with DEBUG -- there's no documentation that
mentions it, and I don't think I'd considered going poking for what
was `#ifdef`d. I think it'd be reasonable to provide some
configure-time setting that adds `CFLAGS="-ggdb3 -O0 -DDEBUG"` or
similar, but that seems possibly moot for this particular change (see
below).
But what we probably _do_ need is to make sure that "git fsck" would
detect such an out-of-order index. So that developers and users alike
can diagnose suspected problems.
Agree -- that seems like a better home for this logic.
That is how version 1 of this patch worked but the feedback to that
patch was to remove it "not only during the normal operation but also in
fsck."
I am working on other, more substantial savings for index load times
(stay tuned) but this seemed like a small simple way to help speed
things up.
I'm interested to hear more about what direction you're looking in here.
- Alex
I'm working on parallelizing the index load process across multiple
threads/cpu cores. Specifically the loop that calls create_from_disk()
and set_index_entry(). The serial nature of the index formats makes
that difficult but I believe I've come up with a way to make it work
across all existing index formats.