Re: [PATCH v2] read_index_from(): Skip verification of the cache entry order to speed index loading

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 10/30/2017 8:33 PM, Alex Vandiver wrote:
On Mon, 30 Oct 2017, Jeff King wrote:
On Mon, Oct 30, 2017 at 08:48:48AM -0400, Ben Peart wrote:

Any updates or thoughts on this one?  While the patch has become quite
trivial, it does results in a savings of 5%-15% in index load time.

I like the general direction of avoiding the check during each read.

Same -- the savings here are well worth it, IMHO.

I thought the compromise of having this test only run when DEBUG is defined
should limit it to developer builds (hopefully everyone developing on git is
running DEBUG builds :)).  Since the test is trying to detect buggy code
when writing the index, I thought that was the right time to test/catch any
issues.

I certainly don't build with DEBUG. It traditionally hasn't done
anything useful. But I'm also not convinced that this is a likely way to
find bugs in the first place, so I'm OK missing out on it.

I also don't compile with DEBUG -- there's no documentation that
mentions it, and I don't think I'd considered going poking for what
was `#ifdef`d.  I think it'd be reasonable to provide some
configure-time setting that adds `CFLAGS="-ggdb3 -O0 -DDEBUG"` or
similar, but that seems possibly moot for this particular change (see
below).

But what we probably _do_ need is to make sure that "git fsck" would
detect such an out-of-order index. So that developers and users alike
can diagnose suspected problems.

Agree -- that seems like a better home for this logic.

That is how version 1 of this patch worked but the feedback to that patch was to remove it "not only during the normal operation but also in fsck."


I am working on other, more substantial savings for index load times
(stay tuned) but this seemed like a small simple way to help speed
things up.

I'm interested to hear more about what direction you're looking in here.
  - Alex


I'm working on parallelizing the index load process across multiple threads/cpu cores. Specifically the loop that calls create_from_disk() and set_index_entry(). The serial nature of the index formats makes that difficult but I believe I've come up with a way to make it work across all existing index formats.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux