Re: [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> On Thu, May 30, 2013 at 10:00:23PM +0200, Thomas Rast wrote:
>
>> lookup_commit_reference_gently unconditionally parses the object given
>> to it.  This slows down git-describe a lot if you have a repository
>> with large tagged blobs in it: parse_object() will read the entire
>> blob and verify that its sha1 matches, only to then throw it away.
>> 
>> Speed it up by checking the type with sha1_object_info() prior to
>> unpacking.
>
> This would speed up the case where we do not end up looking at the
> object at all, but it will slow down the (presumably common) case where
> we will in fact find a commit and end up parsing the object anyway.
>
> Have you measured the impact of this on normal operations? During a
> traversal, we spend a measurable amount of time looking up commits in
> packfiles, and this would presumably double it.

I don't think so, but admittedly I didn't measure it.

The reason why it's unlikely is that this is specific to
lookup_commit_reference_gently, which according to some grepping is
usually done on refs or values that refs might have; e.g. on the old&new
sides of a fetch in remote.c, or in many places in the callback of some
variant of for_each_ref.

Of course if you have a ridiculously large number of refs (and I gather
_you_ do), this will hurt somewhat in the usual case, but speed up the
case where there is a ref (usually a lightweight tag) directly pointing
at a large blob.

I'm not sure this can be fixed without the change you outline here:

> This is not the first time I have seen this tradeoff in git.  It would
> be nice if our object access was structured to do incremental
> examination of the objects (i.e., store the packfile index lookup or
> partial unpack of a loose object header, and then use that to complete
> the next step of actually getting the contents).

But in any case I see the point, I should try and gather some
performance numbers.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]