Re: Poor performance of git describe in big repos

Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:

> On 30 May 2013 20:30, John Keeping <john@xxxxxxxxxxxxx> wrote:
>> On Thu, May 30, 2013 at 06:21:55PM +0200, Thomas Rast wrote:
>>> Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:
>>>
>>> > On 30 May 2013 16:33, Thomas Rast <trast@xxxxxxxxxxx> wrote:
>>> >> Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:
>> <snip>
>>> > Will it be loading the blob for every commit it traverses or just ones
>>> > that hit a tag? Why does it need to load the blob at all? Surely the
>>> > commit tree state doesn't need to be walked down?
>>>
>>> No, my theory is that you tagged *the blobs*.  Git supports this.
>
> Wait is this the difference between annotated and non-annotated tags?
> I thought a non-annotated just acted like references to a particular
> tree state?

A tag is just a ref.  It can point at any object, including a blob
(= some file *contents*).

An annotated tag is just a tag pointing at a "tag object".  A tag object
contains tagger name/email/date, a reference to an object, and a tag
message.
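
To make the distinction concrete, here is a minimal sketch that builds a
throwaway repository and creates one tag of each kind.  The repository
location, the file name and the tag names (blob-tag, light-tag,
annotated-tag) are made up for illustration; only the git commands
themselves are real.

```shell
#!/bin/sh
# Sketch: three kinds of tag in a scratch repository (all names hypothetical).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git config commit.gpgsign false   # keep the sketch independent of global signing setup
git config tag.gpgsign false

echo hello > file.txt
git add file.txt
git commit -q -m "initial"

blob=$(git rev-parse HEAD:file.txt)
git tag blob-tag "$blob"                     # ref -> blob (unannotated)
git tag light-tag HEAD                       # ref -> commit (unannotated)
git tag -a -m "release" annotated-tag HEAD   # ref -> tag object -> commit

# Show what each ref points at directly, and what it peels to via ^{}.
for t in blob-tag light-tag annotated-tag; do
    printf '%s: points at a %s, peels to a %s\n' "$t" \
        "$(git cat-file -t "refs/tags/$t")" \
        "$(git cat-file -t "refs/tags/$t^{}")"
done
```

Only the annotated tag should report "tag" as the type it points at; the
other two refs point straight at the blob and the commit respectively.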

The slowness I found relates to having tags that point at blobs directly
(unannotated).

>> You can see if that is the case by doing something like this:
>>
>>     eval $(git for-each-ref --shell --format '
>>         test $(git cat-file -t %(objectname)^{}) = commit ||
>>         echo %(refname);')
>>
>> That will print out the name of any ref that doesn't point at a
>> commit.
>
> Hmm that didn't seem to work. But looking at the output by hand I
> certainly have a mix of tags that are commits vs tags:
>
>
> 09:08 ajb@sloy/x86_64 [work.git] >git for-each-ref | grep "refs/tags" | grep "commit" | wc -l
> 1345
> 09:12 ajb@sloy/x86_64 [work.git] >git for-each-ref | grep "refs/tags" | grep -v "commit" | wc -l
> 66
>
> Unfortunately I can't just delete those tags as they do refer to known
> releases which we obviously care about. If I delete the tags on my
> local repo and test for a speed increase can I re-create them as
> annotated tag objects?

I would be more interested in this:

  git for-each-ref | grep ' blob'

and

  (git for-each-ref | grep ' blob' | cut -d\  -f1 | xargs -n1 git cat-file blob) | wc -c

The first tells you if you have any refs pointing at blobs.  The second
computes their total unpacked size.  My theory is that the second yields
some large number (hundreds of megabytes at least).

It would be nice if you checked, because if there turn out to be big
blobs, we have all the pieces and just need to assemble the best
solution.  Otherwise, there's something else going on and the problem
remains open.
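
If a per-ref breakdown is more useful than a single total, the same check
can be sketched with the %(objecttype) and %(objectsize) fields of
for-each-ref, which avoids unpacking each blob through cat-file.  The
function name blob_refs is made up for this example, and the sketch only
considers refs that point at a blob directly, not tag objects that wrap
one.

```shell
# Sketch: list every ref that points directly at a blob, with the blob's
# size in bytes, followed by the total.  Run inside the repository.
blob_refs() {
    git for-each-ref --format='%(objecttype) %(objectsize) %(refname)' |
    awk '$1 == "blob" { print $2, $3; total += $2 }
         END { print "total bytes:", total + 0 }'
}
```

Calling blob_refs prints one "size refname" line per offending ref, so
piping it through sort -n would put the largest blobs last.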

-- 
Thomas Rast
trast@{inf,student}.ethz.ch