Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:

> On 30 May 2013 20:30, John Keeping <john@xxxxxxxxxxxxx> wrote:
>> On Thu, May 30, 2013 at 06:21:55PM +0200, Thomas Rast wrote:
>>> Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:
>>>
>>> > On 30 May 2013 16:33, Thomas Rast <trast@xxxxxxxxxxx> wrote:
>>> >> Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:
>> <snip>
>>> > Will it be loading the blob for every commit it traverses or just ones that hit
>>> > a tag? Why does it need to load the blob at all? Surely the commit
>>> > tree state doesn't
>>> > need to be walked down?
>>>
>>> No, my theory is that you tagged *the blobs*. Git supports this.
>
> Wait is this the difference between annotated and non-annotated tags?
> I thought a non-annotated just acted like references to a particular
> tree state?

A tag is just a ref. It can point at anything, in particular also a
blob (= some file *contents*).

An annotated tag is just a tag pointing at a "tag object". A tag object
contains tagger name/email/date, a reference to an object, and a tag
message.

The slowness I found relates to having tags that point at blobs directly
(unannotated).

>> You can see if that is the case by doing something like this:
>>
>>     eval $(git for-each-ref --shell --format '
>>         test $(git cat-file -t %(objectname)^{}) = commit ||
>>         echo %(refname);')
>>
>> That will print out the name of any ref that doesn't point at a
>> commit.
>
> Hmm that didn't seem to work. But looking at the output by hand I
> certainly have a mix of tags that are commits vs tags:
>
> 09:08 ajb@sloy/x86_64 [work.git] >git for-each-ref | grep "refs/tags" | grep "commit" | wc -l
> 1345
> 09:12 ajb@sloy/x86_64 [work.git] >git for-each-ref | grep "refs/tags" | grep -v "commit" | wc -l
> 66
>
> Unfortunately I can't just delete those tags as they do refer to known
> releases which we obviously care about. If I delete the tags on my
> local repo and test for a speed increase can I re-create them as
> annotated tag objects?

I would be more interested in this:

    git for-each-ref | grep ' blob'

and

    (git for-each-ref | grep ' blob' | cut -d' ' -f1 | xargs -n1 git cat-file blob) | wc -c

The first tells you if you have any refs pointing at blobs. The second
computes their total unpacked size. My theory is that the second yields
some large number (hundreds of megabytes at least).

It would be nice if you checked, because if there turn out to be big
blobs, we have all the pieces and just need to assemble the best
solution. Otherwise, there's something else going on and the problem
remains open.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
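
The two checks above can also be done in one pass. What follows is a minimal sketch, not something from the original message: the helper name summarize_blob_refs is hypothetical, and it assumes its input has been produced with the for-each-ref format string shown in the usage note, so that each line reads "<type> <size> <refname>". It lists every ref that points directly at a blob and then prints the total unpacked size in bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git).
# Reads "<type> <size> <refname>" lines on stdin, prints "<refname><TAB><size>"
# for each ref that points directly at a blob, then a final "total<TAB><bytes>"
# line with the summed unpacked size of those blobs.
summarize_blob_refs() {
    awk '
        $1 == "blob" { print $3 "\t" $2; total += $2 }
        END { printf "total\t%.0f\n", total }
    '
}
```

Inside a repository it could be fed like this:

    git for-each-ref --format='%(objecttype) %(objectsize) %(refname)' | summarize_blob_refs

A large total here would be consistent with the theory that big blobs are being loaded; a total of 0 would mean no ref points directly at a blob.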