Re: Poor performance of git describe in big repos

On 30 May 2013 16:33, Thomas Rast <trast@xxxxxxxxxxx> wrote:
> Alex Bennée <kernel-hacker@xxxxxxxxxx> writes:
>
>>  41.58%   git  libcrypto.so.1.0.0  [.] sha1_block_data_order_ssse3
>>  33.62%   git  libz.so.1.2.3.4     [.] inflate_fast
>>  10.39%   git  libz.so.1.2.3.4     [.] adler32
>>   2.03%   git  [kernel.kallsyms]   [k] clear_page_c
>
> Do you have any large blobs in the repo that are referenced directly by
> a tag?

Most probably. I've certainly done a bunch of releases (which are tagged) where
the last thing that was updated was an FPGA image.
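
One way to check for such tags is sketched below. It is a rough sketch, not
something from this thread: it asks git for-each-ref for the type and size
behind every tag and keeps the ones that resolve to a blob. The helper names
(parse_blob_tags, blob_tags) are made up for illustration, and it assumes a
git whose for-each-ref supports the %(objectsize) and %(*objecttype) fields.

```python
import subprocess

# Tab-separated fields: type and size of the ref's object, plus the
# dereferenced ("peeled") type and size for annotated tags.
FORMAT = "\t".join([
    "%(objecttype)", "%(*objecttype)",
    "%(objectsize)", "%(*objectsize)",
    "%(refname:short)",
])


def parse_blob_tags(lines):
    """Return (tag_name, size_in_bytes) for tags that resolve to a blob,
    largest first. `lines` is for-each-ref output using FORMAT."""
    found = []
    for line in lines:
        if not line.strip():
            continue
        objtype, peeled_type, size, peeled_size, name = line.split("\t", 4)
        if objtype == "blob":
            # Lightweight tag pointing straight at a blob.
            found.append((name, int(size)))
        elif objtype == "tag" and peeled_type == "blob":
            # Annotated tag whose target is a blob.
            found.append((name, int(peeled_size)))
    return sorted(found, key=lambda item: item[1], reverse=True)


def blob_tags(repo="."):
    """Run git in `repo` and report tags that point at blobs."""
    out = subprocess.run(
        ["git", "-C", repo, "for-each-ref", "--format=" + FORMAT, "refs/tags"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_blob_tags(out.splitlines())
```

Calling blob_tags() in the affected repository should list any release tags
that point at an FPGA image rather than at a commit.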

> Because this just so happens to exactly reproduce your symptoms:
>
>   # in a random git.git
>   $ time git describe --debug
>   [...]
>   real    0m0.390s
>   user    0m0.037s
>   sys     0m0.011s
>   $ git tag big1 $(dd if=/dev/urandom bs=1M count=512 | git hash-object -w --stdin)
>   512+0 records in
>   512+0 records out
>   536870912 bytes (537 MB) copied, 45.5088 s, 11.8 MB/s
>   $ time git describe --debug
>   [...]
>   real    0m1.875s
>   user    0m1.738s
>   sys     0m0.129s
>   $ git tag big2 $(dd if=/dev/urandom bs=1M count=512 | git hash-object -w --stdin)
>   512+0 records in
>   512+0 records out
>   536870912 bytes (537 MB) copied, 44.972 s, 11.9 MB/s
>   $ time git describe --debug
>   searching to describe HEAD
>   [...]
>   real    0m3.620s
>   user    0m3.357s
>   sys     0m0.248s
>
> (I actually ran the git-describe invocations more than once to ensure
> that they are again cache-hot.)

That looks pretty promising as a replication.

> git-describe should probably be fixed to avoid loading blobs, though I'm
> not sure off hand if we have any infrastructure to infer the type of a
> loose object without inflating it.  (This could probably be added by
> inflating only the first block.)  We do have this for packed objects, so
> at least for packed repos there's a speedup to be had.
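
To make the "inflate only the first block" idea concrete, here is a minimal
sketch. It is not git's code. It relies on the loose object layout, a zlib
stream whose inflated form starts with "<type> <size>" followed by a NUL byte,
and reads just enough of the file to recover that header. The function name
and the chunk and max_header defaults are assumptions chosen for illustration.

```python
import zlib


def peek_loose_header(path, chunk=4096, max_header=64):
    """Return (type, size) of a loose object while inflating only a
    small prefix of the file instead of the whole object."""
    with open(path, "rb") as fh:
        compressed = fh.read(chunk)  # read a small prefix only

    # A decompressobj accepts a truncated stream and stops producing
    # output once max_header bytes have been inflated.
    inflater = zlib.decompressobj()
    head = inflater.decompress(compressed, max_header)

    nul = head.find(b"\x00")
    if nul < 0:
        raise ValueError("no object header found in the inflated prefix")

    obj_type, size = head[:nul].split(b" ", 1)
    return obj_type.decode("ascii"), int(size)
```

With something like this, a 512 MB blob costs a few kilobytes of reading and
inflating to identify, rather than a full inflate plus SHA-1 check.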

Will it be loading the blob for every commit it traverses, or just for the ones
that hit a tag? Why does it need to load the blob at all? Surely the commit's
tree state doesn't need to be walked down?

>
> --
> Thomas Rast
> trast@{inf,student}.ethz.ch



-- 
Alex, homepage: http://www.bennee.com/~alex/