Re: Why is "git fetch --prune" so much slower than "git remote prune"?

On Fri, Mar 06, 2015 at 05:48:39PM +0100, Ævar Arnfjörð Bjarmason wrote:

> The --prune option to fetch added in v1.6.5-8-gf360d84 seems to be
> around 20-30x slower than the equivalent operation with git remote
> prune. I'm wondering if I'm missing something and fetch does something
> more, but it doesn't seem so.

"git fetch --prune" is "do a normal fetch, and also prune anything
necessary". "git remote prune" is "ls-remote the other side and see if
there is anything we can prune; do not touch anything else".
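
Concretely (with "origin" standing in for whatever remote you are
pruning), the two invocations being compared are:

  $ git remote prune origin     # ls-remote, then delete stale tracking refs
  $ git fetch --prune origin    # full fetch negotiation, then the same pruning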

If your fetch is a noop (i.e., the other side has not advanced any
branches), the outcome is the same. But perhaps fetch is doing more
work to find out that it is a noop.

One way to measure that would be to see how expensive a noop "git fetch"
is (if it is expensive, there is room to improve there; if not, it is the
pruning itself that is expensive).
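
A rough sketch of that comparison (again assuming a remote named "origin"
that has not advanced since your last fetch, so everything is a noop):

  $ time git fetch origin           # cost of a noop fetch alone
  $ time git fetch --prune origin   # noop fetch plus pruning
  $ time git remote prune origin    # ls-remote plus pruning

If the first command is already slow, the fetch machinery itself is the
problem; if only the second is slow, the extra time is going to the
pruning.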

But just guessing (I do not have time to dig deeper right now), and
seeing this:

> $ gprof ~/g/git/git-fetch|head -n 20
> Flat profile:
> 
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  26.42      0.33     0.33  1584583     0.00     0.00  strbuf_getwholeline
>  14.63      0.51     0.18 90601347     0.00     0.00  strbuf_grow
>  13.82      0.68     0.17  1045676     0.00     0.00  find_pack_entry_one
>   8.13      0.78     0.10  1050062     0.00     0.00  check_refname_format
>   6.50      0.86     0.08  1584675     0.00     0.00  get_sha1_hex
>   5.69      0.93     0.07  2100529     0.00     0.00  starts_with
>   3.25      0.97     0.04  1044043     0.00     0.00  refname_is_safe
>   3.25      1.01     0.04     8007     0.00     0.00  get_packed_ref_cache
>   2.44      1.04     0.03  2605595     0.00     0.00  search_ref_dir
>   2.44      1.07     0.03  1040500     0.00     0.00  peel_entry
>   1.63      1.09     0.02  2632661     0.00     0.00  get_ref_dir
>   1.63      1.11     0.02  1044043     0.00     0.00  create_ref_entry
>   1.63      1.13     0.02     8024     0.00     0.00  do_for_each_entry_in_dir
>   0.81      1.14     0.01  2155105     0.00     0.00  memory_limit_check
>   0.81      1.15     0.01  1580503     0.00     0.00  sha1_to_hex

We spend a lot of time checking refs here. Probably this comes from
writing the `packed-refs` file out 1000 times in your example, because
fetch deletes each ref individually, whereas since c9e768b (remote:
repack packed-refs once when deleting multiple refs, 2014-05-23),
git-remote batches the deletions and repacks packed-refs in one pass.
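
You could check that guess without reading the code; each packed-refs
rewrite ends with the lockfile being renamed into place, so counting
those renames during a prune of many refs is telling. A sketch, assuming
Linux with strace available and "origin" as the remote:

  $ strace -f -e trace=rename -o /tmp/prune.trace git fetch --prune origin
  $ grep -c packed-refs /tmp/prune.trace

One rename per pruned ref points at the per-ref rewrite; a single rename
would mean the deletions were already batched.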

Now that we have ref_transaction_*, I think if git-fetch fed all of the
deletes (along with the updates) into a single transaction, we would get
the same optimization for free. Maybe that is even part of some of the
pending ref_transaction work from Stefan or Michael (both cc'd). I
haven't kept up very well with what is cooking in pu.
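
In the meantime, the batching we want is already exposed at the command
level: "git update-ref --stdin" feeds all of its instructions through a
single transaction, so many deletions should rewrite packed-refs just
once. A sketch of the idea (note this blindly deletes _all_ of origin's
tracking refs, so it illustrates the batching; it is not a substitute for
prune):

  $ git for-each-ref --format='delete %(refname)' refs/remotes/origin |
    git update-ref --stdin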

-Peff