Re: remove_duplicates() in builtin/fetch-pack.c is O(N^2)

On Mon, May 21, 2012 at 06:14:17PM -0400, Jeff King wrote:

> The rails and git cases run in ~28s and ~37s, respectively, again mostly
> going to the actual object transfer. So I think this series removes all
> of the asymptotically bad behavior from this code path.
> 
> One thing to note about all of these repos is that they tend to have
> several refs pointing to a single commit. None of the speedups in this
> series depends on that fact, but it may be that on a repo with more
> independent refs, we may uncover other code paths (e.g., I know that my
> fix for mark_complete in ea5f220 improves the case with duplicate refs,
> but would not help if you really have 400K refs pointing to unique
> commits[1]).

Hmm. So I started to do some experimentation with this and noticed
something odd.

Try doing "git fetch . refs/*:refs/*" in a repository with a large
number of refs (e.g., 400K). With git v1.7.10, this takes about 9.5s on
my machine. But using the version of git in "next" takes about 16.5s.
Bisection points to your 432ad41 (refs: store references hierarchically,
2012-04-10). Perf shows sort_ref_dir and msort_with_tmp as hot spots.
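
For anyone who wants to try this, here is a rough sketch of one way to build such a test repository. It is not the setup used for the numbers above. It assumes a git new enough to support "git update-ref --stdin", and the ref prefix, the zero-padded naming, and both helper names are made up for illustration. Every ref points at the same commit, so this only covers the "many refs, one commit" shape.

```python
import subprocess
import sys


def ref_creation_commands(commit, count, prefix="refs/heads/test"):
    """Yield one 'create' line per ref, in `git update-ref --stdin` format.

    The prefix and the zero-padded names are arbitrary (hypothetical) choices.
    """
    for i in range(count):
        yield f"create {prefix}/{i:06d} {commit}\n"


def populate_refs(commit, count):
    """Create `count` refs pointing at `commit` using a single git process."""
    data = "".join(ref_creation_commands(commit, count))
    subprocess.run(
        ["git", "update-ref", "--stdin"],
        input=data,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Run inside a scratch repository that has at least one commit.
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 400_000
    head = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
    populate_refs(head, count)
```

After that, timing "git fetch . refs/*:refs/*" in the same repository under each git version should give a comparable measurement, though the absolute numbers will depend on the machine and on whether the refs are packed.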

-Peff