Re: Git is not scalable with too many refs/*

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 09, 2011 at 08:56:50AM -0700, Shawn O. Pearce wrote:

> A lot of operations toss every commit that a reference points at into
> the revision walker's LRU queue. If you have a tag pointing to every
> commit, then the entire project history enters the LRU queue at once,
> up front. That queue is managed with O(N^2) insertion time. And the
> entire queue has to be filled before anything can be output.

We ran into this recently at github. Since our many-refs repos were
mostly forks, we had a lot of duplicate commits, and were able to solve
it with ea5f220 (fetch: avoid repeated commits in mark_complete,
2011-05-19).

However, I also worked up a faster priority queue implementation that
would work in the general case:

  http://thread.gmane.org/gmane.comp.version-control.git/174003/focus=174005

I suspect it would speed up the original poster's slow fetch. The
problem is that a fast priority queue doesn't have quite the same access
patterns as a linked list, so replacing all of the commit_lists in git
with the priority queue would be quite a painful undertaking. So we are
left with using the fast queue only in specific hot-spots.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]