Re: Git is not scalable with too many refs/*

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 10, 2011 at 12:59:47PM +0900, NAKAMURA Takumi wrote:

> 2011/6/10 Stephen Bash <bash@xxxxxxxxxxx>:
> > I've seen two different workflows develop:
> > Â1) Hacking on some code in Git the programmer finds something wrong. ÂUsing Git tools he can pickaxe/bisect/etc. and find that the problem traces back to a commit imported from Subversion.
> > Â2) The programmer finds something wrong, asks coworker, coworker says "see bug XYZ", bug XYZ says "Fixed in r20356".
> >
> > I agree notes is the right answer for (1), but for (2) you really want a cross reference table from Subversion rev number to Git commit.
> 
> It is the point I wanted to say, thank you! I am working with svn-men.
> They often speak svn revision number. (And I have to tell them svn
> revs then)

Yeah, there is no simple way to do the bi-directional mapping in git.
If all you want are decorations on commits, notes are definitely the way
to go. They are optimized for lookup in of commit -> data. But if you
want data -> commit, the only mapping we have is refs, and they are not
well optimized for the many-refs use case.

Packed-refs are better than loose refs, but I think right now we just
load them all in to an in-memory linked list. We could load them into a
more efficient in-memory data structure, or we could perhaps even mmap
the packed-refs file and binary search it in place.

But lookup is only part of the problem. There are algorithms that want
to look at all the refs (notably fetching and pushing), which are going
to be a bit slower. We don't have a way to tell those algorithms that
those refs are uninteresting for reachability analysis, because they are
just pointing to parts of the graph that are already reachable by
regular refs. Maybe there could be a part of the refs namespace that is
ignored by "--all". I dunno. That seems like a weird inconsistency.

> FYI, I have tweaked git-rev-list for commits not to sort by date with
> --quiet. It improves git-fetch (git-rev-list --not --all) performance
> when objects is well-packed.

I'm not sure that is a good solution. Even with --quiet, we will be
walking the commit graph to find merge bases to see if things are
connected. The walking code expects date-sorting; I'm not sure what
changing that assumption will do to the code.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]