Re: Git is not scalable with too many refs/*

On Tue, Jun 14, 2011 at 12:02:46PM +0200, Johan Herland wrote:

> > Wouldn't it be enough to simply create a note on 'r651235' with as
> > contents the git ref?
> 
> Not quite sure what you mean by "create a note on 'r651235'". You could 
> devise a scheme where you SHA1('r651235'), and then create a note on the 
> resulting hash.
> 
> Notes are named by the SHA1 of the object they annotate, but there is no 
> hard requirement (as long as you stay away from "git notes prune") that the 
> SHA1 annotated actually exists as a valid Git object in your repo.
> 
> Hence, you can use notes to annotate _anything_ that can be uniquely reduced 
> to a SHA1 hash.

I lean against that as a solution. I think "git gc" will probably
eventually learn to do "git notes prune", at which point we would start
losing people's data. So I think it is better to keep the definition of
notes a little tighter now, and say "the left-hand side of a notes
mapping must be a referenced object". We can always loosen it later.

On top of that, though, the sha1 solution is not all that pleasant. It
lets you do exact lookups, but you have no way of iterating over the
list of svn revisions.
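
To illustrate the limitation, here is a minimal Python sketch of the hashing scheme Johan describes. The function name `note_key` and the choice to hash the raw bytes of the revision string are assumptions made for illustration; the thread does not specify how the string would be hashed.

```python
import hashlib


def note_key(svn_rev):
    """Reduce an arbitrary string such as 'r651235' to a 40-hex SHA-1.

    Assumption: the raw UTF-8 bytes are hashed, with no header or newline.
    """
    return hashlib.sha1(svn_rev.encode("utf-8")).hexdigest()


# Hypothetical store standing in for a notes tree keyed by the hash.
notes = {note_key("r651235"): "refs/heads/example"}

# Exact lookup works, because the caller can recompute the key.
found = notes.get(note_key("r651235"))

# Iteration only yields opaque hashes; the revision names cannot be
# recovered from them, so "list all svn revisions" is not possible.
keys = list(notes)
```

The exact lookup succeeds because the key can be recomputed on demand, but nothing in `keys` reveals which revision produced it.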

I also think we can do something a little more lightweight. The user has
already created and is maintaining a mapping in one direction via the
notes. We just need the inverse mapping, which we can generate
programmatically. So it can be a straight cache, with the sha1 of the
notes tree determining the cache validity (i.e., if the forward mapping
in the notes tree changes, you regenerate the cache from scratch).
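
A rough Python sketch of that invalidation rule follows. The names `invert`, `InverseCache` and `load_forward` are hypothetical, and the forward mapping is modeled as a plain dict from commit sha1 to svn revision rather than read from a real notes tree. In a real repository the tree id could plausibly be obtained with something like `git rev-parse 'refs/notes/commits^{tree}'`, but that wiring is left out here.

```python
def invert(forward):
    """Turn {commit_sha1: svn_rev} into {svn_rev: commit_sha1}."""
    return {rev: commit for commit, rev in forward.items()}


class InverseCache:
    """Inverse mapping that is only valid for one notes tree id."""

    def __init__(self):
        self.tree_id = None   # sha1 of the notes tree the cache was built from
        self.inverse = {}
        self.rebuilds = 0     # counter kept only to make the behavior visible

    def lookup(self, tree_id, load_forward, rev):
        # If the notes tree changed, throw the cache away and regenerate
        # it from scratch; otherwise reuse it without reading the notes.
        if tree_id != self.tree_id:
            self.inverse = invert(load_forward())
            self.tree_id = tree_id
            self.rebuilds += 1
        return self.inverse.get(rev)
```

Comparing a single tree id is enough to decide validity, since any change to the forward mapping changes the sha1 of the notes tree.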

We would want to store the cache in an on-disk format that could be
searched easily. Possibly something like the packed-refs format would be
sufficient, if we mmap'd and binary searched it. It would be dirt simple
if we used an existing key/value store like gdbm or tokyocabinet, but we
usually try to avoid extra dependencies.
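
As a sketch of the dependency-free option, the Python below writes a sorted "key value" text file and binary searches it through mmap. The line layout is only loosely modeled on packed-refs and is an assumption, as are the function names `write_cache` and `lookup`; keys are assumed to contain no spaces or newlines.

```python
import mmap
import os
import tempfile


def write_cache(path, mapping):
    """Write one 'key value' line per entry, sorted by the key's bytes."""
    with open(path, "wb") as f:
        for key in sorted(mapping, key=lambda k: k.encode()):
            f.write(key.encode() + b" " + mapping[key].encode() + b"\n")


def lookup(path, key):
    """Binary search the mmap'd file for key; return its value or None."""
    want = key.encode()
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return None  # an empty file cannot be mmap'd
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            lo, hi = 0, len(m)  # both always sit on a line boundary
            while lo < hi:
                mid = (lo + hi) // 2
                # Back up from mid to the start of the line containing it.
                nl = m.rfind(b"\n", lo, mid)
                start = lo if nl < 0 else nl + 1
                end = m.find(b"\n", start)
                k, _, v = m[start:end].partition(b" ")
                if k == want:
                    return v.decode()
                if k < want:
                    lo = end + 1
                else:
                    hi = start
    return None
```

Because the file is plain sorted text, it can also be read front to back to iterate over every revision, which the hashed-key approach cannot do.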

-Peff