On Sun, 10 Sep 2006, Linus Torvalds wrote:
>
> If we did the same pack-file approach that we do for objects, the problem
> ends up being that _updating_ things is really hard. What we could do (and
> might work) is that a "git repack" would create a "packed representation
> of the heads too".

To clarify: I'm _not_ suggesting actually using the "pack-file"
representation itself for the references.

I'm saying that we could have something like a .git/refs-packed file, which
could be (for example) just a plain linear text-file, of the form

	19ed368b17abfb9ad5c7467ea74fd8a045d96b43	refs/heads/html
	60a6bf5f53635005f4f68d8b8a33172309193623	refs/heads/maint
	2d32e76f893b2ac432201ce7596c8bab691691e6	refs/heads/man
	a41fae9c46a4cb5e59cc1a17d19f6a3a6cbfbb2f	refs/heads/master
	61af0aaf26e003a61848e551bbd57e78e94eacdc	refs/heads/next
	585729203fef6ade64923277e7151b2e3a4ca330	refs/heads/pu
	997283a8e87179b5b87a909686869d7843c8e19a	refs/heads/todo
	a0e7d36193b96f552073558acf5fcc1f10528917	refs/tags/junio-gpg-pub
	d6602ec5194c87b0fc87103ca4d67251c76f233a	refs/tags/v0.99
	f25a265a342aed6041ab0cc484224d9ca54b6f41	refs/tags/v0.99.1
	c5db5456ae3b0873fc659c19fafdde22313cc441	refs/tags/v0.99.2
	7ceca275d047c90c0c7d5afb13ab97efdf51bd6e	refs/tags/v0.99.3
	b3e9704ecdf48869f635f0aa99ddfb513f885aff	refs/tags/v0.99.4
	07e38db6a5a03690034d27104401f6c8ea40f1fc	refs/tags/v0.99.5
	f12e22d4c12c3d0263fa681f25c06569f643da0f	refs/tags/v0.99.6
	f8696fcd2abc446a5ccda3e414b731eff2a7e981	refs/tags/v0.99.7
	1094cf40f7029f803421c1dcc971238507c830c5	refs/tags/v0.99.7a
	da30c6c39cd3b048952a15929c5440acfd71b912	refs/tags/v0.99.7b
	9165ec17fde255a1770886189359897dbb541012	refs/tags/v0.99.7c
	02b2acff8bafb6d73c6513469cdda0c6c18c4138	refs/tags/v0.99.7d
	...

i.e. it would contain just a linear file with the "<hex><tab><refname>"
format.
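To make the proposed file format concrete, here is a minimal sketch of reading such a file into memory. It is only an illustration of the "<hex><tab><refname>" idea: the function name parse_refs_packed is hypothetical, and accepting any run of whitespace as the separator is an assumption made for leniency, not something the proposal specifies.

```python
import re

# A SHA1 written as 40 lowercase hex digits.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


def parse_refs_packed(text):
    """Parse '<hex><tab><refname>' lines into a dict of refname -> hex.

    Hypothetical helper for illustration; blank lines are skipped and
    malformed lines raise ValueError.
    """
    refs = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        # Split on the first run of whitespace (assumption: normally a tab).
        parts = line.split(None, 1)
        if len(parts) != 2 or not SHA1_RE.fullmatch(parts[0]):
            raise ValueError("malformed refs-packed line %d: %r" % (lineno, line))
        sha, refname = parts[0], parts[1].strip()
        refs[refname] = sha
    return refs
```

Calling parse_refs_packed() on the contents of the file gives a plain dictionary, so looking up "refs/heads/master" is a single key access once the few kB of text have been read.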

Then, the way to look up a reference would be:

 - look it up in the traditional loose file

 - if it exists, and contains zeros (or not a hex value), it's considered
   a "negative entry", and the branch doesn't exist

 - otherwise, if it's a good SHA1, that's the result

 - if it's not there, look it up in the ".git/refs-packed" file by just
   doing a simple linear scan (trivial, and actually efficient - we're
   talking about just a few kB of memory after all, and the _cost_ is
   actually the IO, where "simple linear scan" is actually very good for
   performance).

The end result would be that we'd probably have very few loose references
(we'd get them whenever we change a ref, or delete one), making the lookup
scale better. The big _bulk_ of the references tend to be very stable,
notably they are tags that seldom - if ever - change, and would thus stay
just in the packed refs file.

So the normal situation would be that you'd have a few hundred (maybe, for
a bigger project) refs in the single .git/refs-packed file, totalling a few
kB of disk-space, and then you might have a handful of "active" heads that
are in the traditional single-file format .git/refs/<filename> because they
are being actively modified and have changed since the last repack.

I bet it would work fairly well. But somebody would need to implement it.

The good news is that the refs-handling code tends to be _fairly_ well
abstracted out, because we already wanted that for the logging thing. So we
hopefully already don't actually access the loose file objects by hand from
shell scripts any more - we use git-rev-parse and git-update-ref etc to
look up refnames, and that all goes back to git/refs.c.

So _most_ of the bulk of it would probably be in refs.c, but there are
obviously also things like git-branch.sh that need to be taught the new
rules about deleting branches etc.

Anybody want to try the above rules out?
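
For anybody who does want to try them, the lookup rules above can be sketched roughly as follows. This is not how refs.c does or would do it, just a small model of the rules: resolve_ref and its arguments are hypothetical names, the all-zeros value stands in for the "negative entry", and symbolic refs and locking are deliberately left out.

```python
import os
import re

SHA1_RE = re.compile(r"[0-9a-f]{40}")
NULL_SHA1 = "0" * 40  # assumed marker for a deleted ("negative") ref


def resolve_ref(git_dir, refname):
    """Return the SHA1 for refname, or None if the ref does not exist.

    Hypothetical sketch: the loose file wins, refs-packed is the fallback.
    """
    loose_path = os.path.join(git_dir, *refname.split("/"))
    if os.path.isfile(loose_path):
        with open(loose_path) as f:
            value = f.read().strip()
        # Zeros or a non-hex value is a negative entry: the ref is gone,
        # and the packed file must NOT be consulted.
        if value == NULL_SHA1 or not SHA1_RE.fullmatch(value):
            return None
        return value

    # No loose file: simple linear scan of the packed file.
    packed_path = os.path.join(git_dir, "refs-packed")
    if os.path.isfile(packed_path):
        with open(packed_path) as f:
            for line in f:
                parts = line.split(None, 1)
                if len(parts) == 2 and parts[1].strip() == refname:
                    return parts[0]
    return None
```

The point the sketch tries to show is the ordering: a loose file always shadows the packed entry, which is why deleting a packed ref has to leave a negative entry behind instead of just removing the loose file.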

I bet the _only_ real issue would be "for_each_ref()", where it's important
to _first_ do the loose objects, and remember them all, so that you do
_not_ show the refs that are in the .git/refs-packed file and already got
shown because they were loose.

NOTE! It's important that whatever scheme is used gets locking right. The
above suggestion gets it right simply because it doesn't really _change_
anything. Any new or modified ref ends up using the old code, and using a
".lock" file and renaming it automatically does the same thing it always
did.

		Linus