> Could we do a cache of the refs that stores the stat information for
> each of the files under .git/refs plus the sha1 that the ref points
> to?  In other words this cache would do for the refs what the index
> does for the working directory.  Reading all the refs would mean we
> still had to stat each of the files, but that's much quicker than
> reading them in the cold-cache case.  In the common case when most of
> the stat information matches, we don't have to read the file because
> we have the sha1 that the file contains right there in the cache.

Well, that could save one of two seeks, but that's not *much* quicker.
(Indeed, a git ref would fit into the 60 bytes of block pointer space
in an ext2/3 inode if regular files were stuffed there as well as
symlinks.)

> Ideally we would have two sha1 values in the cache - the sha1 in the
> file, and if that is the ID of a tag object, we would also put the
> sha1 of the commit that the tag points to in the cache.

Now that's not a bad idea.  Hacking it into Linus's scheme, that's

<foo sha>\t<foo^{} sha>\tfoo

A couple of thoughts:

1) I bet Hans Reiser is enjoying this; he's been agitating for better
   lots-of-small-files support for years.

2) Since I've written about two caches in a few minutes (here and in
   git-rev-list), a standardized cache validation hook for
   git-fsck-objects and git-prune's use might be useful.

3) If we use Linus's idea of a flat "static refs" file overridden by
   loose refs (presumably, refs would be stuffed in if their mod times
   got old enough, and on initial import you'd use the timestamp of the
   commit they point to), we'll have to do a bit of a dance to move
   refs to and from it.

Basically, to move refs into the refs file, it's
- Read all the old refs and loose refs and write the new refs file.
- Rename the new refs file into place.
- For each loose ref moved in, lock it, verify it hasn't changed, and
  delete it.
with some more locking to prevent two people from doing this at once.
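
The three steps for moving loose refs into the refs file might look roughly like this in Python. This is only a sketch under stated assumptions: the function names, the "<ref>.lock" lock-file convention, and the `peel` callback (which would return the foo^{} sha for a tag) are all made up for illustration, every loose ref is packed rather than only the old ones, and the extra locking against two concurrent packers is left out.

```python
import os
import tempfile


def read_packed(packed_path):
    """Parse 'sha TAB peeled-sha TAB name' lines into {name: (sha, peeled)}."""
    refs = {}
    try:
        with open(packed_path) as f:
            for line in f:
                sha, peeled, name = line.rstrip("\n").split("\t")
                refs[name] = (sha, peeled)
    except FileNotFoundError:
        pass  # no refs file yet
    return refs


def pack_refs(refs_dir, packed_path, peel=lambda sha: ""):
    """Move loose refs under refs_dir into packed_path (hypothetical helper).

    packed_path is assumed to live outside refs_dir.
    """
    # Step 1: read the old refs, then let the loose refs override them.
    refs = read_packed(packed_path)
    moved = {}
    for root, _dirs, files in os.walk(refs_dir):
        for fn in files:
            if fn.endswith(".lock"):
                continue  # assumed lock-file convention, not a ref
            full = os.path.join(root, fn)
            name = os.path.relpath(full, refs_dir).replace(os.sep, "/")
            with open(full) as f:
                sha = f.read().strip()
            refs[name] = (sha, peel(sha))
            moved[full] = sha

    # Step 2: write the new refs file and rename it into place.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(packed_path) or ".")
    with os.fdopen(fd, "w") as f:
        for name in sorted(refs):
            sha, peeled = refs[name]
            f.write(f"{sha}\t{peeled}\t{name}\n")
    os.replace(tmp, packed_path)

    # Step 3: lock each moved ref, verify it hasn't changed, delete it.
    for full, sha in moved.items():
        lock = full + ".lock"
        try:
            lock_fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # someone else holds the lock; leave the loose ref
        try:
            with open(full) as f:
                if f.read().strip() == sha:
                    os.remove(full)
        except FileNotFoundError:
            pass  # already gone
        finally:
            os.close(lock_fd)
            os.remove(lock)
    return refs
```

A loose ref that changed between step 1 and step 3 is simply left in place; since loose refs override the refs file, the stale packed entry does no harm.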
Folks looking up tags will do an FS search, then validate their refs
file cache, then if necessary, suck in the refs file.

Now, exploding a refs file into loose refs is tricky.  There's the
possible race condition with a reader:

A: Looks for loose ref "foo", doesn't find it.
B: Writes out loose ref "foo".
B: Deletes now-unpacked refs file.
A: Looks for refs file, doesn't find it.
A: Concludes that ref "foo" doesn't exist.

The only solution I can think of is to stat the refs file at the start
of the operation and restart from the beginning if it has changed by
the time the reader actually opens and reads it.
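
To make the stat-and-restart idea concrete, here is a rough Python sketch of a reader. The names (`lookup_ref`, `max_tries`), the choice of inode, size and mtime as the "has it changed" test, and the retry limit are assumptions for illustration; the line format is the `<sha>\t<peeled sha>\t<name>` one given earlier.

```python
import os


def _sig(st):
    """Fields used to decide whether the refs file changed (an assumption)."""
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def lookup_ref(name, refs_dir, packed_path, max_tries=10):
    """Return the sha for ref `name`, or None if it does not exist."""
    for _ in range(max_tries):
        # Stat the refs file at the start of the operation.
        try:
            before = _sig(os.stat(packed_path))
        except FileNotFoundError:
            before = None

        # A loose ref overrides the refs file.
        try:
            with open(os.path.join(refs_dir, name)) as f:
                return f.read().strip()
        except FileNotFoundError:
            pass

        # Fall back to the refs file, restarting if it changed meanwhile.
        try:
            f = open(packed_path)
        except FileNotFoundError:
            if before is None:
                return None  # no refs file at the start or now
            continue  # it vanished under us (reader A's race): restart
        with f:
            if _sig(os.fstat(f.fileno())) != before:
                continue  # replaced since the initial stat: restart
            for line in f:
                sha, _peeled, ref = line.rstrip("\n").split("\t")
                if ref == name:
                    return sha
            return None
    raise RuntimeError("refs file kept changing; giving up")
```

In the A/B interleaving above, A's initial stat sees the refs file, the later open fails, and the loop restarts, at which point A finds the loose ref that B wrote.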