On Tue, May 29, 2007 at 09:06:29AM +0200, Johan Herland wrote: > 1. Keep a file, ".git/reverse_tagmap_sorted" with one entry of the form > "pointee pointer" per line. The file is sorted on "pointee", so we can > easily do the radix-256-fan-out-followed-by-binary-search trick that > Linus mentioned in another thread. This should hopefully make lookup > fairly cheap. BTW, if there is a similar "pointee pointer"-type format > already being used in git, I'd be happy to use that instead. I looked I did a similar thing (though rather than having "lines" at all, they were fixed-length pairs of binary sha1 hashes) to implement a negative delta cache (which turned out to be a stupid idea, but the implementation worked): http://www.gelato.unsw.edu.au/archives/git/0606/23229.html > 2. Keep another file, ".git/reverse_tagmap_unsorted" in front of (1). > This file has exactly the same format, minus the sorting. It exists just > to make insertion cheap. Once this file reaches a certain size (i.e. > when trawling it on lookup becomes slightly painful), we shuffle the > entries into the sorted file (this would happen automatically on > insertion of an entry, and should _not_ have to be triggered by 'git-gc' > etc.). The implementation I mentioned above collects several "to be inserted" entries (in memory) and then merge sorts the two lists into a new file. So it was fast in terms of comparisons, but it involved writing O(n) entries, which is probably bad for creating a single note. > Of course, if we think insertion directly into (1) will never be too > expensive, we can drop (2) altogether. You can always find the right spot, memmove everything down one slot, and insert. But that means: - the cost of insertion will be proportional to the number of items, whereas using an unsorted journal you get to amortize that cost over many insertions - if you update in place, you have to lock the db for reading while you are moving around. If you are always either appending to the journal or merging into a new file, you can avoid this. -Peff - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html