On Mon, Jun 2, 2008 at 8:37 AM, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote: > Hi, > > On Mon, 2 Jun 2008, Geoffrey Irving wrote: > >> On Sun, Jun 1, 2008 at 11:42 PM, Jeff King <peff@xxxxxxxx> wrote: >> > On Mon, Jun 02, 2008 at 07:13:14AM +0100, Johannes Schindelin wrote: >> > >> >> I do not think that this "read-the-entire-table-into-memory" paradigm >> >> is a wise choice. mmap()ing, I would have understood, but reading a >> >> potentially pretty large table into memory? >> > >> > When I was just a git-youth, I wrote a fast mmap-based cache for >> > storing SHA1 pairs. It might give some direction. You should be able >> > to find it here: >> > >> > http://mid.gmane.org/20060629035849.GA30749@xxxxxxxxxxxxxxxxxxxxxxx >> > >> > It mmaps and binary searches a sorted list. New entries are added to >> > an in-memory list, and then at the end of a run, the two sorted lists >> > are merged to create the new on-disk version. >> >> I don't need sorting (and neither did you), so I think a hash table is >> better (O(1) instead of O(log n), and we don't even need to compute hash >> keys. I'll leave it up to you and Dscho (or anyone else who cares to >> chime in) which one you think I should do. > > My tests suggested that the lookup time advantage of hashes makes them a > more appropriate choice than sorted lists, if you look up often, but add > rarely. > > Another issue that just hit me: this cache is append-only, so if it grows > too large, you have no other option than to scratch and recreate it. > Maybe this needs porcelain support, too? (git gc?) If so, the correct operation is to go through the hash and remove entries that refer to commits that no longer exist. I can add this if you want. Hopefully somewhere along the way git-gc constructs an easy to traverse list of extant commits, and this will be straightforward. Geoffrey -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html