On Sun, Jun 1, 2008 at 11:42 PM, Jeff King <peff@xxxxxxxx> wrote: > On Mon, Jun 02, 2008 at 07:13:14AM +0100, Johannes Schindelin wrote: > >> I do not think that this "read-the-entire-table-into-memory" paradigm is a >> wise choice. mmap()ing, I would have understood, but reading a potentially >> pretty large table into memory? > > When I was just a git-youth, I wrote a fast mmap-based cache for storing > SHA1 pairs. It might give some direction. You should be able to find it > here: > > http://mid.gmane.org/20060629035849.GA30749@xxxxxxxxxxxxxxxxxxxxxxx > > It mmaps and binary searches a sorted list. New entries are added to an > in-memory list, and then at the end of a run, the two sorted lists are > merged to create the new on-disk version. I don't need sorting (and neither did you), so I think a hash table is better (O(1) instead of O(log n), and we don't even need to compute hash keys. I'll leave it up to you and Dscho (or anyone else who cares to chime in) which one you think I should do. Geoffrey -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html