On Mon, Jun 2, 2008 at 7:50 AM, Geoffrey Irving <irving@xxxxxxx> wrote:
> On Sun, Jun 1, 2008 at 11:13 PM, Johannes Schindelin
> <Johannes.Schindelin@xxxxxx> wrote:
>> Hi,
>>
>> On Sun, 1 Jun 2008, Geoffrey Irving wrote:
>>
>>> The dominant cost of git-cherry is the computation of patch-ids for each
>>> relevant commit. Once computed, these pairs are now stored in a hash
>>> table in $GIT_DIR/patch-id-cache to speed up repeated invocations.
>>>
>>> The basic structure of patch-id-cache.c was cannibalized from Johannes
>>> Schindelin's notes-index structure, though most of the code was
>>> rewritten. The hash table is still kept strictly sorted by commit, but
>>> the entire table is now read into memory.
>>
>> I do not think that this "read-the-entire-table-into-memory" paradigm is a
>> wise choice. mmap()ing, I would have understood, but reading a potentially
>> pretty large table into memory?
>
> I'll switch to mmapping.

The git_mmap function in compat/mmap.c dies if NO_MMAP is defined and
the map isn't MAP_PRIVATE. If I want to write an entry, will the
memory be automatically updated if I write directly to the file
descriptor (I haven't used mmap before)?

Also, do you think it's okay to write directly into the mmap'ed memory
for every insertion, or should I try to be fancier? Immediate writing
would simplify the code a lot, and I don't think there's a significant
performance issue since computing an entry is expensive.

Thanks,
Geoffrey
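
For reference, the "write directly into the mmap'ed memory for every insertion" option could look roughly like the C sketch below. This is not the patch-id-cache.c code. The entry layout, the fixed slot count, and the names cache_open, cache_insert and cache_close are all hypothetical, and the sketch calls POSIX mmap() directly instead of git's wrappers. It assumes a MAP_SHARED mapping, since stores into a MAP_PRIVATE mapping are never carried through to the file.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define ID_LEN 20        /* SHA-1 size in bytes */
#define CACHE_SLOTS 64   /* hypothetical fixed table size */

/* Hypothetical on-disk entry: a commit id paired with its patch-id. */
struct cache_entry {
	unsigned char commit[ID_LEN];
	unsigned char patch_id[ID_LEN];
};

#define CACHE_BYTES (CACHE_SLOTS * sizeof(struct cache_entry))

struct cache {
	int fd;
	struct cache_entry *table;   /* MAP_SHARED view of the file */
};

/* Open (or create) the cache file and map it read/write. Returns 0 on success. */
static int cache_open(struct cache *c, const char *path)
{
	void *map;

	c->fd = open(path, O_RDWR | O_CREAT, 0666);
	if (c->fd < 0)
		return -1;
	/* The file must be at least as long as the region we map. */
	if (ftruncate(c->fd, (off_t)CACHE_BYTES) < 0) {
		close(c->fd);
		return -1;
	}
	map = mmap(NULL, CACHE_BYTES, PROT_READ | PROT_WRITE,
		   MAP_SHARED, c->fd, 0);
	if (map == MAP_FAILED) {
		close(c->fd);
		return -1;
	}
	c->table = map;
	return 0;
}

/* Store one entry by writing straight into the mapped memory. */
static void cache_insert(struct cache *c, size_t slot,
			 const unsigned char *commit,
			 const unsigned char *patch_id)
{
	assert(slot < CACHE_SLOTS);
	memcpy(c->table[slot].commit, commit, ID_LEN);
	memcpy(c->table[slot].patch_id, patch_id, ID_LEN);
}

/* Flush dirty pages to the file, then release the mapping and descriptor. */
static int cache_close(struct cache *c)
{
	int ret = msync(c->table, CACHE_BYTES, MS_SYNC);

	if (munmap(c->table, CACHE_BYTES) < 0)
		ret = -1;
	if (close(c->fd) < 0)
		ret = -1;
	return ret;
}
```

With this shape an insertion is just two memcpy() calls, and the kernel writes the dirty pages back; the msync() in cache_close only forces that to happen before the function returns. Going the other way, mixing write() on the descriptor with an existing mapping, is less clear-cut: POSIX leaves it to the application to synchronize mmap() with other file access methods, so it is probably safer to pick one path and stick to it. Note also that this relies on MAP_SHARED, which is exactly the case the message above says git_mmap refuses when NO_MMAP is defined.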