SUGGESTED FOR 'PU':

Traversing objects is currently very costly, as every commit and tree must be
loaded and parsed.  Much time and energy could be saved by caching metadata and
topological info in an efficient, easily accessible manner.  Furthermore, this
could improve git's interfacing potential by providing a condensed summary of
a repository's commit tree.

This is a series to implement such a revision caching mechanism, aptly named
rev-cache.  The series will provide:
- a core API to manipulate and traverse caches
- an integration into the internal revision walker
- a porcelain front-end providing access to users and (shell) applications
- a series of tests to verify/demonstrate correctness
- documentation of the API, porcelain and core concepts

On cold starts rev-cache has sped up packing and walking by a factor of 4, and
over twice that on warm starts.  Some timings on slax for the linux repository:

rev-list --all --objects >/dev/null
    default
        cold    1:13
        warm    0:43
    rev-cache'd
        cold    0:19
        warm    0:02

pack-objects --revs --all --stdout >/dev/null
    default
        cold    2:44
        warm    1:21
    rev-cache'd
        cold    0:44
        warm    0:10

The mechanism is minimally intrusive: most of the changes take place in
separate files, and only a handful of git's existing functions are modified.

Hope you find this useful.

 - Nick

---
What I've changed in this revision set:
- revise and add much to the documentation
- add support for cache pointers, sorta like object alternates
- change --noobjects to --no-objects
- default --ignore-size to revcache.ignoresize if set, 50MB if not
  (pack.windowmemory tended to be too large and too variable for slice usage)
- change init_rci to init_rev_cache_info
- modify make_cache_slice to send back actual starts/ends
- change coag_ to fuse_
- prefix structures with rev-cache-specific identifier
- increase size of merge_nr (split_nr?)
- replace paths_to_dec and children_to_close with single tracking stack in
  path generation
- add fuse to gc based on configuration variable gc.revcache
- bailout on obscenely large merges/branches (i.e. more than we can handle)
- tweak struct bitfields for greater portability
- replace parse_size with git's version
- revise fuse to directly use object stores rather than load them into memory
- move structures to own header
- fix permissions
- clean up patchset

I didn't completely remove the bitfields from the structures, but altered them
to each fit in a single byte.  Completely removing them would cause a lot of
trouble, and I figure eliminating the byte-overlap would make for sufficient
portability for storage that's supposed to be transient anyway.

 Documentation/git-rev-cache.txt       |  144 +++
 Documentation/technical/rev-cache.txt |  594 +++++++++
 Makefile                              |    2 +
 builtin-gc.c                          |    9 +
 builtin-rev-cache.c                   |  322 +++++
 builtin.h                             |    1 +
 commit.c                              |    2 +
 git.c                                 |    1 +
 list-objects.c                        |   49 +-
 rev-cache.c                           | 2217 +++++++++++++++++++++++++++++++++
 rev-cache.h                           |  100 ++
 revision.c                            |   89 ++-
 revision.h                            |   44 +-
 t/t6015-rev-cache-list.sh             |  251 ++++
 tree.h                                |    1 +
 15 files changed, 3801 insertions(+), 25 deletions(-)
 create mode 100644 Documentation/git-rev-cache.txt
 create mode 100644 Documentation/technical/rev-cache.txt
 create mode 100644 builtin-rev-cache.c
 create mode 100644 rev-cache.c
 create mode 100644 rev-cache.h
 create mode 100755 t/t6015-rev-cache-list.sh
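
P.S. As a sanity check on the speedup figures quoted with the timings earlier
in this message, here is a minimal Python sketch that recomputes the ratios
from the m:ss values listed there.  It is not part of the series, and the
helper names (to_seconds, speedup) are made up purely for illustration.

```python
# Recompute the speedups from the m:ss timings quoted in this message.
# Illustrative only: to_seconds() and speedup() are hypothetical helpers,
# not anything provided by git or by the rev-cache series.

def to_seconds(stamp):
    """Convert an 'm:ss' wall-clock string into a number of seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(default, cached):
    """Ratio of the default run time to the rev-cache'd run time."""
    return to_seconds(default) / to_seconds(cached)


# (command, start type) -> (default time, rev-cache'd time), as quoted above.
TIMINGS = {
    ("rev-list", "cold"): ("1:13", "0:19"),
    ("rev-list", "warm"): ("0:43", "0:02"),
    ("pack-objects", "cold"): ("2:44", "0:44"),
    ("pack-objects", "warm"): ("1:21", "0:10"),
}

if __name__ == "__main__":
    for (command, start), (default, cached) in TIMINGS.items():
        print(f"{command:12s} {start}: {speedup(default, cached):.1f}x")
```

With the numbers above this works out to a little under 4x for both cold runs,
and roughly 21x (rev-list) and 8x (pack-objects) for the warm runs.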