I've fixed-up the series and they are now sitting near the tip of "pu" branch tonight. With some more testing I am hoping to merge it in "next" soon. I should recap the overall idea for general public here. When applying a series of patches to your working tree and making commits out of the results, what happens is roughly this: 1. Your index file exactly matches your HEAD commit; you may have local modifications to some paths in your working tree as long as they do not interfere with the patch application. 2. "git-apply --index" reads the index file, reads the patch, applies them to both index and the working tree, and writes the updated index file out. 3. "git-write-tree" reads the index file, writes the index entries out as a set of hierarchical tree objects, the top of which is wrapped in a new commit object that has the current HEAD as the parent, and the commit becomes the new HEAD. You repeat 1-3 as many times as you apply patches. When dealing with a huge tree, with a typical patch touching only a handful paths, this is not very efficient. A recent kernel source tree has about 1200 directories, each of which needs to be made into the "tree" object format in-core from the data in the index file, and we hash it to see if we already have that tree, and if not we write it out after compression (if we do have it, we do not do anything after checking). The updated code does this: (1) When git-write-tree writes out trees from the index, we store <directory, tree SHA1> pair for all the trees we compute, in .git/index.aux file. (2) When git-write-tree starts up, it reads the index file, and also reads the .git/index.aux file. The latter file contains a fingerprint of the former, so if the index was updated without updating the index.aux, what is read is discarded. However, if .git/index.aux is up-to-date, what it records can be reused when we scan the index. (3) There are a few commands that updates the index file. git-update-index is the most well-known one, but read-tree and apply also update it. No command other than apply updates index.aux when writing out index (yet). However, git-apply has been told about it. It reads index and index.aux file, and when it updates an index entry, it _invalidates_ the directory information it read from index.aux file, and writes the updated index.aux file out when it is done. Note. Earlier I sent out a fix ([PATCH 5/4]) that contained a two-line removal from apply.c; the code was doing more than just invalidating entries, but actually recomputing the tree. The fixed code does not hash tree information when git-apply writes index out. (4) When git-write-tree starts up after git-apply finishes its job, it notices index.aux file matches the fingerprint of the index (so it is up-to-date) and some entries have been _invalidated_. When it writes out the index, the parts of the directory hierarchy that have not been invalidated do not have to be formatted in-core into "tree" -- we already know what tree object they are. Only the parts that were invalidated need to be computed. For this, a new data structure "cache_tree" is introduced. The way to use the API is very simple. You initialize it like this: struct cache_tree *active_cache_tree; unsigned char sha1[20]; read_cache_1(sha1); active_cache_tree = read_cache_tree(sha1); The new function read_cache_1() is similar to the traditional read_cache(), but it gives back the index file fingerprint. This is given to read_cache_tree() to make sure index.aux is not stale. Then, when you touch a path in the cache, you invalidate the paths in the cache_tree with: cache_tree_invalidate_path(active_cache_tree, path); For example, if you are updating an index entry "fs/ext3/inode.c", it invalidates the cached tree SHA1 information for top-level tree, "fs" tree and "fs/ext3" tree. All other information you read from index.aux are still valid (IOW, three out of 1200 are invalidated). When you are done, you write out the index file and the updated index.aux file like this: if (write_cache_1(newfd, active_cache, active_nr, active_cache_sha1) || commit_index_file(&cache_file)) die("Unable to write new cachefile"); write_cache_tree(active_cache_sha1, active_cache_tree); Again, write_cache_1() has one extra parameter to receive the updated index file fingerprint, and you call write_cache_tree() with it. It updates index.aux file (with invalidated entries). One thing write-tree does different is this (note that it does not update the index file, it just reads from it): if (cache_tree_update(active_cache_tree, active_cache, active_nr, missing_ok)) die("git-write-tree: error building trees"); write_cache_tree(active_cache_sha1, active_cache_tree); The call to cache_tree_update() takes the (possibly partially invalidated) cache_tree, fills in the invalidated parts from the index file (this involves creation of new "tree" objects as needed). So the above call to write_cache_tree() writes a complete index.aux file out without any invalidated entries. Obviously, the change similar to what I did to apply.c in this series can be done to other index-updating commands. That's "low hanging fruit" for the week ;-). - : send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html