On Thu, Aug 9, 2018 at 10:16 AM Ben Peart <peartben@xxxxxxxxx> wrote: > In fact, in the other [1] patch series, we're detecting the number of > cache entries that are the same as the cache tree and using that to > traverse_by_cache_tree(). At that point, couldn't we copy the > corresponding cache tree entries over to the destination so that those > don't have to get recreated in the later call to cache_tree_update()? We could. But as I stated in another mail, I saw this cache-tree optimization as a separate problem and didn't want to mix them up. That way cache_tree_copy() could be used elsewhere if the opportunity shows up. Mixing them up could also complicate the problem. You you merge stuff, you add new cache-entries to o->result with add_index_entry() which tries to invalidate those paths in o->result's cache-tree. Right now the cache-tree is empty so it's really no-op. But if you copy cache-tree over while merging, that invalidation might either invalidate your newly copied cache-tree, or get slowed down because non-empty o->result's cache-tree means you start to need to walk it to find if there's any path to invalidate. PS. This code keeps messing me up. invalidate_ce_path() may also invalidate cache-tree in the _source_ index. For this optimization to really shine, you better keep the the original cache-tree intact (so that you can reuse as much as possible). I don't see the purpose of this source cache tree invalidation at all. My guess at this point is Linus actually made a mistake in 34110cd4e3 (Make 'unpack_trees()' have a separate source and destination index - 2008-03-06) and he should have invalidated _destination_ index instead of the source one. I'm going to dig in some more and probably will send a patch to remove this invalidation. -- Duy