On Sun, Sep 23, 2018 at 03:53:38PM +0000, brian m. carlson wrote:

> I suspect you're gaining speed mostly because you're running three
> processes total instead of at least one process (sh) per commit. So I
> don't think there's anything that Git can do to make this faster on our
> end without a redesign.

It's not just the process startup overhead that makes it faster. Using
multiple processes means they have to communicate somehow. In this case,
git-read-tree is writing out the whole index for each commit, which
git-rm reads in and modifies, and then git-commit-tree finally converts
back to a tree. In addition to the raw CPU of that work, there's a bunch
of latency as each step is performed serially.

Whereas in the proposed pipeline, fast-export is writing out a diff and
fast-import is turning that directly back into tree objects. And both
processes are proceeding independently, so you benefit from multiple
cores.

Which isn't to say I really disagree with "Git can't really make this
faster". filter-branch has a ton of power to let you replay arbitrary
commands (including non-Git commands!), so the speed tradeoff in its
approach is very intentional. If we could modify the index in-place that
would probably make it a little faster, but that probably counts as
"redesign" in your statement. ;)

-Peff