On Tue, Dec 09, 2014 at 07:52:33PM +0100, Henning Moll wrote:

> i am running this command
>
> git filter-branch --env-filter 'export
> GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
> GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"'
> --prune-empty --tag-name-filter cat -- --all
>
> in a repository which i copied to /dev/shm before. According to "top", the
> git process only consumes about 5 percent of the CPU. The load is between
> 0.70 and 1.00.
>
> I assume that there is a lot of process forking going on. Could that be the
> cause?

Yes. filter-branch is a shell script, and it is probably running
multiple git commands for each commit it filters.

> Any ideas how to further improve?

In your case you are not touching the tree contents at all. Last time I
looked into this, I believe that filter-branch always loaded the index
for each commit, even if no --index-filter is being used. So teaching
filter-branch to optimize this case would be one strategy.

Another is to try using "git fast-export | git fast-import", and munging
the data stream in between. That may be more work, depending on how fancy
you want to get with accurate parsing (look into fast-export's --no-data,
which omits blob data; that should make things faster and make hacky
context-less parsing less likely to cause problems).

-Peff
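
To make the fast-export route a bit more concrete, here is a minimal sketch of a filter that could sit between the two commands. It is not tested against a real repository. The function name and the "--stdin" switch are made up for illustration. It assumes the stream uses only the counted "data <n>" form (the delimited form would make it raise an error), and it copies each such payload verbatim so that a commit message containing a line like "committer ..." is left alone.

```python
import sys


def sync_committer_to_author(src, dst):
    """Copy a fast-export stream from src to dst (binary file objects),
    replacing each commit's committer line with its author identity and date."""
    author = None
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Counted payload (commit or tag message): pass it through
            # untouched so message text is never parsed as a header.
            dst.write(line)
            size = int(line[len(b"data "):].strip())
            dst.write(src.read(size))
            continue
        if line.startswith(b"commit "):
            # New commit: forget any author seen earlier.
            author = None
        elif line.startswith(b"author "):
            author = line[len(b"author "):]
        elif line.startswith(b"committer ") and author is not None:
            line = b"committer " + author
            author = None
        dst.write(line)


# Hypothetical switch: only act as a stdin/stdout filter when asked to,
# so running or importing the file without arguments does nothing.
if __name__ == "__main__" and sys.argv[1:] == ["--stdin"]:
    sync_committer_to_author(sys.stdin.buffer, sys.stdout.buffer)
```

Saved as, say, sync_committer.py (a hypothetical name), it might be used like this:

    git fast-export --no-data --all |
    python3 sync_committer.py --stdin |
    git fast-import --force

Since --no-data refers to blobs by their original ids, the import has to run in a repository that already contains those objects. The --force is my assumption about what is needed to let fast-import move the existing refs onto the rewritten commits, so try this on a throwaway copy first. Note also that this sketch does nothing equivalent to --prune-empty.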