On Tue, Jun 20, 2017 at 02:46:22PM +0200, Lars Schneider wrote:

> > On 20 Jun 2017, at 14:32, <paul.mattke@xxxxxxx> <paul.mattke@xxxxxxx> wrote:
> >
> > Well this is a possibility, of course. Our problem is that our SVN
> > repository contains about 220.000 revisions currently. As a colleague of
> > mine said that the command you suggest might take about 4 seconds per
> > revision, it would take about 10 days to do this for our whole repository.
> > So of course it could save a lot of time generally if such operation could
> > be done immediately during git-svn.
>
> Your colleague is most likely correct. I suggested it as this is a one time
> operation and therefore still somewhat practical from my point of view.

I didn't follow this whole thread, but I happened to see this bit. I
think the command in question is:

  git filter-branch -f --msg-filter 'perl -lape "s/^T(\d+)/#\$1/"'

I know filter-branch is slow, but a msg-filter should be relatively
fast. I'd be surprised at 4 seconds per revision (the main cost is
kicking off a new perl process per revision). It's more like 120/sec on
my machine.

However, I think the fastest way would be to do it with fast-export,
where you can just tweak the stream as it flows through:

  # set up a new repo to hold the results; we won't bother
  # copying the blobs, so just point at the current repo as an
  # alternate.
  git init fixed-repo
  echo "../../../.git/objects" >fixed-repo/.git/objects/info/alternates

  git fast-export --no-data --all |
  perl -ne '
    # look for "data" chunks which contain the commit message
    if (/^data (\d+)/) {
      read STDIN, my $buf, $1;
      $buf =~ s/^T(\d+)/#$1/;
      print "data ", length($buf), "\n";
      print $buf;
    } else {
      print;
    }
  ' |
  git -C fixed-repo fast-import

That runs at about 3600 commits/sec on my machine. Most of that time
goes to doing a tree diff on each commit.
Technically that is not required for your use case, but I don't think
there's a way to get fast-export to skip that (and it's an inherent part
of the fast-import stream). It's probably fast enough, but it's possible
that a specialized tool like BFG repo cleaner[1] could do better (I
don't know offhand if it handles commit message rewrites or not).

-Peff

[1] https://rtyley.github.io/bfg-repo-cleaner/