It's been about 5 days with no further feedback, other than some timings from Dscho for Windows showing that my fixes help there too. So, I did one last re-read, made a couple small wording tweaks, and am resending as ready for inclusion. Changes since v4: * Included the windows timings from Dscho in the commit messages for the first two perf patches * A few slight wording tweaks to the manpage Elijah Newren (4): t6006: simplify and optimize empty message test t3427: accelerate this test by using fast-export and fast-import Recommend git-filter-repo instead of git-filter-branch t9902: use a non-deprecated command for testing Documentation/git-fast-export.txt | 6 +- Documentation/git-filter-branch.txt | 273 +++++++++++++++++++++++++--- Documentation/git-gc.txt | 17 +- Documentation/git-rebase.txt | 3 +- Documentation/git-replace.txt | 10 +- Documentation/git-svn.txt | 10 +- Documentation/githooks.txt | 10 +- contrib/svn-fe/svn-fe.txt | 4 +- git-filter-branch.sh | 13 ++ t/t3427-rebase-subtree.sh | 24 ++- t/t6006-rev-list-format.sh | 5 +- t/t9902-completion.sh | 12 +- 12 files changed, 310 insertions(+), 77 deletions(-) Range-diff: 1: 7ddbeea2ca ! 1: ccea0e5846 t6006: simplify and optimize empty message test @@ Commit message Despite only being one piece of the 71st test and there being 73 tests overall, this small change to just this one test speeds up the overall execution time of t6006 (as measured by the best of 3 runs of `time - ./t6006-rev-list-format.sh`) by about 11% on Linux and by 13% on - Mac. + ./t6006-rev-list-format.sh`) by about 11% on Linux, 13% on Mac, and + about 15% on Windows. Signed-off-by: Elijah Newren <newren@xxxxxxxxx> 2: e1e63189c1 ! 2: 6d73135006 t3427: accelerate this test by using fast-export and fast-import @@ Commit message fast-export and fast-import can easily handle the simple rewrite that was being done by filter-branch, and should be significantly faster on - systems with a slow fork. Timings from before and after on two laptops - that I have access to (measured via `time ./t3427-rebase-subtree.sh`, - i.e. including everything in this test -- not just the filter-branch or - fast-export/fast-import pair): + systems with a slow fork. Timings from before and after on a few + laptops that I or others measured on (measured via `time + ./t3427-rebase-subtree.sh`, i.e. including everything in this test -- + not just the filter-branch or fast-export/fast-import pair): - Linux: 4.305s -> 3.684s (~17% speedup) - Mac: 10.128s -> 7.038s (~30% speedup) + Linux: 4.305s -> 3.684s (~17% speedup) + Mac: 10.128s -> 7.038s (~30% speedup) + Windows: 1m 37s -> 1m 17s (~26% speedup) Signed-off-by: Elijah Newren <newren@xxxxxxxxx> 3: ed6505584f ! 3: 2f225c8697 Recommend git-filter-repo instead of git-filter-branch @@ Documentation/git-filter-branch.txt: warned. +document or provide to a coworker, who then runs them on a different OS +where the same commands are not working/tested (some examples in the +git-filter-branch manpage are also affected by this). BSD vs. GNU -+userland differences can really bite. If you're lucky, you get ugly -+error messages spewed. But just as likely, the commands either don't do -+the filtering requested, or silently corrupt making some unwanted -+change. The unwanted change may only affect a few commits, so it's not -+necessarily obvious either. (The fact that problems won't necessarily -+be obvious means they are likely to go unnoticed until the rewritten -+history is in use for quite a while, at which point it's really hard to -+justify another flag-day for another rewrite.) ++userland differences can really bite. If lucky, error messages are ++spewed. But just as likely, the commands either don't do the filtering ++requested, or silently corrupt by making some unwanted change. The ++unwanted change may only affect a few commits, so it's not necessarily ++obvious either. (The fact that problems won't necessarily be obvious ++means they are likely to go unnoticed until the rewritten history is in ++use for quite a while, at which point it's really hard to justify ++another flag-day for another rewrite.) + +* Filenames with spaces are often mishandled by shell snippets since +they cause problems for shell pipelines. Not everyone is familiar with +find -print0, xargs -0, git-ls-files -z, etc. Even people who are -+familiar with these may assume such needs are not relevant because ++familiar with these may assume such flags are not relevant because +someone else renamed any such files in their repo back before the person -+doing the filtering joined the project. And, often, even those familiar -+with handling arguments with spaces my not do so just because they ++doing the filtering joined the project. And often, even those familiar ++with handling arguments with spaces may not do so just because they +aren't in the mindset of thinking about everything that could possibly +go wrong. + +* Non-ascii filenames can be silently removed despite being in a desired -+directory. The desire to select paths to keep often use pipelines like ++directory. Keeping only wanted paths is often done using pipelines like +`git ls-files | grep -v ^WANTED_DIR/ | xargs git rm`. ls-files will -+only quote filenames if needed so folks may not notice that one of the -+files didn't match the regex, again until it's much too late. Yes, -+someone who knows about core.quotePath can avoid this (unless they have -+other special characters like \t, \n, or "), and people who use ls-files -+-z with something other than grep can avoid this, but that doesn't mean -+they will. ++only quote filenames if needed, so folks may not notice that one of the ++files didn't match the regex (at least not until it's much too late). ++Yes, someone who knows about core.quotePath can avoid this (unless they ++have other special characters like \t, \n, or "), and people who use ++ls-files -z with something other than grep can avoid this, but that ++doesn't mean they will. + +* Similarly, when moving files around, one can find that filenames with +non-ascii or special characters end up in a different directory, one @@ Documentation/git-filter-branch.txt: warned. +the same name, no warning or error is provided; git-filter-branch simply +overwrites each tag in some undocumented pre-defined order resulting in +only one tag at the end. (A git-filter-branch regression test requires -+this.) ++this surprising behavior.) + -+Also, the poor performance of git-filter-branch often leads to safety issues: ++Also, the poor performance of git-filter-branch often leads to safety ++issues: + +* Coming up with the correct shell snippet to do the filtering you want +is sometimes difficult unless you're just doing a trivial modification 4: ca8e124cb3 = 4: 048eba375b t9902: use a non-deprecated command for testing -- 2.23.0.39.gf92d9de5c3