[PATCH v5 0/4] Warn about git-filter-branch usage and avoid it

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



It's been about 5 days with no further feedback, other than some timings
from Dscho for Windows showing that my fixes help there too.  So, I did
one last re-read, made a couple small wording tweaks, and am resending as
ready for inclusion.

Changes since v4:
  * Included the windows timings from Dscho in the commit messages for
    the first two perf patches
  * A few slight wording tweaks to the manpage

Elijah Newren (4):
  t6006: simplify and optimize empty message test
  t3427: accelerate this test by using fast-export and fast-import
  Recommend git-filter-repo instead of git-filter-branch
  t9902: use a non-deprecated command for testing

 Documentation/git-fast-export.txt   |   6 +-
 Documentation/git-filter-branch.txt | 273 +++++++++++++++++++++++++---
 Documentation/git-gc.txt            |  17 +-
 Documentation/git-rebase.txt        |   3 +-
 Documentation/git-replace.txt       |  10 +-
 Documentation/git-svn.txt           |  10 +-
 Documentation/githooks.txt          |  10 +-
 contrib/svn-fe/svn-fe.txt           |   4 +-
 git-filter-branch.sh                |  13 ++
 t/t3427-rebase-subtree.sh           |  24 ++-
 t/t6006-rev-list-format.sh          |   5 +-
 t/t9902-completion.sh               |  12 +-
 12 files changed, 310 insertions(+), 77 deletions(-)

Range-diff:
1:  7ddbeea2ca ! 1:  ccea0e5846 t6006: simplify and optimize empty message test
    @@ Commit message
         Despite only being one piece of the 71st test and there being 73 tests
         overall, this small change to just this one test speeds up the overall
         execution time of t6006 (as measured by the best of 3 runs of `time
    -    ./t6006-rev-list-format.sh`) by about 11% on Linux and by 13% on
    -    Mac.
    +    ./t6006-rev-list-format.sh`) by about 11% on Linux, 13% on Mac, and
    +    about 15% on Windows.
     
         Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
     
2:  e1e63189c1 ! 2:  6d73135006 t3427: accelerate this test by using fast-export and fast-import
    @@ Commit message
     
         fast-export and fast-import can easily handle the simple rewrite that
         was being done by filter-branch, and should be significantly faster on
    -    systems with a slow fork.  Timings from before and after on two laptops
    -    that I have access to (measured via `time ./t3427-rebase-subtree.sh`,
    -    i.e. including everything in this test -- not just the filter-branch or
    -    fast-export/fast-import pair):
    +    systems with a slow fork.  Timings from before and after on a few
    +    laptops that I or others measured on (measured via `time
    +    ./t3427-rebase-subtree.sh`, i.e. including everything in this test --
    +    not just the filter-branch or fast-export/fast-import pair):
     
    -       Linux:  4.305s -> 3.684s (~17% speedup)
    -       Mac:   10.128s -> 7.038s (~30% speedup)
    +       Linux:    4.305s -> 3.684s (~17% speedup)
    +       Mac:     10.128s -> 7.038s (~30% speedup)
    +       Windows:  1m 37s -> 1m 17s (~26% speedup)
     
         Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
     
3:  ed6505584f ! 3:  2f225c8697 Recommend git-filter-repo instead of git-filter-branch
    @@ Documentation/git-filter-branch.txt: warned.
     +document or provide to a coworker, who then runs them on a different OS
     +where the same commands are not working/tested (some examples in the
     +git-filter-branch manpage are also affected by this).  BSD vs. GNU
    -+userland differences can really bite.  If you're lucky, you get ugly
    -+error messages spewed.  But just as likely, the commands either don't do
    -+the filtering requested, or silently corrupt making some unwanted
    -+change.  The unwanted change may only affect a few commits, so it's not
    -+necessarily obvious either.  (The fact that problems won't necessarily
    -+be obvious means they are likely to go unnoticed until the rewritten
    -+history is in use for quite a while, at which point it's really hard to
    -+justify another flag-day for another rewrite.)
    ++userland differences can really bite.  If lucky, error messages are
    ++spewed.  But just as likely, the commands either don't do the filtering
    ++requested, or silently corrupt by making some unwanted change.  The
    ++unwanted change may only affect a few commits, so it's not necessarily
    ++obvious either.  (The fact that problems won't necessarily be obvious
    ++means they are likely to go unnoticed until the rewritten history is in
    ++use for quite a while, at which point it's really hard to justify
    ++another flag-day for another rewrite.)
     +
     +* Filenames with spaces are often mishandled by shell snippets since
     +they cause problems for shell pipelines.  Not everyone is familiar with
     +find -print0, xargs -0, git-ls-files -z, etc.  Even people who are
    -+familiar with these may assume such needs are not relevant because
    ++familiar with these may assume such flags are not relevant because
     +someone else renamed any such files in their repo back before the person
    -+doing the filtering joined the project.  And, often, even those familiar
    -+with handling arguments with spaces my not do so just because they
    ++doing the filtering joined the project.  And often, even those familiar
    ++with handling arguments with spaces may not do so just because they
     +aren't in the mindset of thinking about everything that could possibly
     +go wrong.
     +
     +* Non-ascii filenames can be silently removed despite being in a desired
    -+directory.  The desire to select paths to keep often use pipelines like
    ++directory.  Keeping only wanted paths is often done using pipelines like
     +`git ls-files | grep -v ^WANTED_DIR/ | xargs git rm`.  ls-files will
    -+only quote filenames if needed so folks may not notice that one of the
    -+files didn't match the regex, again until it's much too late.  Yes,
    -+someone who knows about core.quotePath can avoid this (unless they have
    -+other special characters like \t, \n, or "), and people who use ls-files
    -+-z with something other than grep can avoid this, but that doesn't mean
    -+they will.
    ++only quote filenames if needed, so folks may not notice that one of the
    ++files didn't match the regex (at least not until it's much too late).
    ++Yes, someone who knows about core.quotePath can avoid this (unless they
    ++have other special characters like \t, \n, or "), and people who use
    ++ls-files -z with something other than grep can avoid this, but that
    ++doesn't mean they will.
     +
     +* Similarly, when moving files around, one can find that filenames with
     +non-ascii or special characters end up in a different directory, one
    @@ Documentation/git-filter-branch.txt: warned.
     +the same name, no warning or error is provided; git-filter-branch simply
     +overwrites each tag in some undocumented pre-defined order resulting in
     +only one tag at the end.  (A git-filter-branch regression test requires
    -+this.)
    ++this surprising behavior.)
     +
    -+Also, the poor performance of git-filter-branch often leads to safety issues:
    ++Also, the poor performance of git-filter-branch often leads to safety
    ++issues:
     +
     +* Coming up with the correct shell snippet to do the filtering you want
     +is sometimes difficult unless you're just doing a trivial modification
4:  ca8e124cb3 = 4:  048eba375b t9902: use a non-deprecated command for testing
-- 
2.23.0.39.gf92d9de5c3




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux