Re: [PATCH] filter-branch: resolve $commit^{tree} in no-index case

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 19, 2016 at 04:59:28PM -0500, Jeff King wrote:

> We could get away "git diff --exit-code $1 $2" to do a single process
> invocation (rather than two rev-parses), but I don't know if it is worth
> the complexity.

And here's that patch.

I'm actually a little iffy on it because it switches to "diff-tree" from
a raw-sha1 comparison. For a well-formed repo, that shouldn't matter.
But what if you had a commit that was replacing a malformed tree object,
but not otherwise changing the diff?  We might drop it as "empty", even
though you'd prefer to keep it.

I guess some alternative ways of doing the same thing are:

  1. Use "git rev-parse $1 $3^{tree}", parse the output and compare.
     Surprisingly tedious to do in a shell without invoking any extra
     sub-processes.

  2. When we do the initial rev-list, output the tree, too. Then we get
     the old tree sha1 for "free" (no extra process, but at the cost of
     slightly more I/O by the shell).

I suspect we could also cache the tree of the parent and avoid the
rev-parse in git_commit_non_empty_tree(), too.

I dunno. I'm inclined to say that none of this is worth it to try to
drop one or two processes. Writing filter-branch in a better language
(or just using BFG) would probably be a more productive use of time.
20% looks like a lot, but that's because it's pretty fast in the first
place.  The timings I showed earlier (and below) are for git.git, which
is not that huge. But the savings from 348d4f2 are really about avoiding
looking at the trees entirely; the bigger your tree, the more you save.
Running it on linux.git should show that we're still reclaiming most of
the original optimization.

So I'm inclined to go with the "conservative" fix I sent initially, and
drop this micro-optimization.

-- >8 --
Subject: [PATCH] filter-branch: optimize out rev-parse for unchanged tree

The prior commit noticed that we started feeding
"$commit^{tree}" to git_commit_non_empty_tree instead of its
resolved 40-hex sha1. It made the conservative fix of
resolving and passing the 40-hex sha1, even though it does
add some overhead.

This patch takes the less conservative choice, under the
assumption that we are unlikely to break anybody's
commit-filter by showing them the unresolved name
"$commit^{tree}" (which is probably the case, because most
people would end their filter with "git commit-tree" or
"git_commit_non_empty_tree").

We can make the comparison in the latter function more
clever by using "diff-tree --quiet", which will compare the
two trees with only a single process invocation.

Here are the resulting numbers from p7000:

  Test                  348d4f2^          348d4f2                  HEAD^                    HEAD
  ----------------------------------------------------------------------------------------------------------------
  7000.2: noop filter   9.48(4.29+0.41)   3.73(0.22+0.26) -60.7%   4.56(0.27+0.26) -51.9%   3.74(0.25+0.25) -60.5%

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 git-filter-branch.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 5e094ce..d1879e3 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -44,7 +44,7 @@ skip_commit()
 # it will skip commits that leave the tree untouched, commit the other.
 git_commit_non_empty_tree()
 {
-	if test $# = 3 && test "$1" = $(git rev-parse "$3^{tree}"); then
+	if test $# = 3 && git diff-tree --quiet "$1" "$3"; then
 		map "$3"
 	else
 		git commit-tree "$@"
@@ -404,7 +404,7 @@ while read commit parents; do
 	then
 		tree=$(git write-tree)
 	else
-		tree=$(git rev-parse "$commit^{tree}")
+		tree="$commit^{tree}"
 	fi
 	workdir=$workdir @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
 		"$tree" $parentstr < ../message > ../map/$commit ||
-- 
2.7.0.248.g5eafd77

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]