Re: Comparing rebase --am with --interactive via p3400

Alban Gruin <alban.gruin@xxxxxxxxx> · Fri, 27 Dec 2019 22:11:36 +0100

Hi Johannes & Elijah,

Le 01/02/2019 à 07:04, Johannes Schindelin a écrit :
> Hi Elijah,
> 
> as discussed at the Contributors' Summit, I ran p3400 as-is (i.e. with the
> --am backend) and then with --keep-empty to force the interactive backend
> to be used. Here are the best of 10, on my relatively powerful Windows 10
> laptop, with current `master`.
> 
> With regular rebase --am:
> 
> 3400.2: rebase on top of a lot of unrelated changes             5.32(0.06+0.15)
> 3400.4: rebase a lot of unrelated changes without split-index   33.08(0.04+0.18)
> 3400.6: rebase a lot of unrelated changes with split-index      30.29(0.03+0.18)
> 
> with --keep-empty to force the interactive backend:
> 
> 3400.2: rebase on top of a lot of unrelated changes             3.92(0.03+0.18)
> 3400.4: rebase a lot of unrelated changes without split-index   33.92(0.03+0.22)
> 3400.6: rebase a lot of unrelated changes with split-index      38.82(0.03+0.16)
> 
> I then changed it to -m to test the current scripted version, trying to
> let it run overnight, but my laptop eventually went to sleep and the tests
> were not even done. I'll let them continue and report back.
> 
> My conclusion after seeing these numbers is: the interactive rebase is
> really close to the performance of the --am backend. So to me, it makes a
> total lot of sense to switch --merge over to it, and to make --merge the
> default. We still should investigate why the split-index performance is so
> significantly worse, though.
> 
> Ciao,
> Dscho
> 

I investigated a bit on this.  From a quick glance at a callgrind trace,
I can see that ce_write_entry() is called 20 601[1] times with `git am',
but 739 802 times with the sequencer when the split-index is enabled.

For reference, here are the timings, measured on my Linux machine, on a
tmpfs, with git.git as the repo:

`rebase --am':
> 3400.2: rebase on top of a lot of unrelated changes             0.29(0.24+0.03)
> 3400.4: rebase a lot of unrelated changes without split-index   6.77(6.51+0.22)
> 3400.6: rebase a lot of unrelated changes with split-index      4.43(4.29+0.13)
`rebase --quiet':
> 3400.2: rebase on top of a lot of unrelated changes             0.24(0.21+0.02)
> 3400.4: rebase a lot of unrelated changes without split-index   5.60(5.32+0.27)
> 3400.6: rebase a lot of unrelated changes with split-index      5.67(5.40+0.26)

This comes from two things:

1. There is not enough shared entries in the index with the sequencer.

do_write_index() is called only by do_write_locked_index() with `--am',
but is also called by write_shared_index() with the sequencer once for
every other commit.  As the latter is only called by
write_locked_index(), which means that too_many_not_shared_entries()
returns true for the sequencer, but never for `--am'.

Removing the call to discard_index() in do_pick_commit() (as in the
first attached patch) solve this particular issue, but this would
require a more thorough analysis to see if it is actually safe to do.

After this, ce_write() is still called much more by the sequencer.

Here are the results of `rebase --quiet' without discarding the index:

> 3400.2: rebase on top of a lot of unrelated changes             0.23(0.19+0.04)
> 3400.4: rebase a lot of unrelated changes without split-index   5.14(4.95+0.18)
> 3400.6: rebase a lot of unrelated changes with split-index      5.02(4.87+0.15)
The performance of the rebase is better in the two cases.


2. The base index is dropped by unpack_trees_start() and unpack_trees().

Now, write_shared_index() is no longer called and write_locked_index()
is less expensive than before according to callgrind.  But
ce_write_entry() is still called 749 302 times (which is even more than
before.)

The only place where ce_write_entry() is called is in a loop in
do_write_index().  The number of iterations is dictated by the size of
the cache, and there is a trace2 probe dumping this value.

For `--am', the value goes like this: 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4,
4, 4, 5, 5, 5, 5, … up until 101.

For the sequencer, it goes like this: 1, 1, 3697, 3697, 3698, 3698,
3699, 3699, … up until 3796.

The size of the cache is set in prepare_to_write_split_index().  It
grows if a cache entry has no index (most of them should have one by
now), or if the split index has no base index (with `--am', the split
index always has a base.)  This comes from unpack_trees_start() -- it
creates a new index, and unpack_trees() does not carry the base index,
hence the size of the cache.

The second attached patch (which is broken for the non-interactive
rebase case) demonstrates what we could expect for the split-index case
if we fix this:

> 3400.2: rebase on top of a lot of unrelated changes             0.24(0.21+0.03)
> 3400.4: rebase a lot of unrelated changes without split-index   5.81(5.62+0.17)
> 3400.6: rebase a lot of unrelated changes with split-index      4.76(4.54+0.20)
So, for everything related to the index, I think that’s it.

[1] Numbers may vary, but they should remain in the same order of magnitude.

Cheers,
Alban

diff --git a/sequencer.c b/sequencer.c
index 1bee26ebd5..2831abd0fa 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1863,7 +1863,6 @@ static int do_pick_commit(struct repository *r,
 				       NULL, 0))
 			return error_dirty_index(r, opts);
 	}
-	discard_index(r->index);
 
 	if (!commit->parents)
 		parent = NULL;
diff --git a/merge-recursive.c b/merge-recursive.c
index 11869ad81c..47f67079f3 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -421,7 +421,7 @@ static int unpack_trees_start(struct merge_options *opt,
 {
 	int rc;
 	struct tree_desc t[3];
-	struct index_state tmp_index = { NULL };
+	/* struct index_state tmp_index = { NULL }; */
 
 	memset(&opt->priv->unpack_opts, 0, sizeof(opt->priv->unpack_opts));
 	if (opt->priv->call_depth)
@@ -432,7 +432,7 @@ static int unpack_trees_start(struct merge_options *opt,
 	opt->priv->unpack_opts.head_idx = 2;
 	opt->priv->unpack_opts.fn = threeway_merge;
 	opt->priv->unpack_opts.src_index = opt->repo->index;
-	opt->priv->unpack_opts.dst_index = &tmp_index;
+	opt->priv->unpack_opts.dst_index = opt->repo->index;
 	opt->priv->unpack_opts.aggressive = !merge_detect_rename(opt);
 	setup_unpack_trees_porcelain(&opt->priv->unpack_opts, "merge");
 
@@ -449,8 +449,8 @@ static int unpack_trees_start(struct merge_options *opt,
 	 * saved copy.  (verify_uptodate() checks src_index, and the original
 	 * index is the one that had the necessary modification timestamps.)
 	 */
-	opt->priv->orig_index = *opt->repo->index;
-	*opt->repo->index = tmp_index;
+	/* opt->priv->orig_index = *opt->repo->index; */
+	/* *opt->repo->index = tmp_index; */
 	opt->priv->unpack_opts.src_index = &opt->priv->orig_index;
 
 	return rc;