Hi Elijah, On Sat, 9 Jun 2018, Elijah Newren wrote: > On Sat, Jun 9, 2018 at 3:11 PM, Johannes Schindelin > <Johannes.Schindelin@xxxxxx> wrote: > > On Thu, 7 Jun 2018, Elijah Newren wrote: > > > >> am-based rebases suffer from an reduced ability to detect directory > >> renames upstream, which is fundamental to the fact that it throws > >> away information about the history: in particular, it dispenses with > >> the original commits involved by turning them into patches, and > >> without the knowledge of the original commits we cannot determine a > >> proper merge base. > >> > >> The am-based rebase will proceed by first trying git-apply, and, only > >> if it fails, will try to reconstruct a provisional merge base using > >> build_fake_ancestor(). This fake ancestor will ONLY contain the > >> files modified in the patch; without the full list of files in the > >> real merge base, renames of any missing files cannot be detected. > >> Directory rename detection works by looking at individual file > >> renames and deducing when a full directory has been renamed. > >> > >> Trying to modify build_fake_ancestor() to instead create a merge base > >> that includes common file information by looking for a commit that > >> contained all the same blobs as the pre-image of the patch would > >> require a very expensive search through history, may find the wrong > >> commit, and outside of rebase may not find any commit that matches. > >> > >> But we had all the relevant information to start. So, instead of > >> attempting to fix am which just doesn't have the relevant > >> information, instead note its strength and weaknesses, and change the > >> default rebase machinery to interactive since it does not suffer from > >> the same problems. > > > > I'll let Eric comment on the grammar, and I'll comment on the idea > > behind this commit instead. > > Going to dump the hard job on Eric, eh? ;-) Just trying to learn how to delegate. :0) > > IMHO `git rebase -i` is still too slow to be a true replacement for > > `git rebase --am` for the cases where it serves the user well. Maybe > > we should work on making `rebase -i` faster, first? > > That sounds fair. > > > I imagine, for example, that it might make *tons* of sense to avoid > > writing out the index and worktree files all the time. That was > > necessary in the shell script version because if the ridiculous > > limitations we subjected ourselves to, such as: no object-oriented > > state worth mentioning, only string-based processing, etc. But we > > could now start to do everything in memory (*maybe* write out the new > > blob/tree/commit objects immediately, but maybe not) until the time > > when we either have succeeded in the rebase, or when there was a > > problem and we have to exit with an error. And only then write the > > files and the index. > > Hmm...are you still planning on using cherry-pick (internally rather > than forking, of course)? The sequencer side-steps the cherry-pick command by calling merge_recursive() directly. But yes, this is what the interactive rebase will always use. > Because cherry-pick uses the merge-recursive machinery, and the > merge-recursive machinery doesn't have a nice way of avoiding writing > to the working tree or index. Exactly. I think it could be taught to perform its magic in memory. I am not saying that it is necessarily easy to teach merge-recursive.c this trick, but it should be possible. > Fixing that is on my radar; see the first block of > https://public-inbox.org/git/CABPp-BG2fZHm3s-yrzxyGj3Eh+O7_LHLz5pgstHhG2drigSyRA@xxxxxxxxxxxxxx/ Great! > (reading up until "At this point, I'd rather just fix the design flaw > rather than complicate the code further.") > > However, also covered in my plans is a few things to speed up the > merge-recursive machinery, which should provide some other performance > benefits for interactive rebases. I am looking forward to see this materialize. > > In any case, I think that the rather noticeable change of the default > > would probably necessitate a deprecation phase. > > Why is it a "rather noticable change of the default"? I was really referring to speed. But I have to admit that I do not have any current numbers. Another issue just hit me, though: rebase --am does not need to look at as many Git objects as rebase --merge or rebase -i. Therefore, GVFS users will still want to use --am wherever possible, to avoid "hydrating" many objects during their rebase. > If we popped up the editor and asked the user to edit the list of > options, I'd agree, or if folks thought that it was significantly slower > by a big enough margin (though you already suggested waiting and making > sure we don't do that). What else remains that qualifies? > > (Okay, the default behavior to just skip empty patches rather than halt > the rebase and ask the user to advise is different, but we could fix > that up too. Is there anything else?) That is indeed a change in behavior that is rather easy to address. As to speed: that might be harder. But then, the performance might already be good enough. I do not have numbers (nor the time to generate them) to back up my hunch that --am is substantially faster than --merge. Ciao, Dscho