Re: inexplicable failure to merge recursively across cherry-picks

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 10 Oct 2007 08:25:15 -0700 (PDT)

On Wed, 10 Oct 2007, martin f krafft wrote:
> also sprach Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> [2007.10.10.0354 +0100]:
> > Cherry-picking is immaterial. It doesn't matter how the changes
> > come into the tree. It doesn't matter what the history is. The
> > only thing git cares about is the content, and the end result.
> 
> This is the part I over-estimated. I thought that Git would figure
> out that commits 1-3 had been merged into the target and thus apply,
> in sequence, only the commits from the source which had not been
> merged.

Yes, *some* SCMs have tried to do that. In particular, the ones that are 
"patch-based" tend to think that patches are "identical" regardless of 
where they are, and while re-ordering of them is a special event, it's not 
something that changes the fundamental 'ID' of the patch.

For example, I think the darcs "patch algebra" works that way.

It's a really horrible model. Not only doesn't it scale, but it leads to 
various very strange linkages between patches, and it fails the most 
important part: it means that merges get different results just because 
people are doing the same changes two different ways.

> Many thanks (again), Linus! Looking forward to your next content
> manager; you know, the one with artificial intelligence built in!
> You could call it "wit" :)

Well, the git model is really largely the reverse: the system is supposed 
to be as *stupid* as humanly possible, but:

 - make it predictable exactly because it's stupid and doesn't do anything 
   even half-ways smart.

   This is part of the "it doesn't matter *how* you got to a particular 
   state, git will always do the same thing regardless of whether you 
   moved an existing patch around or whether you re-did the changes as 
   (possibly more than one) new and unrelated commits".

 - conflicts aren't bad - they're *good*. Trying to aggressively resolve 
   them automatically when two branches have done slightly different 
   things in the same area is stupid and just results in more problems.

   Instead, git tries to do what I don't think *anybody* else has done: 
   make the conflicts easy to resolve, by allowing you to work with them 
   in your normal working tree, and still giving you a lot of tools to 
   help you see what's going on.
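
A minimal sketch of those two points in Python. This is a toy model, not 
git's actual merge code: the function name merge3 and the idea of a file 
as a dict of named regions are assumptions made up for illustration. It 
shows that the result depends only on three content snapshots (base, ours, 
theirs), and that a region changed differently on both sides is reported 
rather than guessed at.

```python
# Toy three-way merge over named regions of a file.
# NOT git's real algorithm: just enough to show that the result
# depends only on three content snapshots (base, ours, theirs).

def merge3(base, ours, theirs):
    """Return (merged, conflicts) for dicts mapping region name -> text."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:          # both sides agree (including identical changes)
            result = o
        elif o == b:        # only theirs changed this region
            result = t
        elif t == b:        # only ours changed this region
            result = o
        else:               # both changed it differently: leave it to the user
            conflicts.append(key)
            continue
        if result is not None:
            merged[key] = result
    return merged, conflicts


base = {"init": "x = 0", "run": "go()"}
ours = {"init": "x = 1", "run": "go()"}

# Two different histories on the other branch that end in the same content.
one_commit = [{"init": "x = 0", "run": "go(fast=True)"}]
three_commits = [
    {"init": "x = 0", "run": "go(1)"},
    {"init": "x = 0", "run": "go(2)"},
    {"init": "x = 0", "run": "go(fast=True)"},
]

# Only the final snapshot of each history is ever looked at.
result_a = merge3(base, ours, one_commit[-1])
result_b = merge3(base, ours, three_commits[-1])
print(result_a == result_b)

# Both sides touching the same region differently is a conflict.
_, conflicts = merge3(base, ours, {"init": "x = 2", "run": "go()"})
print(conflicts)
```

The intermediate snapshots in three_commits never enter the computation, 
which is the whole point: how the other branch got to its final state is 
invisible to the merge.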

So git doesn't try to avoid conflicts per se: the merge strategies are 
fundamentally pretty simple (rename detection and the whole "recursive 
merge" thing may not be simple code, but the concepts are pretty 
straightforward), and they handle all the really *obvious* cases, but at 
the same time, I feel strongly that anything even half-way subtle should 
not be left to the SCM - the SCM should show it and make it really easy 
for the user to then fix it up.

Side note: even with a totally obvious three-way merge, with absolutely 
zero conflicts even remotely close to each other, you can have the merge 
algorithm generate a good merge that doesn't actually *work*.

For example, it's happened a few times that one branch renames a structure 
member name (and changes all the uses) and another branch adds new code 
that uses the old member name. The end result: the code will *merge* fine, 
and there are zero conflicts in the content, because all the changes were 
totally disjoint, but the end result doesn't actually work or even 
compile!
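
That situation can be sketched with a toy merge in Python. Again this is 
an illustration under stated assumptions, not git's implementation: merge3, 
the region names, and the struct buf example are all hypothetical. One 
branch renames the member len to length and updates the existing use, the 
other adds a new function that still uses len.

```python
import re

def merge3(base, ours, theirs):
    """Toy three-way merge of dicts mapping region name -> text."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t or t == b:
            result = o              # same on both sides, or only ours changed
        elif o == b:
            result = t              # only theirs changed
        else:
            conflicts.append(key)   # both changed the region differently
            continue
        if result is not None:
            merged[key] = result
    return merged, conflicts


base = {
    "struct": "struct buf { int len; };",
    "reader": "int size(struct buf *p) { return p->len; }",
}
# Branch A renames the member and updates every existing use.
rename = {
    "struct": "struct buf { int length; };",
    "reader": "int size(struct buf *p) { return p->length; }",
}
# Branch B adds new code that still uses the old member name.
new_user = dict(base, writer="void clear(struct buf *p) { p->len = 0; }")

merged, conflicts = merge3(base, rename, new_user)
print(conflicts)  # the changes are disjoint, so nothing conflicts

# Crude stand-in for a compiler: which members are used but not declared?
declared = set(re.findall(r"int (\w+);", merged["struct"]))
used = set(re.findall(r"p->(\w+)", " ".join(merged.values())))
print(used - declared)
```

The merge reports no conflicts, yet the merged content still refers to a 
member that no longer exists, so only a build (or here, the crude regex 
check) catches the problem.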

So no merge strategy is ever perfect. The git approach is to be simple and 
predictable, and also to make it easy to fix up (i.e. even if you get the 
above kind of automatic merge problem, if you catch it in compiling, you 
can fix it up, and do a "git commit --amend" to fix up the merge itself 
before you push it out).

			Linus