On Thu, 12 Apr 2007, Junio C Hamano wrote: > Andy Parkins <andyparkins@xxxxxxxxx> writes: > > > > When I was rebasing, some strange things happened (without any conflict > > warnings): > > A patch that can ambiguously apply to multiple places is indeed > a problem, and in such situations merge based rebase is probably > safer as it can take advantage of the whole file as the context. Indeed. It's one of the reasons I think the default "patch" behaviour of allowing some fuzz in the patch is totally broken, and why "git apply" defaults to a much stricter "no context differences allowed" behaviour. That still doesn't entirely get *rid* of the problem, but it at least makes it slightly less common. You can still get a patch that applies cleanly if you have certain multi-line patterns that are very common, and that end up causing patch to find the wrong place to apply a patch, and still make it look right enough. This is just one reason why patch-based systems are totally and utterly broken, and why I detested the original cogito patch-based merging. The three-way merge behaviour ("git rebase --merge") is a lot safer, but it's obviously more expensive too. Also, the reason a lot of people like patch-based setups is not only is it efficient, but it turns out too many people prefer "clean merges" even *despite* the dangers. See the recent patches that were floating around on "stgit" that actually make applying patches default to the same *broken*defaults* as standard "patch" does (see the emails with subject line "Pass -C1 to git-apply in StGIT ..." for more on that). So a lot of people prefer the less strict patch application rules, because *most* of the time it actually does the right thing. Never mind that it makes a fundamental problem with patches even worse. Me, I'd prefer to have the patch fail early. But even failing early does not _guarantee_ that it applies in the right place. > But it brings up another interesting point. The ambiguous patch > is a problem even more so outside the context of rebasing, for > another reason. When rebasing, you are dealing with your own > changes and you know what and how you want each of them to > change the tree state, as opposed to applying somebody else's > patch outside the context of rebasing. > > When we only have the patch text (i.e. applymbox), there is no > "merge to use the whole file as the context" fallback. I wonder > if this is a common enough problem that we would want to make it > safer somehow... We could make "git apply" also refuse to move more than a few lines by default (ie not only does the context have to be exact, it has to actually show up where the patch claims it should be!) git-apply still allows arbitrary line offset differences (see "find_offset()"). > I can see two possible improvements. > > - On the diff generation side, we could notice that the hunk > we are going to output can be applied to more than one > location, and automatically widen the context for it. > > This is only a half-solution, as many patches do not even > come from git. It's not even a solution _within_ git. Since a patch will always apply at the right place if no changes have been done (because we always start looking at the line number where the patch fragment _claims_ it should go), the problem only occurs when independent changes have been done to the target. And those independent changes may obviously be the ones that *introduce* the new location that the patch can (incorrectly) apply to. > - Inside git-apply, apply_one_fragment(), ask find_offset() if > the hunk can match more than one location, and exit with an > error status (still writing out the patch results if it > otherwise applies cleanly) so that the user can manually > inspect and confirm. Yes, that might be a good idea, but is going to be pretty expensive. "find_offset()" wasn't exactly written to be a model of efficiency. See the comment about me not being one of the "smart and beautiful" people. So you'd want to have some smarter method of finding potential places to apply the fragment if you want to do those kinds of things. Like creating hashes of the line contents in order to not have to compare the whole fragment.. And EVEN THEN it wouldn't actually solve the problem. The most common case is simply: - somebody *already* fixed the same bug by a very similar patch (or an identical one), and thus the patch obviously won't apply, since the place it *should* apply to got changed. - so git-apply will look for another place to apply it, and it's quite possible that there is just one such place - even though it's the exact wrong place! So it really does boil down to: patches will sometimes falsely apply at the wrong place. That's just in the fundamental nature of patches. The same way a three-way merge can sometimes generate total crap, even when it merged totally cleanly. It's unusual enough that most of the time it doesn't happen (and I think it happens less with three-way merges than with patches), and hopefully most of the time the error will be obvious enough that it gets noticed quickly (ie the badly patched sources simply won't build any more!). But there is no practical way to guarantee it cannot happen. The only way to guarantee that patches apply correctly is to make sure the source file matches *exactly*. But that kind of defeats the whole point of sending patches around in the first place. Linus - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html