On Thu, Aug 17, 2017 at 12:22:19PM -0700, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > > > Of course this is a pretty ridiculous input in the first place. In > > theory it _could_ be figured out, but overlapping hunks like this are > > always going to cause problems (in this case, context from the second > > hunk became a removal, and the second hunk no longer applies). > > To be honest, I do not think it could be figured out by "git apply", > but "git add -p" _should_ be able to. I've said already this > earlier in the discussion: > > https://public-inbox.org/git/7vab5cn7wr.fsf@xxxxxxxxxxxxxxxxxxxxxxxx/ > > "add -p" knows how the hunk _before_ it gave the user to edit looked > like, and it knows how the hunk after the user edited looks like, so > it may have enough information to figure it out. But the "(e)dit" > feature was done in a way to discard that information and forced an > impossible "guess what the correct shape of the patch would be, out > of this broken patch input" with --recount and --allow-overlap. > > If we want to make "add -p/(e)dit" work in the corner cases and make > it trustworthy, "apply" is a wrong tree to back at. A major part of > the effort to fix would go to "add -p", not to "apply". In theory "add -p" could give all the information it has to git-apply in some form, and it could figure it out. That may sound crazy, but I just wonder if it makes things easier because git-apply already knows quite a lot about how to parse and interpret diffs. Whereas "add -p" is largely stupid and just parroting back lines. But I certainly wouldn't make such a decision until figuring out the actual algorithm, which should then make it obvious where is the best place to implement it. :) So just thinking about this particular situation. We have two hunks after the split: @@ -1, +1,3 @@ +a +b x and @@ -1, +3,2 @@ x +c The root of the issue, I think, is that the hunk header for the second one is bogus. It claims to start at the beginning, but of course if the other hunk is applied first, it doesn't. The right header in that case would be (assuming we do not copy the extra context): @@ -3, +3,2 @@ x +c One point of interest is that without an (e)dit, we get this case right, despite the bogus hunk header. I'm not quite sure why that is. After the failing edit we would end up with: @@ -1, +1,3 @@ +b x @@ -1, +3,2 @@ x +c Clearly the count in the first one is now wrong, but we fix that with --recount. But it seems to me that the second hunk is equally wrong as it would be in the non-edit case. I guess in addition to the "-1" being a "-2", the "+3" should now also be a "+2". But I didn't think that would impact us finding the preimage side. So that's one question I'm puzzled by: why does it work without an edit, but fail with one. My second question is how "add -p" could generate the corrected hunk headers. In the non-edit case, we know that our second hunk's leading context must always match the first hunk's trailing context (since that's how we do a split). So we'd know that the preimage "-pos" field of the second hunk header should be offset by the net sum of the added/changed lines for any previous split hunks. In the original case we had net two added lines in the first hunk. So if it's selected, that's how our "-1" becomes "-3". And after the edit, we should be able to apply the same, I think? We have a single added line, so it becomes "-2". That doesn't require knowing what was in the hunk before the user edited at all. We just need to know which other hunks it's split from, so we know which overlapping hunks we need to take into account when recomputing. "add -p" certainly has that information. I _think_ git-apply also has that information if it bothers to find out. Right now --allow-overlap is just "do not complain of overlap". But it could actively recount the offsets when it detects overlap. I'm assuming here that the edit process isn't changing the hunk headers, introducing new hunks, etc. I feel like all bets are off, then. -Peff