I'm worried that my attempt to extract add/add from the rest of the discussion with rename/add and rename/rename resulted in a false sense of simplification. Trying to handle all the edge and corner cases and remain consistent sometimes gets a little hairy. Anyway... On Mon, Mar 12, 2018 at 2:35 PM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote: > Elijah Newren wrote: >> On Mon, Mar 12, 2018 at 11:47 AM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote: > > Sorry for the lack of clarity. My understanding was that the proposed > behavior was to write two files: > > ${path}~HEAD > ${path}~MERGE (Just to be clear: The proposed behavior was to do that only when the colliding files were dissimilar, and otherwise two-way merging.) > My proposal is instead to write one file: > > ${path} > > with the content that would have gone to ${path}~HEAD. This is what > already happens when trying to merge binary files. Thanks for clarifying. I feel the analogy to merging binary files breaks down here in more than one way: 1) Merging binary files is more complex than you suggest here. In particular, when creating a virtual merge base, the resolution chosen is not the version of the file from HEAD, but the version of the file from the merge base. Nasty problems would result otherwise for the recursive case. If we tried to match how merging binary files behaved, we run into the problem that the colliding file conflict case has no common version of the file from a merge base. So the same strategy just doesn't apply. The closest would be outright deleting both versions of the file for the virtual merge base and punting to the outer merge to deal with it. That might be okay, but seems really odd to me. I feel like it'd unnecessarily increase the number of conflicts users will see, though maybe it wouldn't be horrible if this was only done when the files were considered dissimilar anyway. 2) merging binary files only has 3 versions of the file to store at a single $path in the index, which fit naturally in stages 1-3; rename/add and rename/rename(2to1) have 4 and 6, respectively. Having three versions of a file from a 3-way merge makes fairly intuitive sense. Have 4 or 6 versions of a file at a path from a 3-way merge is inherently more complex. I think that just using the version from HEAD risks users resolving things incorrectly much more likely than in the case of a binary merge. > [...] >>> Can you add something more about the motivation to the commit message? >>> E.g. is this about performance, interaction with some tools, to >>> support some particular workflow, etc? >> >> To be honest, I'm a little unsure how without even more excessive and >> repetitive wording across commits. > > Simplest way IMHO is to just put the rationale in patch 5/5. :) In > other words, explain the rationale for the end-user facing change in the > same patch that changes the end-user facing behavior. Patches 3, 4, and 5 all change end-user facing behavior -- one patch per conflict type to make the three types behave the same (the first two patches add more testcases, and write a common function all three types can use). The rationale for the sets of changes is largely the same, and is somewhat lengthy. Should I repeat the rationale in full in all three places? >> Let me attempt here, and maybe you >> can suggest how to change my commit messages? >> >> * When files are wildly dissimilar -- as you mentioned -- it'd be >> easier for users to resolve conflicts if we wrote files out to >> separate paths instead of two-way merging them. > > Today what we do (in both the wildly-dissimilar case and the > less-dissimilar case) is write one proposed resolution to the worktree > and put the competing versions in the index. Tools like "git > mergetool" are then able to pull the competing versions out of the > index to allow showing them at the same time. > > My bias is that I've used VCSes before that wrote multiple competing > files to the worktree and I have been happier with my experience > resolving conflicts in git. E.g. at any step I can run a build to try > out the current proposed resolution, and there's less of a chance of > accidentally commiting a ~HEAD file. > > [...] >> There are three types of conflicts representing two (possibly >> unrelated) files colliding at the same path: add/add, rename/add, and >> rename/rename(2to1). add/add does the two-way merge of the colliding >> files, and the other two conflict types write the two colliding files >> out to separate paths. > > Interesting. I would be tempted to resolve this inconsistency the > other way: by doing a half-hearted two-way merge (e.g. by picking one > of the two versions of the colliding file) and marking the path as > conflicted in the index. That way it's more similar to edit/edit, > too. Your proposal is underspecified; there are more complicated cases where your wording could have different meanings. I also had a backup plan was to propose making rename/add and rename/rename(2to1) behave more like add/add (instead of making the behavior for all three a blend of their current behaviors), but the problem is I have two backup proposals to choose between. And it sounds like you're adding a third, which follows your ~HEAD strategy from above. Let me see if I can explain why your proposal is underspecified: For rename/rename(2to1) we have up to 3 sets of conflicts. Let's say we renamed A->C in HEAD and B->C on the other side. We first need to do a three-way content merge on the first C (with the other two versions of A), then we need to do a three-way content merge on the second C (with the other two versions of B). This gives us two different C's, both with possible conflict markers, that want to be stored at the same pathname. Let me call these two versions of C, C1 and C2. C1 and C2 may or may not be similar. Resolving the fact that both want to be stored at C is what gives us potentially a third set of conflicts. What's your resolution strategy here, for each combination of cases (has conflict markers or not; is similar to the other C or not; working on a virtual merge base or not)? For reference there are two possible backup strategies I could propose here: Backup Proposal #1: act stricly like add/add: Do a two-way merge of C1 and C2. If we end up with nested conflict markers, oh well. Good luck, user. Backup Proposal #2: act like add/add, but only if C1 and C2 don't have conflict markers. If C1 or C2 have conflict markers, behave as currently -- record C1 and C2 to different paths. If C1 and C2 do not have conflict markers, two-way merge them and record in the worktree at C. My question for your counter-proposal: Do you record C1 or C2 as C? Or do you record the version of C from HEAD as C? And what do you pick when creating a virtual merge base? Problems I see regardless of the choice in your counter-proposal: * If you choose C from HEAD, you are ignoring 3 other versions of "C" (as opposed to just one other version when merging a binary file); will the user recognize that and know how to look at all four versions? * If you choose C1 or C2, the user will see conflict markers; will the user recognize that this is only a _subset_ of the conflicts, and that there's a lot of content that was completely ignored? * There is no "merge base of C1 and C2" to use for the virtual merge base. What will you use?