Hi all (and I guess in particular Elijah), On Fri, 19 Aug 2022, Johannes Schindelin via GitGitGadget wrote: > diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh > index 35f94957fce..bc580a242ac 100755 > --- a/t/t4069-remerge-diff.sh > +++ b/t/t4069-remerge-diff.sh > @@ -131,11 +131,12 @@ test_expect_success 'setup non-content conflicts' ' > test_expect_success 'remerge-diff with non-content conflicts' ' > git log -1 --oneline resolution >tmp && > cat <<-EOF >>tmp && > + diff --git a/file_or_directory b/file_or_directory > + remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead. > diff --git a/file_or_directory~HASH (side1) b/wanted_content > similarity index 100% > rename from file_or_directory~HASH (side1) > rename to wanted_content > - remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead. > diff --git a/letters b/letters > remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2). > diff --git a/letters_side2 b/letters_side2 I would have liked to avoid touching this file altogether, of course, but when I investigated, I came to the conclusion that while this patch still does not produce the best possible outcome for remerge diffs, it does improve upon the current situation. The thing is, when mentioning that we had to rename `file_or_directory` to `file_or_directory~HASH (side1)` to be able to write the file because a directory of the same name already was in the way, I would actually have expected this to come under the diff header diff --git a/file_or_directory b/file_or_directory~HASH (side1) Previously, it was under the header diff --git a/file_or_directory~HASH (side1) b/wanted_content and with this patch it is under diff --git a/file_or_directory b/file_or_directory which is still not ideal: It mentions only the original file name, not the munged one. When I looked into the implementation details I figured out that the information is not quite at our fingertips. In [`create_filepairs_for_header_only_notifications()`][create_filepairs], we have access to the primary filename (in the above example, `file_or_directory`) as the key of the entry in the strmap `o->additional_path_headers`, which is [populated from `path_messages`][additional-headers], which in turn [points to the `conflicts` strmap][path_messages]. Note: this code has changed between v2.37.2 and the current main branch, in v2.37.2 the information is still copied from `conflicts` to `path_messages`. So basically, when we generate that diff header, we know the primary path (great!) and we have access to a strmap that points to a string list of conflict messages. So from where do we get that munged path name? The string list of conflict messages [has `util` pointers to `struct logical_conflict_info` data][conflicts-strmap], which does contain the list of involved paths. This should be enough, right? But: - `struct logical_conflict_info` is declared locally in `merge-ort.c`, we do not have access to that information in `diff.c` (nor would we really _want_ to let that `merge-ort` implementation detail spill over into `diff.c`), and even if we _could_ declare it in `merge-ort.h`, there is still this problem: - The path list in `struct logical_conflict_info` is arbitrary-sized, i.e. we could even have _three_ paths in there, and that does not fit into a standard diff header. - Additionally, we generate _one_ diff header for _all_ of the conflict messages listed for a particular primary path in `additional_path_headers`. But each of these conflict messages relates to its very own list of non-primary paths. For example, one conflict message could talk about a file/directory conflict, another one could talk about conflicting renames to two different filenames (in a recursive merge, both is possible in the same merge). It is simply impossible to combine all of those conflict messages under a single diff header that matches all of the involved paths. - Even aside from that, the remerge diff code seems to have a usability wart (or even bug) where restricting the diff via a pathspec that matches a secondary path but not the primary path of a conflict (think: renames) would _not_ show the conflict message to the user, it would be shown only if the pathspec also includes the primary path. In general, I found it hard to think of a _really_ ideal design where to present that remerge conflict information. After all, in a remerge diff, we do not even write out any files, therefore that conflict message suggesting that we renamed the file to a munged name is somewhat misleading. Sure, we should mention the conflict between the file and the directory, but since nothing was written to disk, there is not actually any _need_ to mention a munged filename at all. And that goes even for the part of the remerge diff shown in the patch quoted above, where it talks about `file_or_directory~HASH (side1)` being renamed... That file never had that filename... So I am not quite sure where we want the remerge design to go from here. In any case, this remerge aspect is safely outside the scope of this here patch series, which means that this here patch series should not be concerned with it. In this email, I just wanted to mention _why_ I did not include any patch to address the remerge issues any further. Ciao, Dscho [create_filepairs]: https://github.com/git/git/blob/v2.37.2/diff.c#L6337-L6383 [additional-headers]: https://github.com/git/git/blob/v2.37.2/log-tree.c#L992 [path_messages]: https://github.com/git/git/blob/795ea8776be/merge-ort.c#L4846 [conflicts-strmap]: https://github.com/git/git/blob/795ea8776be/merge-ort.c#L351-L360