Re: [PATCH 08/12] merge-ort: provide a merge_get_conflicted_files() helper function

On Fri, Feb 4, 2022 at 3:10 PM Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
>
> Hi Elijah,
>
> On Sat, 29 Jan 2022, Elijah Newren wrote:
>
> > On Sat, Jan 29, 2022 at 12:23 AM Johannes Sixt <j6t@xxxxxxxx> wrote:
> > >
> > > Just a heckling from the peanut gallery...
> > >
> > > Am 29.01.22 um 07:08 schrieb Elijah Newren:
> > > > On Fri, Jan 28, 2022 at 8:55 AM Johannes Schindelin
> > > > <Johannes.Schindelin@xxxxxx> wrote:
> > > >> Meaning: Even if stage 3 is missing from the first conflict and stage 1 is
> > > >> missing from the second conflict, in the output we would see stages 1, 2,
> > > >> 2, 3, i.e. a duplicate stage 2, signifying that we're talking about two
> > > >> different conflicts.
> > > >
> > > > I don't understand why you're fixating on the stage here.  Why would
> > > > you want to group all the stage 2s together, count them up, and then
> > > > determine there are N conflicting files because there are N stage 2's?
> > >
> > > Looks like you are misunderstanding Dscho's point: When you have two
> > > conflicts, the first with stages 1 and 2, the second with stages 2 and
> > > 3, then the 2s occur lumped together when the 4 lines are printed in a
> > > row, and that is the cue to the parser where the new conflict begins.
> > > Dscho did not mean that all N 2s should be listed together.
> >
> > Ah, so...I didn't understand his misunderstanding?  Using stages as a
> > cue to the parser where the new conflict begins is broken; you should
> > instead check for when the filename listed on a line does not match
> > the filename on the previous line.
>
> But that would break down in case of rename/rename conflicts, right?
>
> > In particular, if one conflict has stages 1 and 2, and the next conflict
> > has only stage 3, then looking at stages only might cause you to
> > accidentally lump unrelated conflicts together.
>
> Precisely. That's why I would love to have a way to deviate from the
> output of `ls-files -u`'s format, and have a reliable way to indicate
> stages that belong to the same merge conflict.

Ah, attempting to somehow identify and present logically separate
conflicts?  That could be awesome, but I'm not sure it's technically
possible.  It certainly isn't with today's merge-ort.

Let me ask some questions first...

If I understand you correctly, then in the event of a rename/rename
(i.e. foo->bar & foo->baz), you want foo's, bar's, & baz's stages
all listed together.  Right?  And in some way that you can identify
them as related?

If we do so, how do we mark the beginning and the end of what you call
"the same merge conflict"?  If you say it's always 3 stages (with the
possibility of all-zero modes/oids), then what about the rename/rename
case above modified so that the side that did foo->baz also added a
different 'bar'?  That'd be 4 non-zero modes/oids, all of them
relevant.  Or what if the side that did foo->bar also renamed
something else to 'baz', giving us even more non-zero stages for these
three paths?  Perhaps you consider these different conflicts and want
them listed separately -- if so, where does one conflict end and
another begin, and which stages are parts of which conflict?
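
To make the boundary problem concrete, here is a rough sketch (not
anything merge-ort does) that compares the two parser cues discussed
above: starting a new conflict when the stage number fails to
increase, versus starting one when the path changes.  The function
names and the (stage, path) entries are made up for illustration, and
the stage layout for the rename/rename case is my assumption of what
`ls-files -u` would show.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (stage, path), in `ls-files -u` order


def split_on_stage_reset(entries: List[Entry]) -> List[List[Entry]]:
    """Hypothetical cue: a new conflict starts when the stage does not increase."""
    groups: List[List[Entry]] = []
    for stage, path in entries:
        if groups and stage > groups[-1][-1][0]:
            groups[-1].append((stage, path))
        else:
            groups.append([(stage, path)])
    return groups


def split_on_path_change(entries: List[Entry]) -> List[List[Entry]]:
    """Hypothetical cue: a new conflict starts when the path changes."""
    groups: List[List[Entry]] = []
    for stage, path in entries:
        if groups and groups[-1][-1][1] == path:
            groups[-1].append((stage, path))
        else:
            groups.append([(stage, path)])
    return groups


# One conflict with stages 1 and 2, followed by an unrelated conflict
# that only has stage 3 (illustrative paths).
lumped = [(1, "a"), (2, "a"), (3, "b")]

# Assumed layout for rename/rename (foo->bar & foo->baz) where the side
# that did foo->baz also added a different 'bar': four relevant entries.
rename_rename = [(2, "bar"), (3, "bar"), (3, "baz"), (1, "foo")]

print(len(split_on_stage_reset(lumped)))         # stages alone lump a and b together
print(len(split_on_path_change(lumped)))         # paths keep a and b apart
print(len(split_on_stage_reset(rename_rename)))  # neither cue yields one group of four
print(len(split_on_path_change(rename_rename)))
```

Neither cue recovers "one conflict containing four entries" for the
second input, which is the crux of the questions above.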

If you are attempting to somehow present the stuff that "belongs to
the same merge conflict" are you also trying to identify what kind of
merge conflict it is?  If so, do you want each type of merge conflict
listed?  For example, let's switch from the example above of logically
disjoint paths coming together to result in more than 3 stages, and
instead pick an example with a single logical path with fewer than
three stages.  And now let's say that path has multiple conflicts
associated with it; let's use an example with 3: rename/delete +
modify/delete + directory/file (one side renames foo->bar while
modifying the contents, the other side deletes foo and adds the
directory 'bar/').  In this case, there is a target file 'bar' that
has two non-zero modes/oids in the `ls-files -u` output.  If all three
types of conflicts need to be listed, does each need to be listed with
the two non-zero modes/oids (and perhaps one zero mode/oid), resulting
in six listings for 'bar'?  Or would the duplication be confusing
enough that we instead decide to list some merge conflicts with no
stages associated with them?

Thinking about both sets of questions in the last two paragraphs from
a higher level -- should we focus on and group the higher order stages
by the individual conflicts that happen, or should we group them by
the paths that they happen to (which is what `ls-files -u` happens to
do), or should we not bother grouping them and instead duplicate the
higher order stages for each logical conflict it is part of?

As an alternative to duplicating higher order stages, do we sometimes
decide to "lump" separate conflicts together and treat them as one
conflict?  If so, what are the rules on how we decide to lump
conflicts and when not to?  Is there a bright line boundary?  And can
it be done without sucking in arbitrarily more stages for a single
conflict?


Some testcases that might be useful while considering the above
questions: take a look at the "rad", "rrdd", and "mod6" tests of
t6422.  How many "same merge conflicts" are there for each of those,
and what's the boundary between them?  And can you give the answer in
the form of rules that generically handle all cases, rather than just
answering these three specific cases?


I've thought about this problem long and hard before (in part because
of some conversations I had with Edward Thompson about libgit2 and
merging at Git Merge 2020).  It wasn't at all clear to me that libgit2
had considered anything beyond simple rename cases.  The only rule I
ever figured out that made sense to me was "group the stages by target
filename rather than by logical conflict" (so we get `ls-files -u`
populated) and print a meant-for-human message for each logical
conflict (found in the <Informational Messages> section for
merge-tree), and make NO attempt to connect stages by conflict type.
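
For what it's worth, grouping by target filename is easy for a
consumer to do.  A minimal sketch, assuming the usual
"<mode> <oid> <stage><TAB><path>" line format of `ls-files -u` and
paths that need no quoting (use -z otherwise); the helper name and the
placeholder oids are invented for this example:

```python
from typing import Dict, List, Tuple

Stage = Tuple[int, str, str]  # (stage, mode, oid)


def group_stages_by_path(lines: List[str]) -> Dict[str, List[Stage]]:
    """Group `ls-files -u`-style lines by path, keeping first-seen order."""
    groups: Dict[str, List[Stage]] = {}
    for line in lines:
        meta, path = line.split("\t", 1)    # metadata and path are tab-separated
        mode, oid, stage = meta.split(" ")  # mode, object name, stage number
        groups.setdefault(path, []).append((int(stage), mode, oid))
    return groups


# Placeholder object names; real output carries full hex oids.
OID_A, OID_B, OID_C, OID_D = ("a" * 40, "b" * 40, "c" * 40, "d" * 40)

sample = [
    f"100644 {OID_B} 2\tbar",
    f"100644 {OID_D} 3\tbar",
    f"100644 {OID_C} 3\tbaz",
    f"100644 {OID_A} 1\tfoo",
]

for path, stages in group_stages_by_path(sample).items():
    print(path, [stage for stage, _mode, _oid in stages])
```

This says nothing about which logical conflict each path belongs to;
that information would still only be in the human-readable messages.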

I'm sure that's not what you wanted to hear, and maybe it doesn't even
play nicely with your design.  But short of ignoring the edge and
corner cases, I don't see how to solve that problem.  If you do just
want to ignore edge and corner cases, then just ignore the
rename/rename case you brought up in the first place and just use
`ls-files -u`-type output as-is within your design.  If you don't want
to ignore edge cases and want something that works with a specific
design that somehow groups conflicted file stages by conflict type,
then we're going to have to dig into all these questions above and do
some big replumbing within merge-ort.


