Re: Logical bug during MERGE or REBASE

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Sat, 03 Jul 2021 13:03:50 +0200

On Fri, Jul 02 2021, skottkuk@xxxxx wrote:

> Hello.
>
> I got a strange result in the process of "merge" and/or "rebase".

Atharva already replied to most of this, just adding on this point:

> [...]
> But as for me, it would be logical to consider the construction inside
> {} as something whole, and not just put all the changes into one heap
> with notification what all OK, no conflicts.

Git in general is not aware that your programming language considers {}
to be special. We don't try to do language detection, or to semantically
parse the program.

It's a general merge driver on text lines that works the same whether
you have a language like C# that uses {} braces, or a language like
Emacs Lisp which does not.

There are particular common cases where this logic goes "wrong". I've
run into it the most with repetitive declarations like:

    {
        {
            description => "some thingy",
            callback    => function { foo },
            strict      => 1,
            warn        => 1,
        },
        [... lots of these omitted ... ]
        {
            description => "other thingy",
            callback    => function { bar },
            strict      => 1,
            warn        => 1,
        },
    },

I didn't bother to check this specific example, but in cases *like that*
the merge driver will often append "duplicates" when two branches added
the same "other thingy". The boilerplate at the end (or beginning,
depending) is repetitive, so a duplication becomes indistinguishable
from an addition for a naïve merge driver.
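
As a rough illustration of one way this can happen, here is a toy
line-based three-way merge in Python. It is not Git's actual algorithm:
it only handles pure insertions, and the names entry(), insertions() and
naive_merge() are made up for this sketch. One branch adds "other
thingy" at the top of the list and the other adds the same block at the
bottom. The two hunks never overlap, so both get applied and the entry
ends up in the result twice.

```python
import difflib


def entry(description):
    """One repetitive declaration block, as a list of lines (hypothetical data)."""
    return [
        "{",
        f'    description => "{description}",',
        "    strict      => 1,",
        "},",
    ]


def insertions(base, side):
    """Map a base index to the lines that `side` inserted before that index."""
    found = {}
    matcher = difflib.SequenceMatcher(a=base, b=side, autojunk=False)
    for tag, i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            found[i1] = side[j1:j2]
        elif tag != "equal":
            raise ValueError("this sketch only handles pure insertions")
    return found


def naive_merge(base, ours, theirs):
    """Apply both sides' insertions to base; lines are compared as plain text."""
    ins_ours = insertions(base, ours)
    ins_theirs = insertions(base, theirs)
    merged = []
    for i in range(len(base) + 1):
        a, b = ins_ours.get(i), ins_theirs.get(i)
        if a and b and a != b:
            raise ValueError(f"conflict at base line {i}")
        # The same insertion at the same position is taken only once.
        merged.extend(a or b or [])
        if i < len(base):
            merged.append(base[i])
    return merged


base = entry("some thingy") + entry("last thingy")
ours = entry("other thingy") + base      # added at the beginning
theirs = base + entry("other thingy")    # same block, added at the end

merged = naive_merge(base, ours, theirs)
duplicates = sum(1 for line in merged if "other thingy" in line)
print("\n".join(merged))
print("copies of 'other thingy':", duplicates)
```

Nothing in the text tells the merge that the two blocks mean the same
thing; it sees two unrelated insertions of some lines.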

You can define your own merge driver that's aware of your language, but
I think this is probably too complex, and a Bad Idea in general.

Custom merge drivers are very useful for e.g. the git-annex case, which
ends up merging really simple "log" files. Merges there are always
equivalent to basically a "sort -u", i.e. keep all lines added, remove
duplicates.
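
A minimal sketch of such a "sort -u" style driver in Python might look
like the following. This is not git-annex's implementation; the script
name, the "sortu" driver name and the function names are assumptions for
illustration. It relies on the documented custom merge driver contract:
Git substitutes %O, %A and %B with the ancestor, current and other
versions, and expects the result to be written to the %A file with a
zero exit status on success.

```python
import sys


def union_merge_lines(ours, theirs):
    """Return every distinct line from both sides, sorted (like `sort -u`)."""
    return sorted(set(ours) | set(theirs))


def run_driver(current_path, other_path):
    """Merge other_path into current_path in place and return an exit status."""
    with open(current_path, encoding="utf-8") as f:
        ours = f.read().splitlines()
    with open(other_path, encoding="utf-8") as f:
        theirs = f.read().splitlines()
    merged = union_merge_lines(ours, theirs)
    with open(current_path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in merged)
    return 0  # zero tells Git the merge was clean


# Hypothetical wiring (names are assumptions):
#   .git/config:      [merge "sortu"]
#                         driver = python3 sortu_merge.py %O %A %B
#   .gitattributes:   *.log merge=sortu
if __name__ == "__main__" and len(sys.argv) == 4:
    _ancestor, current, other = sys.argv[1:]
    sys.exit(run_driver(current, other))
```

The ancestor is ignored on purpose: with "keep all lines added" a line
deleted on one side comes back if the other side still has it, which is
fine for append-only logs and wrong for almost anything else.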

But for a programming language a "smart merge" is, I'd like to submit,
simply an impossible task. Even if you had perfect AI you couldn't do
it, even if I had a clone of myself from yesterday we probably couldn't
agree on how to solve all merges.

That's because once you get past the simple cases a merge resolution is
something that requires judgement calls from the programmer. E.g. I
worked on a topic branch, and now I've got a conflict because someone
changed the function signature. I can either do the bare minimum and use
some compatibility interface today, or convert all my work to the "new
API" and not have to convert from the legacy API in the future.

Either one would be a valid resolution, which the perfect AI, or even my
clone from yesterday might do differently.

But most importantly, having a textual conflict in a program when you
merge/rebase is almost always the trivial case. Having a semantic
conflict is something you always need to check for.

Git (or merge tools in general) can't help you with that, because your
"conflict" is a conflict between the expectations of your topic branch,
and whether or not they hold given whatever's happened on an advancing
upstream.
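
To make that concrete, here is a small made-up Python example of a merge
result that is textually clean but semantically broken. The functions
greet() and welcome_banner() are hypothetical: upstream added a required
parameter to greet(), while the topic branch added a new caller
elsewhere in the file that still uses the old signature. The two changes
touch different lines, so a line-based merge would report no conflict.

```python
# The text a clean merge would produce: no conflict markers anywhere.
MERGED_SOURCE = '''
def greet(name, greeting):      # upstream: added a required "greeting"
    return f"{greeting}, {name}!"

def welcome_banner():           # topic branch: written against greet(name)
    return greet("world")
'''


def check_merged(source):
    """Load the merged code and exercise the topic branch's new function."""
    namespace = {}
    exec(source, namespace)  # loading succeeds, the text itself is valid
    try:
        namespace["welcome_banner"]()
    except TypeError as exc:
        return f"semantic conflict: {exc}"
    return "ok"


print(check_merged(MERGED_SOURCE))
```

Only running the code (or reading it with the upstream change in mind)
finds the problem; the merge itself has nothing to complain about.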

So whether you have textual conflicts on merge/rebase from git or not,
your workflow really should be to always assume that you have a semantic
conflict, unless you're already completely familiar with the new code
you're merging into your branch.

I.e. after a merge/rebase look at your patches again to see if they make
sense given what changed on the upstream, compile, run the tests you
have etc.