Re: A better approach to diffing and merging

Karl Hasselström <kha@xxxxxxxxxxx> · Mon, 1 Dec 2008 10:54:49 +0100

On 2008-11-29 17:56:44 -0800, Brian Dessent wrote:

> Ian Clarke wrote:
>
> > Provide the merge algorithm with the grammar of the programming
> > language, perhaps in the form of a Bison grammar file, or some
> > other standardized way to represent a grammar.
>
> There's a huge flaw in that approach for C/C++: in order to parse
> C/C++ you have to first preprocess it -- consider the twisty mazes
> that #ifdef/#else/#endif can create. But in order to preprocess
> source code you need a whole heap of extra information that is not
> in the repository (or if it is, cannot be automatically extracted.)

But it's probably not necessary to parse the input files exactly. All
you have to do is parse them well enough that the diff of the parse
trees is interesting.
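
To make "well enough" a bit more concrete, here is a rough Python
sketch, not a real design. It assumes a flat token stream is an
acceptable stand-in for a parse tree, and it leaves preprocessor lines
unevaluated as single opaque tokens. The names TOKEN_RE, approx_tokens
and token_diff are made up for illustration.

```python
import difflib
import re

# Hypothetical token pattern: a preprocessor line is kept as one opaque
# token rather than being evaluated; everything else is split roughly.
TOKEN_RE = re.compile(r"""
    ^[ \t]*\#[^\n]*     # a whole preprocessor line, left unevaluated
  | [A-Za-z_]\w*        # identifiers and keywords
  | \d+                 # integer literals (an approximation)
  | \S                  # any other single non-space character
""", re.VERBOSE | re.MULTILINE)


def approx_tokens(source):
    """Split source text into approximate tokens, ignoring whitespace."""
    return [tok.strip() for tok in TOKEN_RE.findall(source)]


def token_diff(old, new):
    """Return the non-equal opcodes between the two token streams."""
    a, b = approx_tokens(old), approx_tokens(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
```

With this, a change that only touches whitespace produces an empty
diff, and the #ifdef maze never has to be resolved because nothing is
preprocessed.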

And in practice, you'd probably also generate the "normal" diff, and
then fall back to that one if the parse tree diff was worse.
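
The fallback could look something like the Python sketch below. Two
things in it are assumptions of mine: "worse" is measured as the total
number of changed characters, and word_diff is only a stand-in for a
real parse tree diff. All function names are hypothetical.

```python
import difflib


def line_diff(old, new):
    """The "normal" diff: changed lines as ('-' or '+', text) pairs."""
    out = []
    for line in difflib.ndiff(old.splitlines(), new.splitlines()):
        if line.startswith(("- ", "+ ")):
            out.append((line[0], line[2:]))
    return out


def word_diff(old, new):
    """Stand-in for a parse tree diff: compares whitespace-split words."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            out.extend(("-", word) for word in a[i1:i2])
        if tag in ("replace", "insert"):
            out.extend(("+", word) for word in b[j1:j2])
    return out


def size(diff):
    """Assumed badness measure: total characters of changed text."""
    return sum(len(text) for _, text in diff)


def best_diff(old, new, structural=word_diff):
    """Prefer the structural diff, but fall back to the line diff if the
    structural one fails outright or is not strictly smaller."""
    normal = line_diff(old, new)
    try:
        candidate = structural(old, new)
    except Exception:
        return "line", normal
    if size(candidate) < size(normal):
        return "structural", candidate
    return "line", normal
```

Any other measure of "worse" would slot in by replacing size(); the
point is only that the line diff is always computed and always
available as the safe answer.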

> The idea may have value for languages that are easy to parse and do
> not have all this preprocessor cruft, but I just don't see how it
> would be able to provide anything useful for non-trivial changes to
> real world C/C++, which require human eyes to decipher.

I think it could work. But it would take quite a few heuristics to
get the "approximate" parsing right, so I'm pretty sure there's no
way to find out without actually trying to build the thing.

-- 
Karl Hasselström, kha@xxxxxxxxxxx
      www.treskal.com/kalle