Re: A better approach to diffing and merging

"Ian Clarke" <ian.clarke@xxxxxxxxx> writes:

> Apologies if this is off-topic, but I recently had an idea for a
> better way to do diffs and merging which I thought may be of interest
> to those on this list.

[...]
> While I'm no merging expert, it seems that most merging algorithms do
> it on a line-by-line basis, treating source code as nothing but a list
> of lines of text.  It got me thinking, what if the merging algorithm
> understood the structure of the source code it is trying to merge?
> 
> So the idea is this:
> 
> Provide the merge algorithm with the grammar of the programming
> language, perhaps in the form of a Bison grammar file, or some other
> standardized way to represent a grammar.
> 
> The merge algorithm then uses this to parse the files to be diffed
> and/or merged into trees, and then the diff and merge are treated as
> operations on these trees.  These operations may include creating,
> deleting, or moving nodes or branches, renaming nodes, etc.  There has
> been quite a bit (pdf) of academic research on this topic, although I
> haven't yet found off-the-shelf code that will do what we need.
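
To make the quoted idea concrete, here is a minimal sketch that is not part of the original proposal. It assumes Python's standard `ast` module can stand in for the grammar-driven parser, and it only compares trees rather than merging them. The helper names `same_structure` and `changed_lines` are hypothetical. It shows a reformatting change that a line diff reports while a tree comparison ignores it.

```python
import ast
import difflib

def same_structure(src_a: str, src_b: str) -> bool:
    """Return True if both sources parse to the same syntax tree.

    ast.dump() omits line and column attributes by default, so
    whitespace and redundant parentheses do not affect the result.
    """
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))

def changed_lines(src_a: str, src_b: str) -> list:
    """Return the added and removed lines of a plain line-by-line diff."""
    delta = difflib.ndiff(src_a.splitlines(), src_b.splitlines())
    return [line for line in delta if line.startswith(("+ ", "- "))]

original = "def f(a, b):\n    return a + b\n"
reformatted = "def f(a,\n      b):\n    return (a + b)\n"
modified = "def f(a, b):\n    return a - b\n"

# Reformatting touches every line, yet the tree is unchanged.
print(len(changed_lines(original, reformatted)) > 0)
print(same_structure(original, reformatted))
# A real change (+ became -) alters the tree.
print(same_structure(original, modified))
```

A real tool would also need to map tree edits back to text and to handle files that do not parse at all, which this sketch does not attempt.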

First, as Brian Dessent said, it would be hard to generate a parse tree
in the presence of compile-time configuration (the preprocessor in
C/C++ is the obvious case, but in principle this applies to programs in
any language: you have to know not only the conditionals but also the
compile options).  And for dynamic languages you would have to take
care of self-modifying programs.

Second, from what I understand we have _good_, established algorithms
for merging sequences (which includes sequences of lines or sequences
of words), and for merging the special kind of trees that represent
directory structure.  I haven't followed the link to the research you
mention, but I think that it is still unproven research, and not
something well established and well tested.
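
As a rough illustration of why sequence algorithms are so reusable, the sketch below runs one matcher over a sequence of lines and over a sequence of words. It uses `difflib.SequenceMatcher` from the Python standard library, which is only a stand-in and not the algorithm git itself uses. It shows the diff half of the problem, not a full merge, and the helper name `edit_ops` is hypothetical.

```python
from difflib import SequenceMatcher

def edit_ops(old, new):
    """Return the non-equal opcodes that turn sequence `old` into `new`."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(tag, old[i1:i2], new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

old_text = "merge lines of text\nkeep this line\n"
new_text = "merge words of text\nkeep this line\n"

# Same function, two granularities: only the way the text is split differs.
line_ops = edit_ops(old_text.splitlines(), new_text.splitlines())
word_ops = edit_ops(old_text.split(), new_text.split())

print(line_ops)
print(word_ops)
```

Nothing here needs to know what language the text is written in, which is the practical advantage of the sequence approach.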

Third, it would require embedding knowledge about various programming
languages (including C, shell, Perl, TeX) and document formats
(including XML, HTML, AsciiDoc) in the version control system...

> Still, it shouldn't be terribly hard to implement.

So, try to provide us with some proof-of-concept patches, then.
-- 
Jakub Narebski
Poland
ShadeHawk on #git