"Ian Clarke" <ian.clarke@xxxxxxxxx> writes: > Apologies if this is off-topic, but I recently had an idea for a > better way to do diffs and merging which I thought may be of interest > to those on this list. [...] > While I'm no merging expert, it seems that most merging algorithms do > it on a line-by-line basis, treating source code as nothing but a list > of lines of text. It got me thinking, what if the merging algorithm > understood the structure of the source code it is trying to merge? > > So the idea is this: > > Provide the merge algorithm with the grammar of the programming > language, perhaps in the form of a Bison grammar file, or some other > standardized way to represent a grammar. > > The merge algorithm then uses this to parse the files to be diffed > and/or merged into trees, and then the diff and merge are treated as > operations on these trees. These operations may include creating, > deleting, or moving nodes or branches, renaming nodes, etc. There has > been quite a bit (pdf) of academic research on this topic, although I > haven't yet found off-the-shelf code that will do what we need. First, as Brian Dessent said it would be hard to generate parse tree in the presence of compile-time configuration (using preprocessor in C/C++, but in principle this applies to programs in any language; not only you have to know conditionals, but also compile options). And for dynamic languages you would have to take care about self-modifying programs. Second, from what I understand we have _good_, established algorithms for merging sequences (which includes sequence of lines, or sequence of words), and for merging special kinds of trees that are representations of directory structure. I haven't read link to mentioned research, but I think that it is still unproven research, and not something well established and well tested. 

Third, it would require embedding knowledge about various programming
languages (including C, shell, Perl, TeX) and document formats
(including XML, HTML, AsciiDoc) in the version control system...

> Still, it shouldn't be terribly hard to implement.

So, try to provide us with some proof-of-concept patches, then.

-- 
Jakub Narebski
Poland
ShadeHawk on #git
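
P.S. To illustrate the second point: the same sequence algorithm works
unchanged on lines or on words, with no knowledge of the language being
compared. Below is a minimal sketch in Python using the standard
difflib module. The helper name sequence_diff and the sample text are
made up for illustration, and this shows only the diff step, not a full
three-way merge as a version control system would do it.

```python
import difflib


def sequence_diff(old, new):
    """Return (tag, old_items, new_items) for each changed region.

    Works on any two sequences of hashable items; it knows nothing
    about the grammar of what is being compared.
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sample input: two versions of the same snippet.
old_text = "int x = 1;\nint y = 2;\nreturn x + y;\n"
new_text = "int x = 1;\nint y = 3;\nreturn x + y;\n"

# Same algorithm, two granularities: sequence of lines, sequence of words.
line_changes = sequence_diff(old_text.splitlines(), new_text.splitlines())
word_changes = sequence_diff(old_text.split(), new_text.split())

print(line_changes)
print(word_changes)
```

A tree-based approach would need a parser for each language before it
could even start comparing; the sequence approach only needs a way to
split the input into items.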