On Tue, 7 Jun 2011, Jeff King wrote: > On Tue, Jun 07, 2011 at 10:12:35PM +0000, Albretch Mueller wrote: > > > > ... binary diffs, though I don't know offhand the details of the algorithm. > > ~ > > this is the part that I need ;-) > > ~ > > Reading the source code without knowing the basic underlying > > ideas/algorithm (just an outline if possible) won't help much > > You could read the comments in the source: > > $ head -n 7 diff-delta.c > /* > * diff-delta.c: generate a delta between two buffers > * > * This code was greatly inspired by parts of LibXDiff from Davide Libenzi > * http://www.xmailserver.org/xdiff-lib.html > * > * Rewritten for GIT by Nicolas Pitre <nico@xxxxxxxxxxx>, (C) 2005-2007 > > According to the xdiff page linked: > > For binary files, LibXDiff implements both (with some modification) > the algorithm described in File System Support for Delta Compression > by Joshua P. MacDonald, and the algorithm described in Fingerprinting > By Random Polynomials by Michael O. Rabin. > > Nicolas (cc'd) might be able to say what, if any, substantive changes > were made from those works. The libxdiff code was pretty generic so to be highly portable and usable for many application types. What I did is to get rid of everything that git strictly didn't need in order to make the code as simple as possible, and most importantly as fast as possible. And then the code was optimized even further, sacrificing on clarity a bit, to make it even faster. Since this code is the very inner loop of every delta search for best delta matches, every bit of optimization counts. And then further modifications were made to avoid pathological corner cases which were taking too much time for little gain in the Git context. I also changed the output encoding to make it tighter. So, strictly speaking, the current code in Git doesn't bear any resemblance with the libxdiff code at all. However the basic algorithm behind both implementations is the same. Studying the libxdiff version is probably easier in order to gain an understanding of how this works. Nicolas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html