On Fri, 10 Jun 2011, Jakub Narebski wrote:

> Nicolas Pitre <nico@xxxxxxxxxxx> writes:
>
> > The libxdiff code was pretty generic so to be highly portable and usable
> > for many application types.  What I did is to get rid of everything that
> > git strictly didn't need in order to make the code as simple as
> > possible, and most importantly as fast as possible. [...]
> >
> > And then further modifications were made to avoid pathological corner
> > cases which were taking too much time for little gain in the Git
> > context.
> >
> > I also changed the output encoding to make it tighter.
>
> Nicolas, do you know how binary diff used by git compares with respect
> to performance and compression with other binary diff algorithms:
>
>  * original LibXDiff
>  * bsdiff
>  * xdelta (vcdif algorithm)
>  * vbindiff

No idea.  But you can test that pretty easily if you wish.  I would be
interested in the results, of course.  Just do:

	make test-delta

and then, to compress something:

	./test-delta -d <input_file> <reference_file> <output_file>

Of course this will produce <output_file>, which contains only the bare
binary diff annotation data against the reference file.  In Git that
output is also deflated with zlib before it is stored in a pack.  The
other binary diff tools usually do a similar post-deflation pass as
well, so sizes should be compared after deflation.

It should be noted that the algorithm Git uses won't produce the
absolute smallest output.  When I tried an encoding that did, computation
time went up of course, but surprisingly the final deflated result was
slightly _larger_ as well, probably because zlib is less efficient on
the increased randomness of the tighter delta output.


Nicolas
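
For anyone running the comparison described above, here is a minimal Python sketch of the post-deflation step, so that sizes can be compared after zlib rather than before.  It is not Git's code: the helper names `deflated_size` and `report` and the sample inputs are hypothetical, and zlib's default compression level is an assumption (it is not necessarily the level Git uses for packs).  The two sample blobs only illustrate the general point that high-entropy input deflates poorly; they do not reproduce the measurement mentioned above.

```python
import random
import zlib


def deflated_size(data: bytes, level: int = -1) -> int:
    """Return the size in bytes of `data` after zlib deflation.

    level=-1 selects zlib's default level; this is an assumption,
    not necessarily the level Git uses when writing packs.
    """
    return len(zlib.compress(data, level))


def report(label: str, data: bytes) -> str:
    """Format the raw and deflated sizes of one blob of bytes."""
    return f"{label}: raw={len(data)} deflated={deflated_size(data)}"


# Hypothetical stand-ins for two delta outputs of equal raw size:
# one with repetitive structure, one made of high-entropy bytes.
rng = random.Random(0)  # fixed seed so the run is repeatable
structured = b"copy 0x0040 from 0x1000\n" * 128
noisy = bytes(rng.getrandbits(8) for _ in range(len(structured)))

print(report("structured", structured))
print(report("noisy", noisy))
```

To measure a real delta, read the <output_file> written by test-delta (or by any of the other tools) in binary mode and pass its bytes to `deflated_size`.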