Re: diff'ing files ...

On Fri, 10 Jun 2011, Jakub Narebski wrote:

> Nicolas Pitre <nico@xxxxxxxxxxx> writes:
> > The libxdiff code was pretty generic so as to be highly portable and 
> > usable for many application types.  What I did is to get rid of 
> > everything that Git didn't strictly need in order to make the code as 
> > simple as possible, and most importantly as fast as possible. [...]
> > 
> > And then further modifications were made to avoid pathological corner 
> > cases which were taking too much time for little gain in the Git 
> > context.
> > 
> > I also changed the output encoding to make it tighter.
> 
> Nicolas, do you know how the binary diff used by Git compares, in terms
> of performance and compression, with other binary diff algorithms:
> 
>   * original LibXDiff
>   * bsdiff
>   * xdelta (VCDIFF algorithm)
>   * vbindiff

No idea.  But you can test that pretty easily if you wish.  I would be 
interested in the results of course. Just do:

	make test-delta

and then, to compress something:

	./test-delta -d <input_file> <reference_file> <output_file>

Of course this will produce <output_file>, which contains only the bare 
binary diff annotation data against the reference file.  In Git that 
output is also deflated with zlib before it is stored in a pack.  The 
other binary diff tools usually do a similar post-deflation pass as 
well.
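
For a fair size comparison with tools that deflate their own output, the 
raw delta can be run through zlib before measuring it.  The following is 
only a rough sketch in Python: the helper names deflated_size and report 
are made up for illustration, and it uses zlib's default compression 
level, which is an assumption and not necessarily what Git uses for packs.

```python
import zlib


def deflated_size(delta: bytes, level: int = zlib.Z_DEFAULT_COMPRESSION) -> int:
    """Return the size in bytes of `delta` after one zlib deflate pass."""
    return len(zlib.compress(delta, level))


def report(path: str) -> str:
    """Summarize raw and deflated sizes for a delta file written by test-delta."""
    with open(path, "rb") as f:
        raw = f.read()
    return f"{path}: raw {len(raw)} bytes, deflated {deflated_size(raw)} bytes"
```

Calling report() on the <output_file> from the command above gives the two 
numbers to compare against what the other tools produce.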

It should be noted that the algorithm Git uses won't produce the 
absolute smallest output.  When I tried that, computation time went up 
of course, but surprisingly the final deflated result was slightly 
_larger_ as well, probably because zlib is less efficient at deflating 
the increased randomness of the tighter delta output.
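
The general effect is easy to see in isolation.  The sketch below does not 
reproduce the experiment described above; it only compares how well zlib 
deflates a highly repetitive buffer versus random bytes of the same 
length.  The helper name deflate_ratio and both sample buffers are 
invented for the illustration.

```python
import os
import zlib


def deflate_ratio(data: bytes) -> float:
    """Deflated size divided by original size (lower means better compression)."""
    return len(zlib.compress(data)) / len(data)


# Made-up sample inputs: one very regular, one with no exploitable structure.
repetitive = b"copy 64 bytes from offset 4096\n" * 512
random_like = os.urandom(len(repetitive))

print(f"repetitive:  {deflate_ratio(repetitive):.3f}")
print(f"random-like: {deflate_ratio(random_like):.3f}")
```

The closer the delta output gets to looking like the second buffer, the 
less the zlib pass can gain afterwards.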


Nicolas