"Randal L. Schwartz" <merlyn@xxxxxxxxxxxxxx> wrote: > >>>>> "Nicolas" == Nicolas Pitre <nico@xxxxxxx> writes: > > >> IIRC bsdiff is used by Firefox to distribute binary software updates. > >> Xdelta is generic (not optimized for binaries like bsdiff and edelta), but > >> supposedly offers worse compression (bigger diffs). > > Nicolas> We already have our own delta code for pack storage. > > I think the issue is related to being able to cherry-pick and merge > when binaries are involved. I've been worried about that myself. > How well are binaries supported these days for all the operations > we're taking for granted? When is a "diff" expected to be a real > "diff" and not just "binary files differ"? The clearly safe approach is to include the full SHA1 ID of the old object the patch was created from and use the xdelta in the patch only as a means of transporting a compressed form of the new version of the object. If git-diff starts to export say a base 64 encoding of the xdelta then it should also include the full SHA1 ID for binary files, even if --full-index wasn't given. git-apply should only apply an xdelta patch to the exact same old object. If the tree currently has a different object at that path then reject the patch entirely. If a path has a different object then the patch was based on then we can do one of two things to be ``nice'' to the human: - If the old blob exists in the repository (it just isn't the current version at that path) then generate a temporary merge file holding the old blob with the delta applied. The user can then finish the merge with whatever tool understands that binary file format, or do the merge by hand. - Supply a ``do it anyway'' flag to git-apply. If this flag is given on the command line then the binary file is patched even though the object versions differ. For some binary file formats this may actually be a valid thing to do. But it probably isn't for a very large percentage of known file formats. 

I could see some cases where it might be nice to be able to perform
specialized merge handling of binary files via hooks or filters.
For example *.tar.gz, *.zip, *.jar - these files are all just
compressed trees.  They should be somewhat mergeable with the same
semantics as other trees in GIT.  Of course one could just unpack
these into a directory and let GIT track the directory instead,
but this is rather inconvenient in a Java project.  :-)

If I recall correctly OpenOffice document files are XML compressed
into ZIP archives.  The XML *might* diff/patch cleanly as plain text.
The other resources in that archive are typically binary graphic
files and the like, which of course wouldn't diff/patch nicely.
But being able to diff/patch the main content might be semi-useful.

-- 
Shawn.