Geert Bosch wrote:
On Apr 4, 2007, at 09:34, Rogan Dawes wrote:
For binary files, it would be consistent to show the number of bytes
added/deleted. I have not investigated the output format for the
libxdiff binary patch format, but hopefully it would not be too
difficult to calculate the deletions and additions.
For binary files it is impractical to compute insert/delete-style differences.
For text files, treating lines as indivisible entities to insert/delete
makes some sense. For binary files, you'd have to use some arbitrary
context-defined breakpoints and then go from there. The result would
be some very complicated and unclear algorithm that would have no use
in the real world.
Many binary files, such as images, waveforms or virtually any compressed
stream, can change in a way that changes all bytes in the file, while
the changes in the displayed image or the uncompressed stream are
imperceptible or absent. Guessing semantic differences between binary
blobs is hopeless and subjective, while differences in size are fact.
-Geert
As per my mail to Andy, we *already* do this for text files. For example,
wrap an XML document in an additional tag, and update the indentation to
match. The semantic change is minimal (perhaps 2 new lines), but the
reported change is n lines deleted and n+2 added.
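
To make the XML case concrete, here is a minimal Python sketch. It is not
how git computes its diffstat; it assumes the standard library's difflib
as a stand-in line matcher, and the name line_stat and the sample document
are made up for illustration. With n = 3 original lines, wrapping and
re-indenting leaves no line unchanged, so the count is 3 deleted, 5 added.

```python
import difflib


def line_stat(old_text, new_text):
    """Return (deleted, added) line counts from a plain line-based comparison.

    Hypothetical helper for illustration; git uses libxdiff, not difflib.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    deleted = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # 'replace', 'delete' and 'insert' all contribute their span sizes;
            # for 'delete' the new-side span is empty, for 'insert' the old-side is.
            deleted += i2 - i1
            added += j2 - j1
    return deleted, added


# A three-line document, then the same document wrapped in one extra tag
# with the indentation updated to match.
old_doc = "<a>\n  <b/>\n</a>\n"
new_doc = "<wrap>\n  <a>\n    <b/>\n  </a>\n</wrap>\n"

print(line_stat(old_doc, new_doc))  # (3, 5), i.e. n deleted and n+2 added
```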
Exactly because we *don't* do any semantic analysis (for text or binary
files), we should simply report the number of bytes changed, exactly as
we do for text files (reporting number of lines changed). This is
_consistent_ with what we do currently for text files.
Note that Andy's apparent preference (to know how the sizes have
changed) can still largely be satisfied by this approach. Compare
    somefile.bin | 1000 -> 1000 bytes
and
    somefile.bin | 500 bytes removed, 500 bytes added
You can still see that the overall size of the file has not changed, but
you get the additional information about how many bytes were actually
changed at the same time, which you don't get just showing the sizes.
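
As a rough illustration of the two output styles, here is a minimal Python
sketch. It assumes difflib's SequenceMatcher as a stand-in for whatever
the libxdiff binary delta would provide (I have not checked that format),
and the names byte_stat and format_stat are hypothetical. SequenceMatcher
can be quadratic in the input size, so this is only suitable for small
blobs.

```python
import difflib


def byte_stat(old, new):
    """Return (removed, added) byte counts between two bytes objects.

    Hypothetical helper: counts every byte outside the matching blocks
    that difflib finds, with no semantic analysis of the content.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


def format_stat(name, old, new):
    """Format a stat line in the 'removed/added' style proposed above."""
    removed, added = byte_stat(old, new)
    return f"{name} | {removed} bytes removed, {added} bytes added"


# Two 1000-byte blobs that share their first half and differ in the second.
old_blob = b"A" * 500 + b"B" * 500
new_blob = b"A" * 500 + b"C" * 500

# Size-only style: shows that the length is unchanged, nothing more.
print(f"somefile.bin | {len(old_blob)} -> {len(new_blob)} bytes")
# Removed/added style: same size information, plus how much changed.
print(format_stat("somefile.bin", old_blob, new_blob))
```

The second line still lets you infer that the size is unchanged (500
removed, 500 added), which is the point made above.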
Rogan