Re: [PATCH] Show binary file size change in diff --stat

Geert Bosch wrote:

On Apr 4, 2007, at 09:34, Rogan Dawes wrote:

For binary files, it would be consistent to show the number of bytes added/deleted. I have not investigated the output format for the libxdiff binary patch format, but hopefully it would not be too difficult to calculate the deletions and additions.

For binary files it is impractical to do insert/delete type of differences.
For text files, treating lines as indivisible entities to insert/delete
makes some sense. For binary files, you'd have to use some arbitrary
context-defined breakpoints and then go from there. The result would
be some very complicated and unclear algorithm that would have no use
in the real world.

Many binary files, such as images, waveforms, or virtually any compressed
stream, can change in a way that changes all bytes in the file, while
the changes in the displayed image or the uncompressed stream are
imperceptible or absent. Guessing semantic differences between binary
blobs is hopeless and subjective, while differences in size are fact.

  -Geert

As per my mail to Andy, we *already* do this for text files. For example, wrap an XML document in an additional tag and update the indentation to match.

The semantic change is minimal (perhaps 2 new lines), but the reported change reflects n lines deleted, and n+2 added.
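To make the XML case concrete, here is a minimal sketch in Python. It uses difflib rather than git's own diff machinery, and line_stat is a hypothetical helper, so treat it as an illustration of the counting, not of what git does internally.

```python
import difflib


def line_stat(old: str, new: str) -> tuple[int, int]:
    """Return (deleted, added) line counts between two texts.

    Hypothetical helper: counts lines in the spirit of diff --stat,
    but uses difflib, not git's diff implementation.
    """
    deleted = added = 0
    matcher = difflib.SequenceMatcher(
        None, old.splitlines(), new.splitlines(), autojunk=False
    )
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return deleted, added


# A 4-line document (n = 4) ...
old = "<a>\n  <b/>\n  <c/>\n</a>\n"
# ... wrapped in one extra tag, with the indentation updated to match.
new = "<wrap>\n  <a>\n    <b/>\n    <c/>\n  </a>\n</wrap>\n"

print(line_stat(old, new))  # (4, 6): n lines deleted, n + 2 added
```

Every original line changes because of the re-indentation, so none of them is reported as unchanged.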

Exactly because we *don't* do any semantic analysis (for text or binary files), we should simply report the number of bytes changed, exactly as we do for text files (reporting number of lines changed). This is _consistent_ with what we do currently for text files.

Note that Andy's apparent preference (to know how the sizes have changed) can still largely be satisfied by this approach. Compare:

 somefile.bin  | 1000 -> 1000 bytes

and

 somefile.bin  | 500 bytes removed, 500 bytes added

You can still see that the overall size of the file has not changed, but you also learn how many bytes were actually changed, which you don't get from just showing the sizes.
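As a rough sketch of how such a count could be produced, the following Python compares two blobs byte by byte with difflib. This is only an assumption for illustration: I have not checked what the libxdiff binary patch format would report, and byte_stat and stat_line are hypothetical names.

```python
import difflib


def byte_stat(old: bytes, new: bytes) -> tuple[int, int]:
    """Return (removed, added) byte counts between two blobs.

    Hypothetical helper: treats each byte as an indivisible unit,
    the same way a text diff treats each line.
    """
    removed = added = 0
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return removed, added


def stat_line(name: str, old: bytes, new: bytes) -> str:
    """Format one line in the style proposed above (hypothetical format)."""
    removed, added = byte_stat(old, new)
    return f" {name}  | {removed} bytes removed, {added} bytes added"


# Two 1000-byte blobs whose second halves differ completely.
old_blob = b"A" * 500 + b"B" * 500
new_blob = b"A" * 500 + b"C" * 500

print(stat_line("somefile.bin", old_blob, new_blob))
```

Here the sizes are 1000 -> 1000, yet the count shows that half of the file was rewritten. Note that this byte-wise comparison is quadratic in the worst case, so it is only meant for small inputs.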

Rogan