Re: [PATCH] Show binary file size change in diff --stat

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Wed, 4 Apr 2007, Johannes Schindelin wrote:
>> 
>> The subtle difference: your approach is _expensive_ in terms of CPU time, 
>> while the byte change approach is _dirt cheap_.
>
> Well, you could do a combination (still dirt cheap):
>  - show the size before/after (and yes, new/delete should be separate from 
>    "zero size before/after")
>  - show the size of the binary patch.
>
> No "X added bytes" vs "Y bytes deleted", just "size of binary patch". It 
> could be really small, even if 10k was deleted, or the file was totally 
> re-organized by moving chunks around.
>
> It would still be a meaningful thing to know - if only because it tells 
> you how much space the delta takes.

I agree wrt the kind of information to give, except that I am
not so sure about new/delete vs zero before/after.  We do not do
that for a text file, and when people do care about the
distinction, they would use --summary.

I have often wished that we could make --stat imply --summary;
the only reason we did not do that is because the --stat option
started its life as an imitation of "diffstat".

I've seen our diff-delta change its output size once.  It was a
nice improvement to make the delta much smaller than before, but
I had to rewrite the rename similarity in diffcore not to depend
on the diff-delta algorithm change, to keep it stable across
diff-delta improvements (the alternative was to futz with the
default threshold).  I suspect we might see similar confusion if
we show the delta size, depending on people's expectations.
This is a very minor issue, but I thought I should mention it.

There is a machine readable output format for the same --stat
information called --numstat.  It currently signals the
binary-ness by showing '-' instead of line count.  We could
extend it by showing '-' + number of bytes.

So here are some more suggestions:

 (1) --stat for binary files to show preimage and postimage
     sizes like this (if we were to do delta size -- otherwise
     drop the trailing " (delta: ...)" part):

	penguin.jpg |  Bin 745245 -> 660689 (delta: 4434)

 (2) --numstat for binary files to show preimage and postimage
     sizes like this:

	penguin.jpg	-745245	-660689

 (3) independent of all of the above, make --stat imply
     --summary, and perhaps introduce --no-summary for people who
     do not want the summary when they say --stat.
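
To make the formats in (1) and (2) concrete, here is a rough Python
sketch of how such lines could be produced and read back.  This is not
git's code; the function names are made up for illustration, and the
field order simply follows the two example lines above.  The parser
assumes a bare '-' still means "binary, size not given".

```python
from typing import Optional, Tuple


def format_stat_line(path: str, pre_size: int, post_size: int,
                     delta_size: Optional[int] = None) -> str:
    """Hypothetical helper: a --stat style line for a binary file, as in (1)."""
    line = f"{path} |  Bin {pre_size} -> {post_size}"
    # The delta part is optional; leave it out if delta size is not shown.
    if delta_size is not None:
        line += f" (delta: {delta_size})"
    return line


def format_numstat_line(path: str, pre_size: int, post_size: int) -> str:
    """Hypothetical helper: a --numstat style line for a binary file, as in (2)."""
    # '-' marks binary-ness; the byte count follows it directly.
    return f"{path}\t-{pre_size}\t-{post_size}"


def parse_numstat_field(field: str) -> Tuple[bool, Optional[int]]:
    """Return (is_binary, number) for one count field.

    "12"      -> (False, 12)      line count of a text file
    "-"       -> (True, None)     binary, no size given
    "-745245" -> (True, 745245)   binary, size in bytes (proposed extension)
    """
    if field == "-":
        return True, None
    if field.startswith("-"):
        return True, int(field[1:])
    return False, int(field)


if __name__ == "__main__":
    print(format_stat_line("penguin.jpg", 745245, 660689, 4434))
    print(format_numstat_line("penguin.jpg", 745245, 660689))
```

A reader that only checks for a leading '-' would keep treating these
entries as binary, which is the appeal of extending the existing marker
rather than adding a new column.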

