Re: [PATCH] xdiff-interface.c (buffer_is_binary): Remove buffer size limitation

On Tue, 4 Dec 2007, Dmitry V. Levin wrote:
>
> Average file size in the linux-2.6.23.9 kernel tree is 10944 bytes,

Don't do "average" sizes. That's an almost totally meaningless number.

"Average" makes sense if you have some kind of gaussian distribution or 
similar. File sizes tend to follow exponential distributions, and what makes 
much more sense is to look at the median. The median isn't skewed by 
a few larger files, and it also gives you a much better "half the files are 
smaller than x" idea.

And the median filesize for the kernel is just a few bytes over 4k.

Of the 23,000+ files in the current kernel, about 15,500 are less than 
8kB. And 17,179 are smaller than the 10944 bytes you mention.
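
To make the mean-versus-median point concrete, here is a minimal sketch in C. The helper names (mean_size, median_size, count_below) are hypothetical and not part of git; the sizes are assumed to have been collected already, for example with stat().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for file sizes stored as unsigned long. */
static int cmp_size(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;
	return (x > y) - (x < y);
}

/* Arithmetic mean: a handful of huge files pulls this number up. */
static double mean_size(const unsigned long *sizes, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += (double)sizes[i];
	return sum / (double)n;
}

/*
 * Median: sorts the array in place and returns the middle element
 * (the upper of the two middle elements when n is even).
 */
static unsigned long median_size(unsigned long *sizes, size_t n)
{
	assert(n > 0);
	qsort(sizes, n, sizeof(*sizes), cmp_size);
	return sizes[n / 2];
}

/* Number of files strictly smaller than limit bytes. */
static size_t count_below(const unsigned long *sizes, size_t n,
			  unsigned long limit)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (sizes[i] < limit)
			count++;
	return count;
}
```

With made-up sizes of 1000, 2000, 3000, 4000 and 1000000 bytes, the mean comes out at 202000 while the median is 3000, and four of the five files are below 8192 bytes.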

I'd argue that 8kB (or even 4kB) is probably a good number for things like 
that: it catches the bulk of all files in their entirety, but it *avoids* 
spending tons of time on the (few) really large files.
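
A rough sketch of what such a cap looks like, assuming the binary check is a simple search for a NUL byte. The names looks_binary and SCAN_LIMIT are hypothetical, and 8192 is just the 8kB figure from above, not a value taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed cap: inspect at most the first 8kB of the buffer. */
#define SCAN_LIMIT 8192

/*
 * Returns 1 if a NUL byte shows up within the first SCAN_LIMIT bytes,
 * 0 otherwise. Small files are scanned in their entirety; large files
 * cost no more than SCAN_LIMIT bytes of work.
 */
static int looks_binary(const char *buf, size_t size)
{
	assert(buf != NULL || size == 0);
	if (size > SCAN_LIMIT)
		size = SCAN_LIMIT;
	return size > 0 && memchr(buf, 0, size) != NULL;
}
```

The tradeoff is that a NUL byte sitting past the cap goes unnoticed, which is the price for not walking through the few really large files.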

			Linus