Re: Unresolved issues #2 (shallow clone again)


On Sun, May 07, 2006 at 05:50:42PM -0700, Linus Torvalds wrote:
> 
> 
> On Sun, 7 May 2006, Theodore Tso wrote:
> > 
> > If there are 233338 objects, then the average wasted space due to
> > internal fragmentation is 233338 * 2k, or 466676 kilobytes, or only
> > 36% of the wasted space.
> 
> That's not necessarily true.
> 
> That assumes a randomly distributed filesize. File sizes are _not_ random, 
> and in particular if you have the distribution leaning towards <2kB being 
> common, you can actually get >50% fragmentation.
> 
> Btw, I hit this when some people argued that the page size should be made 
> 64kB. The above (incorrect) logic implies that you waste 32kB on average 
> per file. That's not true, if a large fraction of your files are small, in 
> which case you may actually be wasting closer to 60kB on average from 
> using a big page-size, because about half of the kernel files are actually 
> smaller than 4kB (or something. I forget the exact statistics, I did them 
> with a script at some point).
> 
> Anyway, with inode overhead and a lot of objects being just a couple of 
> hundred bytes, I think I estimated at some point that you actually lost 
> closer to 3kB per object.

I just ran the numbers on filesizes of a kernel tree I had handy,
which happened to be 2.6.16.11.  With no object files, git files,
etc. the average loss was 2351 bytes --- not that far away from the
average of 2048 bytes.  Granted, it may be that there are more different
versions of small objects skewing the distribution of
git objects in the 2.6 tree, but I'm not familiar enough with the git
porcelain to be able to make it disgorge the sizes of the repository
to do the math.
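
For anyone who wants to repeat the measurement, here is a minimal sketch
of one way to do it.  It is not necessarily how the figure above was
produced: it assumes a 4096-byte block size (which is what the 2048-byte
expected average implies), ignores inode overhead and any tail packing
the filesystem may do, and the names slack, average_slack and tree_sizes
are made up for illustration.

```python
import os

def slack(size, block=4096):
    """Bytes lost to internal fragmentation when a file of `size` bytes
    is stored in whole blocks of `block` bytes (assumed block size)."""
    return -size % block

def average_slack(sizes, block=4096):
    """Mean slack per file over an iterable of file sizes."""
    sizes = list(sizes)
    if not sizes:
        return 0.0
    return sum(slack(s, block) for s in sizes) / len(sizes)

def tree_sizes(root):
    """Yield sizes of regular files under `root`, skipping .git
    directories and object files (an assumed notion of 'source only')."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if name.endswith((".o", ".ko")):
                continue
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield os.path.getsize(path)
```

Calling average_slack(tree_sizes("linux-2.6.16.11")) on an unpacked tree
(the path is a placeholder) gives the per-file loss; passing block=65536
shows the effect of a larger allocation unit.  A population dominated by
files of a few hundred bytes pushes the average well above half a block,
since slack(300) is 3796 bytes out of 4096.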

						- Ted
