Re: Why SHA are 40 bytes? (aka looking for flames)

On Tue, 24 Apr 2007, Andreas Ericsson wrote:

> Nicolas Pitre wrote:
> > On Tue, 24 Apr 2007, Andreas Ericsson wrote:
> > 
> >> Using a more efficient compression algorithm for the objects 
> >> themselves (bzip2, anyone?) will most likely reduce storage size an 
> >> order of magnitude more than reducing the size of the hash, although 
> >> at the expense of CPU-efficiency.
> > 
> > An order of magnitude?  I really doubt it.  Maybe 20% would be a 
> > really optimistic prediction.  But if bzip2 could reduce the repo 
> > size by 20%, it would slow runtime usage of that repo by maybe 100%.  
> > That is not worth it.
> > 
> > This is also the reason why we changed the default zlib compression 
> > level from "best" to "default".
> > 
> 
> ... order of magnitude *more than reducing the size of the hash*.

Ah, sorry.  Still, I wanted to advise against it.

Anyway, if you compare packing with repack.usedeltabaseoffset set to 
true, then to false, you'll get a pretty good approximation of the SHA1 
storage cost.  With most objects stored as deltas, in the first case the 
reference to the base object is stored as an offset within the pack, 
while in the second case the full 20-byte SHA1 is used.  From the size 
difference you can deduce that the SHA1 doesn't constitute a terrible 
storage cost after all.
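The comparison above can be sketched as a small script; this is only an 
illustration of the idea, assuming a repository path in $REPO (a 
placeholder) and a git version that honors repack.usedeltabaseoffset:

```shell
#!/bin/sh
# Repack the same repository twice and compare pack sizes:
# once with offset-based delta references, once with full-SHA1
# references.  $REPO is a placeholder for the repository to measure.
cd "$REPO" || exit 1

# First pass: base objects referenced by offset within the pack.
git config repack.usedeltabaseoffset true
git repack -a -d -f
du -sk .git/objects/pack

# Second pass: base objects referenced by their full 20-byte SHA1.
git config repack.usedeltabaseoffset false
git repack -a -d -f
du -sk .git/objects/pack
```

The difference between the two `du` figures approximates what storing 
full SHA1 references costs over offsets for that repository.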


Nicolas
