Re: [PATCH] Put sha1dc on a diet

On Wed, Mar 01, 2017 at 12:14:34PM -0800, Linus Torvalds wrote:

> > My biggest concern is the index-pack operation. Try this:
> 
> I'm mobile right now, so I can't test, but is this perhaps at least partly
> due to the full checksum over the pack-file?
>
> We have two very different uses of SHA1: the actual object name hash, but
> also the sha1file checksums that we do on the index file and the pack files.
>
> And the checksum code really doesn't need the collision checking at all.

I don't think that helps. The sha1 over the pack-file takes about 1.3s
with openssl, and 5s with sha1dc. So we already know the increase there
is only a few seconds, not a few minutes.

And it makes sense if you think about the index-pack operation. It has
to inflate each object, resolving deltas, and checksum the result. And
the number of inflated bytes is _much_ larger than the on-disk bytes.
You can see the difference with:

  git cat-file --batch-all-objects \
    --batch-check='%(objectsize:disk) %(objectsize)' |
  perl -alne '
    $disk += $F[0]; $raw += $F[1];
    END { print "$disk $raw" }
  '

On linux.git that yields:

  1210521959 63279680406

That's over a 50x increase in the bytes we have to sha1 for objects
versus pack-checksums.
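
To make the arithmetic concrete, here is a rough sketch that computes the ratio from the two numbers above. It also scales the pack-checksum timings (1.3s and 5s) by that ratio; that projection is only an assumption that hashing time grows linearly with the bytes hashed, not a measurement, and the variable names are made up for illustration:

```shell
# Totals reported above for linux.git: on-disk bytes and inflated bytes.
disk=1210521959
raw=63279680406

# Ratio of inflated object bytes to on-disk pack bytes.
ratio=$(awk -v d="$disk" -v r="$raw" 'BEGIN { printf "%.1f", r / d }')

# Assumption: sha1 cost is linear in input size, so scale the measured
# pack-checksum times (1.3s openssl, 5s sha1dc) by the same ratio.
openssl_est=$(awk -v d="$disk" -v r="$raw" 'BEGIN { printf "%.0f", 1.3 * r / d }')
sha1dc_est=$(awk -v d="$disk" -v r="$raw" 'BEGIN { printf "%.0f", 5 * r / d }')

echo "inflated/on-disk ratio: ${ratio}x"
echo "estimated object hashing: ~${openssl_est}s openssl, ~${sha1dc_est}s sha1dc"
```

Under that assumption, the gap between the two implementations on the object data is on the order of minutes, while on the pack checksum alone it is only a few seconds.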

-Peff


