Re: Pack files, standards compliance, and efficiency

On Fri, Jun 05, 2015 at 06:36:39AM -0400, Jeff King wrote:
> On Fri, Jun 05, 2015 at 05:14:25PM +0700, Duy Nguyen wrote:
> 
> > I'm more concerned about breaking the object_id abstraction than the C
> > standard. Let's think a bit about the future. I suppose we need to support
> > both sha-1 and sha-512, at least at the source code level.
> 
> I think that's going to be a much bigger issue, because we are casting
> out of a defined, on-disk data structure here. So I'd rather defer any
> code changes around this until we see what the new data structure (and
> the new code) look like.

My plan is to change the data as little as possible.  I want to set
core.repositoryformatversion to 1 and create core.hashalgorithm =
sha-256 or sha-512 or whatever.  If core.repositoryformatversion is 0,
then core.hashalgorithm is assumed to be sha-1.
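
The lookup described above could be sketched roughly as follows. This is only an illustration: the struct, the function name, and the exact spellings of the algorithm names are hypothetical and not existing git code, and rejecting a version 1 repository that has no core.hashalgorithm is an assumption, since the mail does not say what should happen in that case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of this is existing git code. */
struct hash_algo {
	const char *name;	/* value of core.hashalgorithm (spelling assumed) */
	size_t rawsz;		/* digest length in bytes */
};

static const struct hash_algo known_algos[] = {
	{ "sha-1", 20 },
	{ "sha-256", 32 },
	{ "sha-512/256", 32 },
};

/*
 * Pick the hash algorithm for a repository.
 * `configured` is the core.hashalgorithm value, or NULL if it is unset.
 * Returns NULL for combinations this sketch does not support.
 */
const struct hash_algo *resolve_hash_algo(int format_version,
					  const char *configured)
{
	size_t i;

	if (format_version == 0)
		return &known_algos[0];	/* version 0 is always SHA-1 */
	if (format_version != 1 || !configured)
		return NULL;		/* assumption: treat as an error */

	for (i = 0; i < sizeof(known_algos) / sizeof(known_algos[0]); i++)
		if (!strcmp(configured, known_algos[i].name))
			return &known_algos[i];
	return NULL;			/* unknown algorithm name */
}
```

With this shape, a version 0 repository never consults the configured name at all, so existing repositories keep working unchanged.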

Packs will get a version number bump to 5 and acquire a 32-byte
NUL-padded algorithm descriptor after the version field.  The network
protocol will acquire a capability hash=sha-256.  git init will get a
--hash option, without which it will initialize a SHA-1 repository.
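
To make the proposed pack header concrete, here is a minimal parsing sketch. It assumes the descriptor sits directly after the existing 4-byte "PACK" signature and 4-byte big-endian version, and that bytes after the algorithm name must all be NUL; the function name and the validation rules are hypothetical, as no such format has been specified yet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACK_ALGO_FIELD_LEN 32	/* descriptor size proposed in the mail */

/* Read a 32-bit big-endian integer. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Parse the start of a hypothetical version 5 pack:
 *   "PACK" | version (4 bytes) | algorithm name, NUL-padded to 32 bytes
 * On success, copy the name into `algo` (which must hold at least
 * PACK_ALGO_FIELD_LEN + 1 bytes) and return 0.  Return -1 on any error.
 */
int parse_pack_v5_prefix(const unsigned char *buf, size_t len, char *algo)
{
	const unsigned char *field;
	size_t n, i;

	if (len < 8 + PACK_ALGO_FIELD_LEN)
		return -1;
	if (memcmp(buf, "PACK", 4) || get_be32(buf + 4) != 5)
		return -1;

	field = buf + 8;
	/* The name is everything before the first NUL byte. */
	for (n = 0; n < PACK_ALGO_FIELD_LEN && field[n]; n++)
		;
	if (n == 0)
		return -1;
	/* Assumption: the rest of the field must be NUL padding. */
	for (i = n; i < PACK_ALGO_FIELD_LEN; i++)
		if (field[i])
			return -1;

	memcpy(algo, field, n);
	algo[n] = '\0';
	return 0;
}
```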

I don't intend to change the contents of struct object_id any, since I
don't intend to allow mixed hashes in one repository (git fast-import is
your friend).  I plan to read the format version and hash algorithm as
soon as possible after startup and initialize a variable with the hash
length.  The length of the struct's hash member will expand to handle
whatever the maximum supported hash size is, but data will only be
compared and modified up to the hash length of the appropriate
algorithm.
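
A minimal sketch of that idea, assuming a single global hash length that is set once at startup. The names MAX_HASH_RAWSZ, the_hash_len, set_hash_len, oid_cmp and oid_copy are hypothetical and chosen for illustration only; the maximum of 32 bytes reflects the plan to support only SHA-512/256 at first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the largest digest any supported algorithm produces. */
#define MAX_HASH_RAWSZ 32

struct object_id {
	unsigned char hash[MAX_HASH_RAWSZ];
};

/* Active digest length in bytes; 20 corresponds to SHA-1. */
size_t the_hash_len = 20;

/* Called once, early in startup, after reading the repository config. */
void set_hash_len(size_t len)
{
	assert(len > 0 && len <= MAX_HASH_RAWSZ);
	the_hash_len = len;
}

/* Compare only the bytes that belong to the active algorithm. */
int oid_cmp(const struct object_id *a, const struct object_id *b)
{
	return memcmp(a->hash, b->hash, the_hash_len);
}

/* Copy only the bytes that belong to the active algorithm. */
void oid_copy(struct object_id *dst, const struct object_id *src)
{
	memcpy(dst->hash, src->hash, the_hash_len);
}
```

Because the comparison stops at the active length, any bytes past a 20-byte SHA-1 digest are never examined, so callers do not need to zero them.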

This does lead to the possibility of increased memory usage, which is
why I plan to initially only support SHA-512/256.  It's 32 bytes, like
SHA-256, but it performs much better on 64-bit systems (SHA-1: 291
MiB/s, SHA-256: 144 MiB/s, SHA-512/256: 242 MiB/s) for messages ≥ 55
bytes, and most systems these days are 64-bit.

That's what I've been thinking, at least, but if people have better
ideas, I'm open to hearing them.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
