Re: [PATCH 2/2] doc hash-function-transition: pick SHA-256 as NewHash

Hi,

Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:

>> The consensus on the mailing list seems to be that SHA-256 should be
>> picked as our NewHash, see the "Hash algorithm analysis" thread as of
>> [1]. Linus has come around to this choice and suggested Junio make the
>> final pick, and he's endorsed SHA-256 [3].

I think this commit message focuses too much on the development
process, in a way that makes it not necessarily useful to the target
audience that would be finding it with "git blame" or "git log".  It's
also not self-contained, which makes it less useful in the same way.

In other words, the commit message should be speaking for the project,
not speaking about the project.  I would be tempted to say something as
simple as

 hash-function-transition: pick SHA-256 as NewHash

 The project has decided.

 Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>

and let any Acked-bys on the message speak for themselves.
Alternatively, the commit message could include a summary of the
discussion:

 From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256,
 K12, and so on are all believed to have similar security properties.
 All are good options from a security point of view.

 SHA-256 has a number of advantages:

 * It has been around for a while, is widely used, and is supported by
   just about every single crypto library (OpenSSL, mbedTLS, CryptoNG,
   SecureTransport, etc).

 * When you compare against SHA1DC, most vectorized SHA-256
   implementations are indeed faster, even without acceleration.

 * If we're doing signatures with OpenPGP (or even, I suppose, CMS),
   we're going to be using SHA-2, so it doesn't make sense for our
   security to depend on two separate algorithms, either of which
   alone could break it, when we could depend on just one.

 So SHA-256 it is.

[...]
>> @@ -125,19 +122,19 @@ Detailed Design
>>  ---------------
>>  Repository format extension
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> -A NewHash repository uses repository format version `1` (see
>> +A SHA-256 repository uses repository format version `1` (see
>>  Documentation/technical/repository-version.txt) with extensions
>>  `objectFormat` and `compatObjectFormat`:
>>  
>>  	[core]
>>  		repositoryFormatVersion = 1
>>  	[extensions]
>> -		objectFormat = newhash
>> +		objectFormat = sha256
>>  		compatObjectFormat = sha1
>
> Whenever we said SHA1, somebody came and told us that the name of
> the hash is SHA-1 (with dash).  Would we be nitpicker-prone in the
> same way with "sha256" here?

Regardless of how we spell it in prose, I think `sha256` as an
identifier in configuration is the spelling people will expect.  For
example, gpg ("gpg --version") calls it SHA256.
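
As an illustration of what the identifier selects, here is a minimal
Python sketch of how a blob's object name could be computed under
either setting.  It assumes the object header ("blob <size>\0") stays
the same under the new hash and that the configuration values map
directly onto Python's hashlib names; git_blob_id is a made-up helper
for illustration, not anything in Git.

```python
import hashlib


def git_blob_id(content: bytes, object_format: str = "sha1") -> str:
    """Return the hex object name for a blob with the given content.

    object_format mirrors the extensions.objectFormat value
    ("sha1" or "sha256").  Mapping it straight onto a hashlib
    algorithm name is an assumption made for this sketch.
    """
    if object_format not in ("sha1", "sha256"):
        raise ValueError(f"unsupported object format: {object_format}")
    # Objects are hashed with a "<type> <size>\0" header in front.
    header = b"blob %d\0" % len(content)
    return hashlib.new(object_format, header + content).hexdigest()


if __name__ == "__main__":
    for fmt in ("sha1", "sha256"):
        print(fmt, git_blob_id(b"", fmt))
```

For the empty blob with sha1 this gives the familiar
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391; with sha256 the name is 64
hex digits instead of 40.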

[...]
>>  Selection of a New Hash
>>  -----------------------
>> @@ -611,6 +608,10 @@ collisions in 2^69 operations. In August they published details.
>>  Luckily, no practical demonstrations of a collision in full SHA-1 were
>>  published until 10 years later, in 2017.
>>  
>> +It was decided that Git needed to transition to a new hash
>> +function. Initially no decision was made as to what function this was,
>> +the "NewHash" placeholder name was picked to describe it.
>> +
>>  The hash function NewHash to replace SHA-1 should be stronger than
>>  SHA-1 was: we would like it to be trustworthy and useful in practice
>>  for at least 10 years.
>
> This sentence needs a bit of updating to match the new paragraph
> inserted above.  "should be stronger" is something said by those
> who are still looking for one and/or trying to decide.

For what it's worth, I would be in favor of modifying the section
more heavily.  For example:

 Choice of Hash
 --------------
 In early 2005, around the time that Git was written, Xiaoyun Wang,
 Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
 collisions in 2^69 operations. In August they published details.
 Luckily, no practical demonstrations of a collision in full SHA-1 were
 published until 12 years later, in 2017.

 Git v2.13.0 and later use a hardened SHA-1 implementation by default,
 which mitigates the SHAttered attack, but SHA-1 is still believed to
 be weak.

 The hash to replace this hardened SHA-1 should be stronger than SHA-1
 was: we would like it to be trustworthy and useful in practice
 for at least 10 years.

 Some other relevant properties:

 1. A 256-bit hash (long enough to match common security practice; not
    so long as to hurt performance and disk usage).

 2. High quality implementations should be widely available (e.g., in
    OpenSSL and Apple CommonCrypto).

 3. The hash function's properties should match Git's needs (e.g. Git
    requires collision and 2nd preimage resistance and does not require
    length extension resistance).

 4. As a tiebreaker, the hash should be fast to compute (fortunately
    many contenders are faster than SHA-1).

 We choose SHA-256.
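
To put property 1 in concrete terms, this small Python sketch compares
the size of an object name under each hash.  object_name_sizes is a
hypothetical helper used only for illustration.

```python
import hashlib


def object_name_sizes(algorithm: str) -> tuple[int, int, int]:
    """Return (bits, raw bytes, hex digits) for one digest of `algorithm`."""
    raw_bytes = hashlib.new(algorithm).digest_size
    return raw_bytes * 8, raw_bytes, raw_bytes * 2


if __name__ == "__main__":
    for name in ("sha1", "sha256"):
        bits, raw_bytes, hex_digits = object_name_sizes(name)
        print(f"{name}: {bits} bits, {raw_bytes} bytes, {hex_digits} hex digits")
```

Each stored object name grows from 20 to 32 bytes, which is the disk
usage cost that property 1 weighs against the larger security margin.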

Changes:

- retitled since the hash function has already been selected
- added some notes about sha1dc
- when discussing wide implementation availability, mentioned
  CommonCrypto too, as an example of a non-OpenSSL library that the
  libgit2 authors care about
- named which function is chosen

We could put the runners-up in the "alternatives considered" section,
but I don't think there's much to say about them here so I wouldn't.

Thanks and hope that helps,
Jonathan


