Hi Kostis,

On Mon, 13 Mar 2017, ankostis wrote:

> On 13 March 2017 at 18:48, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> >
> > The Keccak Team wrote:
> >
> > > We have read your transition plan to move away from SHA-1 and
> > > noticed your intent to use SHA3-256 as the new hash function in the
> > > new Git repository format and protocol. Although this is a valid
> > > choice, we think that the new SHA-3 standard proposes alternatives
> > > that may also be interesting for your use cases. As designers of
> > > the Keccak function family, we thought we could jump in the mail
> > > thread and present these alternatives.
> >
> > I indeed had some reservations about SHA3-256's performance. The main
> > hash function we had in mind to compare against is blake2bp-256. This
> > overview of other functions to compare against should end up being
> > very helpful.
>
> What if some of us need this extra difficulty, and don't mind about the
> performance tax, because we need to refer to hashes 10 or 30 years from
> now, or even in the Post Quantum era?

If you need this extra difficulty, and if this extra difficulty would
imply a huge penalty for everybody else, it is safe to assume that that
extra difficulty would need to be an extra switch, off by default.

It simply shows that we put too much of a burden on SHA-1: we used it for
three separate purposes: to verify data integrity, to allow addressing
objects by their own content, and for signing entire commit histories
cryptographically (more as an afterthought, as I see it: the Linux project
provides the context where you never fetch from any untrusted source,
therefore cryptographically secure signatures are not quite as important
as the trust between maintainer and lieutenants).

We *will* have to separate those concerns, and maybe even switch to
different algorithms for the different concerns.
There are much better algorithms for validating data integrity, for
example, including error correction (which SHA-1 never wanted to do
anyway).

In your case, I could imagine that you would simply require verifiable
cryptographic signatures (.asc files) to be committed together with the
documents; it would be much harder to find a collision where those
signatures still match (or a double collision where the forged document's
signature would collide with the non-forged document's signature, in
addition to the two documents colliding).

Another idea would be to use Jonathan Nieder's proposed transition plan
and simply extend it. That transition plan details how the objects would
be hashed with two algorithms locally and how to maintain a bidirectional
mapping between the two. You could simply piggyback on that code and
provide patches that allow for a third, configurable algorithm, and that
algorithm's hashes would simply be added to the commit objects and fsck
would then know to verify those, too. That would be an opt-in feature, of
course, so that only those who need the extra long-term security have to
pay the price of substantially slower hashing.

What we cannot do is to pick a super slow hash algorithm just to cater to
the use case where legal documents are managed, punishing everybody else
for using Git in the intended way: to manage source code.

Ciao,
Johannes
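
P.S. To make the "third, configurable algorithm" idea a bit more concrete,
here is a rough Python sketch. It is only an illustration under loud
assumptions, not code from the transition plan or from Git: the names
`PRIMARY`, `SECONDARY`, `EXTRA`, `object_id` and `HashMap` are made up, the
choice of `sha3_512` as the opt-in hash is arbitrary, and the extra digest
is kept in a side table instead of inside the commit object to keep the
example short. The only part taken from Git as it exists today is the
object header (`<type> <size>` followed by a NUL byte) that is hashed
together with the payload.

```python
import hashlib

# All names here are hypothetical; this is not Git's implementation.
PRIMARY = "sha1"        # hash used for object names today
SECONDARY = "sha3_256"  # candidate named in the quoted transition plan
EXTRA = "sha3_512"      # assumed opt-in third algorithm; None disables it


def object_id(algo, obj_type, payload):
    """Hash a Git-style object: '<type> <size>' + NUL header, then payload."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.new(algo, header + payload).hexdigest()


class HashMap:
    """Bidirectional mapping between primary and secondary object names."""

    def __init__(self):
        self.to_secondary = {}  # primary name -> secondary name
        self.to_primary = {}    # secondary name -> primary name
        self.extra = {}         # primary name -> (algorithm, digest)

    def add(self, obj_type, payload, extra_algo=EXTRA):
        """Record an object under both names, plus the opt-in extra digest."""
        old = object_id(PRIMARY, obj_type, payload)
        new = object_id(SECONDARY, obj_type, payload)
        self.to_secondary[old] = new
        self.to_primary[new] = old
        if extra_algo is not None:
            digest = object_id(extra_algo, obj_type, payload)
            self.extra[old] = (extra_algo, digest)
        return old

    def fsck(self, old, obj_type, payload):
        """Return True if every recorded hash still matches the payload."""
        if object_id(PRIMARY, obj_type, payload) != old:
            return False
        if self.to_secondary.get(old) != object_id(SECONDARY, obj_type, payload):
            return False
        if old in self.extra:
            algo, digest = self.extra[old]
            return object_id(algo, obj_type, payload) == digest
        return True
```

Passing `extra_algo=None` to `add()` skips the third hash entirely, which
is the "off by default" behaviour argued for above: only repositories that
opt in pay for the slower algorithm, and `fsck()` checks the extra digest
only where one was recorded.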