Re: RFC v3: Another proposed hash function transition plan

Brandon Williams <bmwill@xxxxxxxxxx> · Mon, 2 Oct 2017 09:50:30 -0700

On 10/02, Jason Cooper wrote:
> Hi Jonathan,
> 
> On Tue, Sep 26, 2017 at 04:51:58PM -0700, Jonathan Nieder wrote:
> > Johannes Schindelin wrote:
> > > On Tue, 26 Sep 2017, Jason Cooper wrote:
> > >> For my use cases, as a user of git, I have a plan to maintain provable
> > >> integrity of existing objects stored in git under sha1 while migrating
> > >> away from sha1.  The same plan works for migrating away from SHA2 or
> > >> SHA3 when the time comes.
> > >
> > > Please do not make the mistake of taking your use case to be a template
> > > for everybody's use case.
> > 
> > That said, I'm curious at what plan you are alluding to.  Is it
> > something that could benefit others on the list?
> 
> Well, it's just a plan at this point, as there's a lot of other work to
> do in the meantime, and there's no possibility of transitioning until
> the dust has settled on NEWHASH.  :-)
> 
> Given an existing repository that needs to migrate from SHA1 to NEWHASH,
> and maintain backwards compatibility with clients that haven't migrated
> yet, how do we
> 
>   a) perform that migration,
>   b) allow non-updated clients to use the data prior to the switch, and
>   c) maintain provable integrity of the old objects as well as the new.
> 
> The primary method is counter-hashing, which re-uses the blobs, and
> creates parallel, deterministic tree, commit, and tag objects using
> NEWHASH for everything up to flag day.  Post-flag-day, only NEWHASH is
> used.  A PGP "transition" key is used to counter-sign the NEWHASH version
> of the old signed tags.  The transition key is not required to be
> different from the existing maintainer's key.
> 
> A critical feature is the ability of entities other than the maintainer
> to migrate to NEWHASH.  For example, let's say that git has fully
> implemented and tested NEWHASH.  linux.git intends to migrate, but it's
> going to take several months (get all the developers herded up).
> 
> In the interim, a security company, relying on Linux for its products,
> can counter-hash Linus' repo, and continue to do so every time he
> updates his tree.  This shrinks the attack window for an entity (with an
> undisclosed break of SHA1) down to a few minutes to an hour.  Otherwise,
> a check of the counter hashes in the future would reveal the
> substitution.
> 
> The deterministic feature is critical here because there is valuable
> integrity and trust built by counter-hashing quickly after publication.
> So once Linux migrates to NEWHASH, the hashes calculated by the security
> company should be identical.  IOW, use the timestamps that are in the
> SHA1 commit objects for the NEWHASH objects.  Which should be obvious,
> but it's worth explicitly mentioning that determinism provides great
> value.
> 
> We're in the process of writing this up formally, which will provide a
> lot more detail and rationale than this quick stream of thought.  :-)
> 
> I'm sure a lot of this has already been discussed on the list.  If so, I
> apologize for being repetitive.  Unfortunately, I'm not able to keep up
> with the MLs like I used to.
> 
> thx,
> 
> Jason.
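
To make the counter-hashing idea quoted above concrete, here is a rough
sketch in Python.  It is not taken from the transition plan or from Jason's
write-up.  It assumes SHA-256 as a stand-in for NEWHASH (which has not been
chosen in this thread), a flat tree containing only regular files, and
hypothetical helper names (object_id, tree_body, hash_commit).  It only
shows that re-using the blobs and copying the identity lines (including
timestamps) verbatim gives every party the same NEWHASH object names.

```python
import hashlib

NEWHASH = "sha256"  # assumption: stand-in only, NEWHASH is undecided


def object_id(kind: str, body: bytes, algo: str) -> str:
    """Hash an object the way git does: '<type> <size>\\0' + content."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.new(algo, header + body).hexdigest()


def tree_body(entries, algo: str) -> bytes:
    """Serialize a flat tree; entries are (mode, name, blob_content)."""
    out = b""
    for mode, name, content in sorted(entries, key=lambda e: e[1]):
        oid = object_id("blob", content, algo)  # blob content is re-used
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
    return out


def hash_commit(entries, ident_lines: bytes, message: bytes, algo: str) -> str:
    """Name a commit under `algo`, keeping author/committer lines as-is."""
    tree_oid = object_id("tree", tree_body(entries, algo), algo)
    body = f"tree {tree_oid}\n".encode() + ident_lines + b"\n" + message
    return object_id("commit", body, algo)


# Hypothetical example data: one file, fixed identity lines and timestamps.
entries = [("100644", "hello.txt", b"hello\n")]
ident = (b"author A U Thor <author@example.com> 1506444718 -0700\n"
         b"committer C O Mitter <committer@example.com> 1506444718 -0700\n")
message = b"initial commit\n"

sha1_name = hash_commit(entries, ident, message, "sha1")
maintainer_name = hash_commit(entries, ident, message, NEWHASH)
third_party_name = hash_commit(entries, ident, message, NEWHASH)
```

Here maintainer_name and third_party_name are equal because nothing in the
commit body was regenerated.  Substituting a fresh timestamp in the
identity lines would give a different name, which is why the timestamps
have to be carried over from the SHA1 commit objects.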

Given the interests that you've expressed here I'd recommend taking a
look at
https://public-inbox.org/git/20170928044320.GA84719@xxxxxxxxxxxxxxxxxxxxxxxxx/
which is the current version of the transition plan that the community
has settled on
(https://public-inbox.org/git/xmqqlgkyxgvq.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/
shows that it should be merged to 'next' soon).  One neat aspect of
this transition plan is that it doesn't require a flag day but rather
anyone can migrate to the new hash function and still interact with
repositories (via the wire) which are still running SHA1.

-- 
Brandon Williams