Hi Jonathan,

On Tue, Sep 26, 2017 at 04:51:58PM -0700, Jonathan Nieder wrote:
> Johannes Schindelin wrote:
> > On Tue, 26 Sep 2017, Jason Cooper wrote:
> >> For my use cases, as a user of git, I have a plan to maintain provable
> >> integrity of existing objects stored in git under sha1 while migrating
> >> away from sha1. The same plan works for migrating away from SHA2 or
> >> SHA3 when the time comes.
> >
> > Please do not make the mistake of taking your use case to be a template
> > for everybody's use case.
>
> That said, I'm curious at what plan you are alluding to. Is it
> something that could benefit others on the list?

Well, it's just a plan at this point, as there's a lot of other work to
do in the meantime, and there's no possibility of transitioning until
the dust has settled on NEWHASH. :-)

Given an existing repository that needs to migrate from SHA1 to NEWHASH,
and maintain backwards compatibility with clients that haven't migrated
yet, how do we

  a) perform that migration,

  b) allow non-updated clients to use the data prior to the switch, and

  c) maintain provable integrity of the old objects as well as the new?

The primary method is counter-hashing, which re-uses the blobs and
creates parallel, deterministic tree, commit, and tag objects using
NEWHASH for everything up to flag day. Post-flag-day, only NEWHASH is
used. A PGP "transition" key is used to counter-sign the NEWHASH version
of the old signed tags. The transition key is not required to be
different from the existing maintainer's key.

A critical feature is the ability of entities other than the maintainer
to migrate to NEWHASH. For example, let's say that git has fully
implemented and tested NEWHASH. linux.git intends to migrate, but it's
going to take several months (get all the developers herded up). In the
interim, a security company relying on Linux for its products can
counter-hash Linus' repo, and continue to do so every time he updates
his tree.
This shrinks the attack window for an entity (with an undisclosed break
of SHA1) down to a few minutes to an hour. A substitution made outside
that window would be revealed by a later check of the counter-hashes.

The deterministic feature is critical here because there is valuable
integrity and trust built by counter-hashing quickly after publication.
So once Linux migrates to NEWHASH, the hashes calculated by the security
company should be identical. IOW, use the timestamps that are in the
SHA1 commit objects for the NEWHASH objects. That should be obvious, but
it's worth explicitly mentioning that determinism provides great value.

We're in the process of writing this up formally, which will provide a
lot more detail and rationale than this quick stream of thought. :-)

I'm sure a lot of this has already been discussed on the list. If so, I
apologize for being repetitive. Unfortunately, I'm not able to keep up
with the MLs like I used to.

thx,

Jason.
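
A rough sketch of the counter-hashing idea described above, under stated
assumptions: SHA-256 stands in for the still-undecided NEWHASH, the
history is linear, trees are flat lists of blobs, and the helper names
(object_id, tree_id, commit_id, hash_history) are made up for
illustration. It is not git's implementation, and it leaves out tags and
the PGP counter-signing step. It only shows that re-using the blobs and
the original identity/timestamp lines gives any party the same NEWHASH
ids.

```python
import hashlib

# Assumption: SHA-256 is a placeholder for NEWHASH, which is not yet chosen.
NEWHASH = "sha256"


def object_id(kind, body, algo):
    """Hash a git-style object, '<kind> <size>\\0<body>', with the given algorithm."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.new(algo, header + body).digest()


def tree_id(entries, algo):
    """Hash a flat tree. entries is a list of (mode, name, blob_bytes)."""
    body = b""
    # Sorting by name is enough here because the sketch has no subdirectories.
    for mode, name, data in sorted(entries, key=lambda e: e[1]):
        body += mode + b" " + name + b"\0" + object_id(b"blob", data, algo)
    return object_id(b"tree", body, algo)


def commit_id(entries, parents, ident, message, algo):
    """Hash a commit whose tree and parent ids are computed under `algo`."""
    lines = [b"tree " + tree_id(entries, algo).hex().encode()]
    lines += [b"parent " + p.hex().encode() for p in parents]
    # The identity line, including its timestamp, is copied unchanged.
    lines += [b"author " + ident, b"committer " + ident, b"", message]
    return object_id(b"commit", b"\n".join(lines), algo)


def hash_history(history, algo):
    """Walk a linear history oldest-first and return the hex commit ids."""
    ids, parent = [], None
    for entries, ident, message in history:
        parents = [parent] if parent is not None else []
        parent = commit_id(entries, parents, ident, message, algo)
        ids.append(parent.hex())
    return ids


# Hypothetical two-commit history, used only for illustration.
history = [
    ([(b"100644", b"README", b"hello\n")],
     b"A U Thor <author@example.com> 1506470000 -0700", b"initial\n"),
    ([(b"100644", b"README", b"hello\n"),
      (b"100644", b"main.c", b"int main(void) { return 0; }\n")],
     b"A U Thor <author@example.com> 1506473600 -0700", b"add main.c\n"),
]

if __name__ == "__main__":
    for old, new in zip(hash_history(history, "sha1"),
                        hash_history(history, NEWHASH)):
        print(old, "->", new)
```

Because nothing in hash_history depends on who runs it or when, a third
party counter-hashing right after publication and the maintainer
migrating months later arrive at the same ids. Changing a timestamp (or
any other byte) in the inputs changes every id from that commit onward.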