Junio C Hamano wrote:

> Continuing this thought process, I do not see a good way to allow us
> to wean ourselves off of the old hash, unless we _break_ the pack
> stream format so that each object in the pack carries not just the
> data but also the hash algorithm to be used to _name_ it, so that
> new objects will never be referred to using the old hash.

Taking a step further: I don't think that any backward-compatible
format change would address the security concerns with sufficiently
old hashing algorithms.

As long as my favorite repository is allowed to contain objects
identified by SHA-1, my adversary can exploit a SHA-1 collision using
signed tags referring (possibly indirectly) to backdated objects.  The
Git object format does not include a proof of commit date, so I cannot
guarantee "Only old objects are named by SHA-1".

There is a way to get a backward-compatible *user experience* without
the format change being backward-compatible, though.

Name all objects in the repository using FuturisticHash.  Also store
enough information to recover the old hashes, either in objects as a
new field or in a side table.

If the old hash is broken, signatures using the old hash cannot be
trusted.  An adversary could generate a collision to retroactively
change the meaning of an existing signature.

To maintain the meaning of old signatures, someone has to record the
new names of all involved objects, either before the state of the art
in breaking the old hash advances far enough or using a copy of the
repository from before the state of the art had advanced --- in effect
you need new signatures to maintain the meaning of old signatures.

This could happen as part of the process of updating a repository to
use a new hash.  E.g.

	object a787a87b98a7s98798a798b7a98b798a7b98a7b987a9b87a9b87a98b79a87b98a7b98a7b987a987987a878a78a
	sha1tag object 04b871796dc0420f8e7561a895b52484b701d51a
	 type commit
	 tag signedtag
	 tagger C O Mitter <committer@xxxxxxxxxxx> 1465981006 +0000

	 signed tag

	 signed tag message body
	 -----BEGIN PGP SIGNATURE-----
	 Version: GnuPG v1

	 iQEcBAABAgAGBQJXYRhOAAoJEGEJLoW3InGJklkIAIcnhL7RwEb/+QeX9enkXhxn
	 rxfdqrvWd1K80sl2TOt8Bg/NYwrUBw/RWJ+sg/hhHp4WtvE1HDGHlkEz3y11Lkuh
	 8tSxS3qKTxXUGozyPGuE90sJfExhZlW4knIQ1wt/yWqM+33E9pN4hzPqLwyrdods
	 q8FWEqPPUbSJXoMbRPw04S5jrLtZSsUWbRYjmJCHzlhSfFWW4eFd37uquIaLUBS0
	 rkC3Jrx7420jkIpgFcTI2s60uhSQLzgcCwdA2ukSYIRnjg/zDkj8+3h/GaROJ72x
	 lZyI6HWixKJkWw8lE9aAOD9TmTW9sFJwcVAzmAuFX2kUreDUKMZduGcoRYGpD7E=
	 =jpXa
	 -----END PGP SIGNATURE-----

	-----BEGIN PGP SIGNATURE----
	...
	-----END PGP SIGNATURE

This example uses a signature to attest that the mapping

	04b871796dc0420f8e7561a895b52484b701d51a
	-> a787a87b98a7s98798a798b7a98b798a7b98a7b987a9b87a9b87a98b79a87b98a7b98a7b987a987987a878a78a

is correct.  A more straightforward approach would be for the
conversion process to produce an out-of-band signed mapping list to
make the sha1tag usable without such a signature.

Summary:

 * Git's properties depend on using a single hash function throughout
   a repository.  I don't think we should change that.

 * A safe and mostly painless migration to a stronger hash function is
   possible using a signed assertion (for example generated by the
   conversion process) of the mapping from old object names to new
   object names.

 * Dealing with multiple such signed mappings (for example due to
   separate conversion of repositories based on linux.git) is left as
   an exercise to the reader.

Hope that helps,
Jonathan
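
To make the "mapping from old object names to new object names" idea
concrete, here is a rough Python sketch.  It is not part of any Git
implementation.  It assumes SHA-256 as a stand-in for FuturisticHash,
handles only blobs (whose content does not embed other object names, so
the same bytes can be hashed under both algorithms), and the helper
names object_name, build_mapping and serialize_mapping are made up for
illustration.  The serialized bytes are what a conversion process might
sign out of band; the signing step itself is left out.

```python
import hashlib


def object_name(obj_type: str, body: bytes, algo: str) -> str:
    """Name a Git-style object: hash of "<type> <size>" + NUL + body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.new(algo, header + body).hexdigest()


def build_mapping(blobs, new_algo="sha256"):
    """Return {old SHA-1 name: new name} for an iterable of blob bodies.

    new_algo is an assumption standing in for FuturisticHash.
    """
    return {
        object_name("blob", body, "sha1"): object_name("blob", body, new_algo)
        for body in blobs
    }


def serialize_mapping(mapping) -> bytes:
    """Deterministic text form of the mapping: one "old new" pair per line,
    sorted by old name, so that the same mapping always yields the same
    bytes to sign."""
    return "".join(
        f"{old} {new}\n" for old, new in sorted(mapping.items())
    ).encode()


if __name__ == "__main__":
    mapping = build_mapping([b"", b"hello\n"])
    print(serialize_mapping(mapping).decode(), end="")
```

Commits, trees and tags would need their embedded references rewritten
to the new names before being hashed, so a real conversion has to walk
the object graph bottom-up; the sketch only shows the shape of the
resulting table.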