Dear List Members, Git Developers,

I would like to discuss an old topic from 2006. I understand it was already discussed. The only reason I'm sending this e-mail is to talk about a possible solution which didn't show up on this list before.

I think we all understand that SHA-1 is broken. It still works perfectly as a storage key, but it's not cryptographically secure anymore. Git is not moving away from SHA-1 because it would break too many projects, and cryptographic security is not needed by git if you have your own repository.

However, I would like to show some big problems caused by SHA-1:

- Git signed tags and signed commits are cryptographically insecure; they're useless at the moment.
- Git Torrent (https://github.com/cjb/GitTorrent) is also cryptographically broken, although it would be an awesome experiment.
- Linus said: "You only need to know the SHA-1 of the top of your tree, and if you know that, you can trust your tree." That's not true anymore. You have to trust your computer, your servers and your git provider, so that no-one can maliciously modify your data.

I understand that git is perfect for a workflow where you have your very own repository and you double-check any commits or diffs you accept into it. But that's not everybody's workflow. For example: if I want to blindly trust my colleague, I could just include all commits he signed without review. Currently I can't do that. There are workarounds of course: signing the e-mail he sends me, or signing the entire git repository's tarball, etc. But that's not the right way to do things.

As a final thought on this, I would like to say: Git is a great tool, but it can be so much better with a safe hash.

I would like to propose a solution for changing git's hash algorithm. It would be a breaking change, but I think it can be done pretty painlessly. (If you read the discussion back in 2006, the problems of moving are clear.)
In git, every piece of data has to have one and only one key, so a hybrid hash is a no-go. That means changing the hash algorithm involves re-hashing all the data in a git repository, but it's not that bad. On a git clone, we actually re-hash everything to check integrity. Changing all the keys shouldn't be worse than that.

But - and that's the main idea I'm writing about here - changing the storage keys does not mean you should throw your old hashes away. If you change the git data structure so that it can keep multiple hashes for the same "link" in each object (trees, commits, etc.), you can keep the old ones right next to the new one. If you want to look up the referenced object, you must use the newest hash, which is the key. But if you want to verify some old hash, it's still possible! Just look up the objects by the new key, remove all the newer-generation keys, and verify the old hash on the result.

A storage structure like this would allow great flexibility:

- You can change your hash algorithm in the future. If SHA-256 becomes broken, it's not a problem. Just re-hash the storage, and append the new hashes to the git objects.
- You can still verify your old hashes after a hash change: removing the new hashes from the objects before hashing should give you back the old objects, thus giving you the same hash as before.
- That makes it possible for signed tags and commits to keep their validity after a hash change! With a clever enough new format, you can even keep the validity of current hashes and signatures. (To be able to do that, you need to be able to reconstruct the current format from the new format.)

Moving git forward to a format like this would solve the weak-key problem in git forever. You would be able to configure your key algorithm on a per-repository basis; you - and git - can do the daily work on the newest hashes, while still carrying the old hashes and signatures, in case you ever want to verify them.
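
To make the idea a bit more concrete, here is a minimal Python sketch of the "multiple hashes per link" scheme described above. It is only an illustration under my own assumptions: the names `GENERATIONS`, `Obj`, `serialize`, `digest` and `add_generation` are hypothetical, and the toy serialization is not git's real object format.

```python
import hashlib
from dataclasses import dataclass, field

# Hash generations, oldest first; the newest one is the storage key.
# (Hypothetical list for this sketch, not a git setting.)
GENERATIONS = ["sha1", "sha256"]


@dataclass
class Obj:
    """Toy object: raw payload plus links to other objects.

    Each link is a dict mapping algorithm name -> hex digest, so one
    link can carry several generations of hashes side by side.
    """
    payload: bytes
    links: list = field(default_factory=list)


def serialize(obj, algo):
    """Serialize obj as it looked when `algo` was the newest generation.

    Link hashes of newer generations are left out, which is the
    "remove all the newer generation keys" step.
    """
    keep = GENERATIONS[: GENERATIONS.index(algo) + 1]
    lines = [" ".join(f"{a}:{link[a]}" for a in keep) for link in obj.links]
    return "\n".join(lines).encode() + b"\0" + obj.payload


def digest(obj, algo):
    """Hash of obj under `algo`, computed over its `algo`-era form."""
    return hashlib.new(algo, serialize(obj, algo)).hexdigest()


def add_generation(parent, children, algo):
    """Append the `algo` hash of each child to the matching link."""
    for link, child in zip(parent.links, children):
        link[algo] = digest(child, algo)


# A blob and a tree pointing at it, created in the SHA-1 era.
blob = Obj(b"hello\n")
tree = Obj(b"tree", [{"sha1": digest(blob, "sha1")}])
old_tree_id = digest(tree, "sha1")  # e.g. covered by an old signature

# Migration: keep the SHA-1 links and add SHA-256 ones next to them.
add_generation(tree, [blob], "sha256")
new_tree_key = digest(tree, "sha256")  # new storage key

# The old hash is still reproducible from the migrated object.
assert digest(tree, "sha1") == old_tree_id
```

In this sketch the migration has to run bottom-up (children before parents), because a parent's new hash covers the new hashes of its links. Whether the real commit and tree formats can be extended this way while staying reversible is exactly the open question.
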
Such a format would allow repositories to change hashes gracefully when they need to, and the only compatibility limitation is that you must use a new enough git to understand the new storage format.

What are your thoughts on this approach? Will git ever reach a release with an exchangeable hash algorithm? Or should someone look for alternatives if there's a need for cryptographic security?

Thank you for your time reading this.

References:

SHA-256 discussion in 2006:
http://www.gelato.unsw.edu.au/archives/git/0608/26446.html

Discussion about git signatures in 2014:
https://www.mail-archive.com/git%40vger.kernel.org/msg61087.html

Linus's talk on git:
https://www.youtube.com/watch?v=4XpnKHJAok8&t=56m20s

Kind regards,
Zsolt Herczeg