Git and SHA-1 security (again)

Dear List Members, Git Developers,

I would like to discuss an old topic from 2006. I understand it was
already discussed. The only reason I'm sending this e-mail is to talk
about a possible solution which hasn't shown up on this list before.

I think we all understand that SHA-1 is broken. It still works
perfectly as a storage key, but it's not cryptographically secure
anymore. Git is not moving away from SHA-1 because it would break too
many projects, and cryptographic security is not needed by git if you
have your own repository.

However, I would like to show some big problems caused by SHA-1:
 - Git signed tags and signed commits are cryptographically insecure;
they're useless at the moment.
 - Git Torrent (https://github.com/cjb/GitTorrent) is also
cryptographically broken, although it would be an awesome experiment.
 - Linus said: "You only need to know the SHA-1 of the top of your
tree, and if you know that, you can trust your tree." That's not true
anymore. You have to trust that your computer, your servers and your
git provider won't let anyone maliciously modify your data.

I understand that git is perfect for a workflow where you have your
very own repository and you double-check any commits or diffs you
accept into it. But that's not everybody's workflow. For example: if
I want to blindly trust my colleague, I could just include all commits
he signed without review. Currently I can't do that. There are
workarounds of course: signing the e-mail he sends me, or signing the
entire git repository's tarball, etc... But that's not the right way
to do things.

As a final thought on this, I would like to say: Git is a great tool,
but it can be so much better with a safe hash.


I would like to propose a solution for changing git's hash algorithm:
It would be a breaking change, but I think it can be done fairly
painlessly. (If you read the discussion back in 2006, the problems of
moving are clear.)

In git, every piece of data has to have one and only one key, so a
hybrid hash is a no-go. That means changing the hash algo involves
re-hashing all the data in a git repository, but it's not that bad. On
a git clone, we actually re-hash everything to check integrity.
Changing all the keys shouldn't be worse than that.
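
To make the re-hashing step concrete, here is a minimal Python sketch
of how an object's key is derived today: SHA-1 over a "<type> <size>\0"
header followed by the content. The object_id() helper is a name I made
up for illustration, and the SHA-256 call only shows what a re-hash
with a different algorithm would compute; it is an assumption, not an
existing git format.

```python
import hashlib

def object_id(obj_type: str, content: bytes, algo: str = "sha1") -> str:
    """Hash a git-style object: "<type> <size>\\0" header plus content."""
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.new(algo, header + content).hexdigest()

blob = b"hello world\n"
# Today's key for this blob (what `git hash-object` computes).
print(object_id("blob", blob))
# Hypothetical key after a re-hash with SHA-256.
print(object_id("blob", blob, "sha256"))
```

The work per object is one pass over its bytes, which is the same kind
of work an integrity check already does.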

But - and that's the main idea I'm writing about here - changing the
storage keys does not mean you should throw your old hashes away. If
you change the git data structure so that it can keep multiple hashes
for the same "link" in each object (trees, commits, etc.), you can keep
the old ones right next to the new one. If you want to look up the
referenced object, you must use the newest hash - which is the key.
But if you want to verify some old hash, it's still possible! Just
look up the objects by the new key, remove all the newer generation
keys, and verify the old hash on that.
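
The lookup-then-strip idea can be sketched roughly as follows. This is
a toy model, not git's real object format: the one-line-per-link
serialization, the ALGOS generation list and the verify() helper are
all assumptions I introduce for illustration.

```python
import hashlib

# Hash generations, oldest first. This ordering is a hypothetical convention.
ALGOS = ["sha1", "sha256"]

def digest(algo: str, data: bytes) -> str:
    return hashlib.new(algo, data).hexdigest()

def serialize(tree: dict, generations: int) -> bytes:
    """Serialize a toy tree, keeping only the first `generations` hashes per link."""
    lines = [name + " " + " ".join(hashes[:generations])
             for name, hashes in sorted(tree.items())]
    return "\n".join(lines).encode()

def verify(store: dict, newest_key: str, generation: int, expected: str) -> bool:
    """Look up by the newest key, strip newer hashes, re-check an older hash."""
    tree = store[newest_key]
    algo = ALGOS[generation - 1]
    return digest(algo, serialize(tree, generation)) == expected

blob = b"hello\n"
blob_ids = [digest(algo, blob) for algo in ALGOS]

# The tree as it exists today: one SHA-1 per link.
old_tree = {"README": blob_ids[:1]}
old_key = digest("sha1", serialize(old_tree, 1))

# The same tree after migration: both hashes per link, keyed by SHA-256.
new_tree = {"README": blob_ids}
new_key = digest("sha256", serialize(new_tree, 2))
store = {new_key: new_tree}

# The store only knows the new key, yet the old SHA-1 still checks out.
print(verify(store, new_key, 1, old_key))  # prints True
```

The important property is that serializing with only the first
generation reproduces the original bytes exactly, so the old hash (and
anything signed over it) is unchanged.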

A storage structure like this would allow great flexibility:
 - You can change your hash algorithm in the future. If SHA-256
becomes broken, it's not a problem. Just re-hash the storage, and
append the new hashes to the git objects.
 - You can still verify your old hashes after a hash change - removing
the new hashes from the objects before hashing should give you back
the old objects, thus giving you the same hash as before.
 - That makes it possible for signed tags and commits to keep their
validity after a hash change! With a clever enough new format, you can
even keep the validity of current hashes and signatures. (To be able
to do that, you must be able to reconstruct the current format from
the new format.)
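
As a rough illustration of the points above, the following Python
sketch re-hashes a tiny toy store twice (SHA-1 to SHA-256, then to
SHA3-256) and checks that the older keys can still be recomputed from
the newest objects. The object layout and the rehash() helper are
hypothetical; real git objects are more involved, and a real migration
would memoize visited objects instead of revisiting them.

```python
import hashlib

def digest(algo: str, data: bytes) -> str:
    return hashlib.new(algo, data).hexdigest()

def serialize(obj, generations: int) -> bytes:
    """Toy format: blobs are raw bytes, trees map name -> list of hashes."""
    if isinstance(obj, bytes):
        return b"blob\0" + obj
    lines = [f"{name} {' '.join(hashes[:generations])}"
             for name, hashes in sorted(obj.items())]
    return b"tree\0" + "\n".join(lines).encode()

def rehash(store: dict, root: str, old_generations: int, new_algo: str):
    """Return (new_store, new_root) with one more hash appended to every link."""
    new_store = {}
    generations = old_generations + 1

    def visit(key: str) -> str:
        obj = store[key]
        if isinstance(obj, dict):
            # Children first: each link gains the child's new key.
            obj = {name: hashes + [visit(hashes[-1])]
                   for name, hashes in obj.items()}
        new_key = digest(new_algo, serialize(obj, generations))
        new_store[new_key] = obj
        return new_key

    return new_store, visit(root)

# Generation 1: a SHA-1 keyed store with one blob and one tree.
blob = b"hello\n"
blob_key = digest("sha1", serialize(blob, 1))
tree = {"README": [blob_key]}
tree_key = digest("sha1", serialize(tree, 1))
store1 = {blob_key: blob, tree_key: tree}

store2, root2 = rehash(store1, tree_key, 1, "sha256")
store3, root3 = rehash(store2, root2, 2, "sha3_256")

# Both older root keys can be recomputed from the newest object.
newest = store3[root3]
print(digest("sha1", serialize(newest, 1)) == tree_key)
print(digest("sha256", serialize(newest, 2)) == root2)
```

Each migration only appends to the list of hashes, so stripping back to
any earlier generation gives the bytes that generation originally
hashed.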

Moving git forward to a format like this would solve the weak-key
problem in git forever. You would be able to configure your key algo
on a per-repository basis; you - and git - can do the daily work on
the newest hashes, while still carrying the old hashes and signatures,
in case you ever want to verify them. That would allow repositories to
gracefully change hashes in case they need to, and the only
compatibility limitation is that you must use a new enough git to
understand the new storage format.

What are your thoughts on this approach? Will git ever reach a release
with an exchangeable hash algorithm? Or should someone look for
alternatives if there's a need for cryptographic security?

Thank you for your time reading this.

References:
SHA-256 discussion in 2006:
http://www.gelato.unsw.edu.au/archives/git/0608/26446.html
Discussion about git signatures in 2014:
https://www.mail-archive.com/git%40vger.kernel.org/msg61087.html
Linus's talk on git:
https://www.youtube.com/watch?v=4XpnKHJAok8&t=56m20s

Kind regards,
Zsolt Herczeg