On Mon, Jun 11, 2018 at 4:27 PM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
> > And no, I'm not a cryptographer. But honestly, length extension
> > attacks were how both md5 and sha1 were broken in practice, so I'm
> > just going "why would we go with a crypto choice that has that known
> > weakness? That's just crazy".
>
> What do you think about Johannes's summary of this being a non-issue for
> Git in
> https://public-inbox.org/git/alpine.DEB.2.21.1.1706151122180.4200@virtualbox/
> ?

I agree that the fact that git internal data is structured and all
meaningful (and doesn't really have ignored state) makes it *much*
harder to attack the basic git objects, since you not only have to
generate a good hash, the end result has to also *parse* and there is
not really any hidden non-parsed data that you can use to hide the
attack.

And *if* you are using git for source code, the same is pretty much
true even for the blob objects - an attacking object will stand out
like a sore thumb in "diff" etc.

So I don't disagree with Johannes in that sense: I think git does
fundamentally tend to have some extra validation in place, and there's
a reason why the examples for both the md5 and the sha1 attack were
pdf files.

That said, even if git internal ("metadata") objects like trees and
commits tend to not have opaque parts to them and are thus pretty hard
to attack, the blob objects are still an attack vector for projects
that use git for non-source-code (and even source projects do embed
binary files - including pdf files - even though they might not be "as
interesting" to attack). So you do want to protect those too.

And hey, protecting the metadata objects is good just to protect
against annoyances. Sure, you should always sanity check the object at
receive time anyway, but even so, if somebody is able to generate a
blob object that hashes to the same hash as a metadata object (ie tree
or commit), that really could be pretty damn annoying.

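To make the point about structured objects concrete, here is a minimal
sketch of how a git object name is derived under SHA-1: the hash covers
a "<type> <size>\0" header followed by the payload. The helper name
git_object_id is made up for this illustration and is not part of git;
the sketch only mirrors the documented object format, not git's actual
code.

```python
import hashlib

def git_object_id(obj_type: str, payload: bytes) -> str:
    """Hypothetical helper: SHA-1 over '<type> <size>\\0' + payload."""
    header = f"{obj_type} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()

# The same (empty) payload gets a different name depending on its type,
# because the type and length are part of what is hashed.
print(git_object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_object_id("tree", b""))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Since the type and size go into the hash, a blob that is meant to
collide with a tree or commit has to collide across different headers,
and the colliding metadata object still has to parse.
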
And the whole "intermediate hashed state is same size as final hash
state" just _fundamentally_ means that if you find a weakness in the
hash, you can now attack that weakness without having to worry about
the attack being fundamentally more expensive.

That's essentially what SHAttered relied on. It didn't rely on a
secret and a hash and length extension, but it *did* rely on the same
mechanism that a length extension attack relies on, where you can
basically attack the state in the middle with no extra cost.

Maybe some people don't consider it a length extension attack for that
reason, but it boils down to much the same basic situation where you
can attack the internal hash state and cause a state collision. And
you can try to find the patterns that then cause that state collision
when you've found a weakness in the hash.

With SHA3 or k12, you can obviously _also_ try to attack the hash
state and cause a collision, but because the intermediate state is
much bigger than the final hash, you're just making things *way*
harder for yourself if you try that.

              Linus
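
A toy sketch of the state-collision argument in the message above. It
is not SHA-1, SHA3 or k12: compress, narrow_hash and wide_hash are
made-up constructions with a deliberately tiny 2-byte output so that a
collision can be brute-forced in well under a second, and they skip the
length padding that real hashes add. narrow_hash outputs its whole
internal state, the way md5 and sha1 do; wide_hash keeps a 32-byte
internal state and only outputs 2 bytes of it.

```python
import hashlib
from itertools import count

STATE_BYTES = 2  # deliberately tiny so a collision is cheap to find
IV = b"\x00" * STATE_BYTES

def compress(state, block):
    """Toy compression function (hypothetical): truncated SHA-256."""
    return hashlib.sha256(state + block).digest()[:STATE_BYTES]

def narrow_hash(blocks):
    """The output IS the internal state (no padding, toy only)."""
    state = IV
    for block in blocks:
        state = compress(state, block)
    return state

def wide_hash(blocks):
    """Internal state is 32 bytes; only STATE_BYTES of it are output."""
    state = b"\x00" * 32
    for block in blocks:
        state = hashlib.sha256(state + block).digest()
    return state[:STATE_BYTES]

def find_collision(hash_fn):
    """Brute-force two distinct one-block messages with equal output."""
    seen = {}
    for i in count():
        block = i.to_bytes(8, "big")
        digest = hash_fn([block])
        if digest in seen:
            return seen[digest], block
        seen[digest] = block

suffix = [b"any", b"common", b"suffix"]

# Narrow state: an output collision is a state collision, so it
# survives any common suffix at no extra cost.
a, b = find_collision(narrow_hash)
print(narrow_hash([a] + suffix) == narrow_hash([b] + suffix))  # True

# Wide state: the 2 output bytes collide, but the hidden 30 bytes of
# state still differ, so the suffixed hashes are not forced to match.
c, d = find_collision(wide_hash)
print(wide_hash([c] + suffix) == wide_hash([d] + suffix))
```

The first comparison is always true by construction. The second is not
forced either way and will almost certainly come out false: a collision
that carries through would have to hit the full 32-byte state, which is
the "way harder" case described above.
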