On Tue, Sep 18, 2018 at 8:18 AM Joan Daemen <jda@xxxxxxxxxxx> wrote:
>
> 3) The relatively large state in the sponge construction increases the
> generic strength against attacks when the input contains redundancy or
> has a certain form. For instance, if the input is restricted to be text
> in ASCII (such as source code), then the collision-resistance grows
> higher than the nominal 2^{c/2}. Such an effect does not exist with
> narrow-pipe Merkle-Damgård. (This may be what Linus had intuitively in
> mind.)

Answering just this part:

No, what I had in mind was literally just exactly the kind of attack that broke SHA1 - attacking the internal state vector directly, and not paying any penalty for it, because the state size is the same as the final hash size.

The length extension attack is just the simplest and most trivial version of that kind of attack - because the internal state vector *is* the result, and you just continue using it.

But that trivial length extension thing is not the real problem, it's just the absolutely simplest symptom of the real problem.

I think that the model where the internal state of the hash is the same width as the final result is simply broken. It was what broke SHA1, and that problem is shared with SHA2.

"Length extension" is just the simplest way to say "broken by design", imho. Because the length extension attack is just the most trivial attack, but it isn't the fundamental problem. It was just the first and the cheapest attack found, but it was also the most special-cased and least interesting.

You need to have a very special case (with that secret at the beginning etc) to make the pure length extension attack interesting. And git has no secrets, so in that sense "length extension" by itself is totally immaterial. But the basic problem of internal hash size obviously wasn't.

So I would say that length extension is a direct result of the _real_ problem, which is that the hash exposes _all_ of the internal data.
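[Editor's note: a minimal sketch of the point above. This is a toy narrow-pipe Merkle-Damgård hash, not any real algorithm: the compression function just borrows SHA-256 over (state || block) for illustration, and the padding is simplified. It shows that when the published digest *is* the full final state, anyone can resume hashing from it, and that publishing only part of a wider state (the SHA-512/256-style idea mentioned below) blocks that resumption.]

```python
import hashlib

BLOCK = 64          # block size in bytes
STATE = 32          # internal state size in bytes (narrow pipe: == digest size)
IV = b"\x00" * STATE

def compress(state, block):
    # Toy compression function; a real design uses a dedicated primitive.
    return hashlib.sha256(state + block).digest()

def md_pad(msg_len):
    # Simplified MD strengthening: 0x80, zero fill, 8-byte big-endian length.
    zeros = (BLOCK - (msg_len + 9) % BLOCK) % BLOCK
    return b"\x80" + b"\x00" * zeros + msg_len.to_bytes(8, "big")

def absorb(state, data):
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_hash(msg):
    # Narrow pipe: the whole final state is handed out as the digest.
    return absorb(IV, msg + md_pad(len(msg)))

# Length extension: knowing only toy_hash(secret) and len(secret), forge
# the hash of secret || padding || suffix without ever seeing the secret.
secret = b"known only to the victim"          # hypothetical example data
digest = toy_hash(secret)

suffix = b", plus attacker data"
glue = md_pad(len(secret))
total = len(secret) + len(glue) + len(suffix)

forged = absorb(digest, suffix + md_pad(total))  # resume from exposed state
real = toy_hash(secret + glue + suffix)
assert forged == real   # extension succeeds: nothing was hidden

# Truncated / wide-pipe variant: publish only half of the final state.
# An attacker holding just this digest cannot resume, because the other
# 128 bits of internal state stay hidden.
def toy_hash_truncated(msg):
    return toy_hash(msg)[:16]   # 128-bit digest from a 256-bit state
```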
That is what makes length extension possible - because you can just continue from a known state, and there is absolutely nothing hidden - and yes, that's a really easy special case where you don't even need to actually break the hash at all.

But I argue that it's _also_ one big part of what made SHAttered practical, and I think the underlying problem is exactly the same. When the internal state is the same size as the hash, you can attack the internal state itself for basically the same cost as attacking the whole hash. So you can pick and choose the weakest point.

Which is basically exactly what SHAttered did. No, it wasn't the trivial "just add to the end", but it used the exact same underlying weakness as one part of the attack.

*This* is why I dislike SHA2. It has basically the exact same basic weakness that we already know SHA1 fell for. The hashing details are different, and hopefully that means that there aren't the same kinds of patterns that can be generated to do the "attack the internal hash state" part, but I don't understand why people seem to ignore that other fundamental issue.

Something like SHA-512/256 would have been better, but I think almost nobody does that in hardware, and hardware support was one of the big advantages of plain SHA2.

The main reason I think SHA2 is acceptable is simply that 256 bits is a lot. So even if somebody comes up with a shortcut that weakens it by tens of bits, nobody really cares. Plus I'm obviously not a cryptographer, so I didn't feel like I was going to fight it a lot.

But yes, I'd have probably gone with any of the other alternatives, because I think it's a bit silly that we're switching hashes to another hash that has (at least in part) the *exact* same issue as the one people call broken.

(And yes, the hashing details are different, so it's "exactly the same" only wrt that internal state part - not the bit-pattern finding part that made the attack on the internal state much cheaper.
Real cryptographers obviously found the "figure out the weakness of the hashing" part to be the more interesting and novel one, over the trivial internal hash size part).

That said.. The real reason I think SHA2 is the right choice is simply that there needs to be a decision, and none of the choices were *wrong*. Sometimes just the _act_ of making a decision is more important than _what_ the decision is.

And hey, it is also likely that the reason _I_ get hung up on just the size of the internal state is exactly because I am _not_ a cryptographer: that kind of high-level stuff is the part I understand. When you start talking about why the exact rules of Merkle–Damgård constructions work, my eyes just glaze over.

So I'm probably - no, certainly - myopic and looking at only one part of the issue to begin with.

The end result is that I argued for more bits in the internal state (and apparently "wide vs narrow pipe" is the technical term), and I would have seen parallel algorithms as a bonus for the large-file case. None of which argued for SHA2. But see above on why I think SHA2 is, if not *the* right choice, at least *a* right choice.

            Linus