On Fri, Sep 27, 2019 at 2:58 AM Pascal Van Leeuwen
<pvanleeuwen@xxxxxxxxxxxxxx> wrote:
>
> > I'd want to see wireguard in an end-to-end situation from the very
> > client hardware. So laptops, phones, desktops. Not the untrusted (to
> > me) hw in between.
> >
> I don't see why the crypto HW would deserve any less trust than, say,
> the CPU itself. I would say CPU's don't deserve that trust at the moment.

It's not the crypto engine that is part of the untrusted hardware.
It's the box itself, and the manufacturer, and you having to trust
that the manufacturer didn't set up some magic knocking sequence to
disable the encryption.

Maybe the company that makes them is trying to do a good job. But
maybe they are based in a country that has laws that require
backdoors.

Say, France. There's a long long history of that kind of thing.

It's all to "fight terrorism", but hey, a little industrial espionage
is good too, isn't it? So let's just disable GSM encryption based on
geographic locale and local regulation, shall we.

Yeah, yeah, GSM encryption wasn't all that strong to begin with, but
it was apparently strong enough that France didn't want it.

So tell me again why I should trust that box that I have no control
over?

> Well, that's the general idea of abstraction. It also allows for
> swapping in any other cipher with minimal effort just _because_ the
> details were hidden from the application. So it may cost you some
> effort initially, but it may save you effort later.

We clearly disagree on the utility of crypto agility. You point to
things like ipsec as an argument for it.

And I point to ipsec as an argument *against* that horror. It's a
bloated, inefficient, horribly complex mess. And all the "agility" is
very much part of it.

I also point to GSM as a reason against "agility". It has caused way
more security problems than it has ever solved. The "agility" is often
a way to turn off (or tune down) the encryption, not as a way to say
"ok, we can improve it later".
That "we can improve it later" is a bedtime story. It's not how it
gets used. Particularly as the weaknesses are often not primarily in
the crypto algorithm itself, but in how it gets used or other session
details.

When you actually want to *improve* security, you throw the old code
away, and start a new protocol entirely. Eg SSL -> TLS.

So cryptographic agility is way oversold, and often people are
actively lying about why they want it. And the people who aren't lying
are ignoring the costs.

One of the reasons _I_ like wireguard is that it just went for simple
and secure. No BS.

And you say

> Especially since all crypto it uses comes from a single
> source (DJB), which is frowned upon in the industry.

I'm perhaps not a fan of DJB in all respects, but there's no question
that he's at least competent.

The "industry practice" of having committees influenced by who knows
what isn't all that much better. Do you want to talk about NSA
elliptic curve constant choices?

Anyway, on the costs:

> >  - dynamically allocate buffers at "init time"
>
> Why is that so "wrong"? It sure beats doing allocations on the hot path.

It's wrong not because the allocation is costly (you do that only
once), but because the dynamic allocation means that you can't embed
stuff in your own native data structures as a user.

So now accessing those things is no longer dense in the cache.

And it's the cache that matters for a synchronous CPU algorithm. You
don't want the keys and state to be in some other location when you
already have your data structures for the stream that could just have
them right there with the other data.

> And you don't want to have it on the stack initially and then have
> to _copy_ it to some DMA-able location that you allocate on the fly
> on the hot path if you _do_ want HW acceleration.

Actually, that's *exactly* what you want.
You want keys etc to be in regular memory in a location that is
convenient to the user, and then only if the hardware has issues do
you say "ok, copy the key to the hardware". Because quite often the
hardware will have very special key caches that aren't even available
to the CPU, because they are on some hw-private buffers.

Yes, you want to have a "key identity" model so that the hardware
doesn't have to reload it all the time, but that's an invalidation
protocol, not a "put the keys or nonces in special places".

              Linus
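
[Editor's illustration, not part of the original message.] The last two
points (embedding cipher state in the caller's own structure, and
treating a hardware key cache as an invalidation problem) can be
sketched in C roughly as below. This is a minimal sketch under stated
assumptions: every name here (cipher_state, stream_indirect,
stream_embedded, hw_key_slot, stream_set_key, hw_use_key) is
hypothetical and is not a kernel or WireGuard API, and the "hardware"
is only a struct standing in for a device-private key cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_LEN 32

/* Hypothetical cipher state: plain data that can live anywhere. */
struct cipher_state {
    uint8_t  key[KEY_LEN];
    uint64_t nonce;
    uint64_t key_id;    /* "key identity": changes whenever the key changes */
};

/* Layout A: state allocated separately, so every use chases a pointer
 * to memory that may sit far away from the stream's other fields. */
struct stream_indirect {
    uint64_t bytes_sent;
    struct cipher_state *state;
};

/* Layout B: state embedded, so it shares cache lines with the stream's
 * other fields. */
struct stream_embedded {
    uint64_t bytes_sent;
    struct cipher_state state;
};

/* Hypothetical stand-in for a hardware-private key cache slot. */
struct hw_key_slot {
    uint8_t  key[KEY_LEN];
    uint64_t loaded_key_id;
    int      valid;
    unsigned loads;     /* counts copies into the slot, for illustration */
};

static uint64_t next_key_id = 1;

/* Set a key in ordinary memory. Handing out a fresh identity is what
 * makes any previously cached hardware copy stale. */
static void stream_set_key(struct stream_embedded *s,
                           const uint8_t key[KEY_LEN])
{
    assert(s != NULL);
    memcpy(s->state.key, key, KEY_LEN);
    s->state.nonce = 0;
    s->state.key_id = next_key_id++;
}

/* Copy the key to the "hardware" only when its cached identity does
 * not match; otherwise reuse what is already loaded. */
static void hw_use_key(struct hw_key_slot *hw, const struct cipher_state *st)
{
    assert(hw != NULL && st != NULL);
    if (hw->valid && hw->loaded_key_id == st->key_id)
        return;
    memcpy(hw->key, st->key, KEY_LEN);
    hw->loaded_key_id = st->key_id;
    hw->valid = 1;
    hw->loads++;
}
```

In this sketch a caller that sets a key once and calls hw_use_key()
repeatedly triggers a single copy into the slot; only a later
stream_set_key() forces a reload. The key itself never has to be
allocated in a special location up front.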