On Mon, Jan 8, 2018 at 3:44 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Mon, Jan 8, 2018 at 1:09 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote: >> On Sat, Jan 6, 2018 at 5:20 PM, Linus Torvalds >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: >>> On Sat, Jan 6, 2018 at 3:31 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote: >>>> >>>> I assume if we put this in uaccess_begin() we also need audit for >>>> paths that use access_ok but don't do on to call uaccess_begin()? A >>>> quick glance shows a few places where we are open coding the stac(). >>>> Perhaps land the lfence in stac() directly? >>> >>> Yeah, we should put it in uaccess_begin(), and in the actual user >>> accessor helpers that do stac. Some of them probably should be changed >>> to use uaccess_begin() instead while at it. >>> >>> One question for the CPU people: do we actually care and need to do >>> this for things that might *write* to something? The speculative write >>> obviously is killed, but does it perhaps bring in a cacheline even >>> when killed? >> >> As far as I understand a write could trigger a request-for-ownership >> read for the target cacheline. > > Oh, absolutely. > > I just wonder at what point that happens. > > Honestly, trying to get exclusive access to a cacheline can be _very_ > expensive (not just for the local thread), so I would actually expect > that doing so for speculative writes is actually bad for performance. > > That's doubly true because - unlike reads - there is no critical > latency issue, so trying to get the cache access started as early as > possible simply isn't all that important. > > So I suspect that a write won't actually try to allocate the cacheline > until the write has actually retired. > > End result: writes - unlike reads - *probably* will not speculatively > perturb the cache with speculative write addresses. > >> Even though writes can trigger reads, as far as I can see the write >> needs to be dependent on the first out-of-bounds read > > Yeah. A write on its own wouldn't matter, even if it were to perturb > the cache state, because the address already comes from user space, so > there's no new information in the cache perturbation for the attacker. > > But that all implies that we shouldn't need the lfence for the > "put_user()" case, only for the get_user() (where the value we read > would then perhaps be used to do another access). > > So we want to add the lfence (or "and") to get_user(), but not > necessarily put_user(). Yes, perhaps __uaccess_begin_get() and __uaccess_begin_put() to keep things separate? > Agreed? I've been thinking the "and" is only suitable for the array bounds check, for get_user() we're trying to block speculation past access_ok() at which point we can only do the lfence?