Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Mon, 8 Jan 2018 15:44:55 -0800

On Mon, Jan 8, 2018 at 1:09 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> On Sat, Jan 6, 2018 at 5:20 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Sat, Jan 6, 2018 at 3:31 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>>>
>>> I assume if we put this in uaccess_begin() we also need audit for
>>> paths that use access_ok but don't do on to call uaccess_begin()? A
>>> quick glance shows a few places where we are open coding the stac().
>>> Perhaps land the lfence in stac() directly?
>>
>> Yeah, we should put it in uaccess_begin(), and in the actual user
>> accessor helpers that do stac. Some of them probably should be changed
>> to use uaccess_begin() instead while at it.
>>
>> One question for the CPU people: do we actually care and need to do
>> this for things that might *write* to something? The speculative write
>> obviously is killed, but does it perhaps bring in a cacheline even
>> when killed?
>
> As far as I understand a write could trigger a request-for-ownership
> read for the target cacheline.

Oh, absolutely.

I just wonder at what point that happens.

Honestly, trying to get exclusive access to a cacheline can be _very_
expensive (not just for the local thread), so I would actually expect
that doing so for speculative writes is actually bad for performance.

That's doubly true because - unlike reads - there is no critical
latency issue, so trying to get the cache access started as early as
possible simply isn't all that important.

So I suspect that a write won't actually try to allocate the cacheline
until the write has actually retired.

End result: writes - unlike reads - *probably* will not speculatively
perturb the cache with speculative write addresses.

> Even though writes can trigger reads, as far as I can see the write
> needs to be dependent on the first out-of-bounds read

Yeah. A write on its own wouldn't matter, even if it were to perturb
the cache state, because the address already comes from user space, so
there's no new information in the cache perturbation for the attacker.

But that all implies that we shouldn't need the lfence for the
"put_user()" case, only for the get_user() (where the value we read
would then perhaps be used to do another access).

So we want to add the lfence (or "and") to get_user(), but not
necessarily put_user().

Agreed?

                  Linus