Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

Dan Williams <dan.j.williams@xxxxxxxxx> · Mon, 8 Jan 2018 15:53:17 -0800

On Mon, Jan 8, 2018 at 3:44 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Jan 8, 2018 at 1:09 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> On Sat, Jan 6, 2018 at 5:20 PM, Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> On Sat, Jan 6, 2018 at 3:31 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>>>>
>>>> I assume if we put this in uaccess_begin() we also need audit for
>>>> paths that use access_ok but don't do on to call uaccess_begin()? A
>>>> quick glance shows a few places where we are open coding the stac().
>>>> Perhaps land the lfence in stac() directly?
>>>
>>> Yeah, we should put it in uaccess_begin(), and in the actual user
>>> accessor helpers that do stac. Some of them probably should be changed
>>> to use uaccess_begin() instead while at it.
>>>
>>> One question for the CPU people: do we actually care and need to do
>>> this for things that might *write* to something? The speculative write
>>> obviously is killed, but does it perhaps bring in a cacheline even
>>> when killed?
>>
>> As far as I understand a write could trigger a request-for-ownership
>> read for the target cacheline.
>
> Oh, absolutely.
>
> I just wonder at what point that happens.
>
> Honestly, trying to get exclusive access to a cacheline can be _very_
> expensive (not just for the local thread), so I would actually expect
> that doing so for speculative writes is actually bad for performance.
>
> That's doubly true because - unlike reads - there is no critical
> latency issue, so trying to get the cache access started as early as
> possible simply isn't all that important.
>
> So I suspect that a write won't actually try to allocate the cacheline
> until the write has actually retired.
>
> End result: writes - unlike reads - *probably* will not speculatively
> perturb the cache with speculative write addresses.
>
>> Even though writes can trigger reads, as far as I can see the write
>> needs to be dependent on the first out-of-bounds read
>
> Yeah. A write on its own wouldn't matter, even if it were to perturb
> the cache state, because the address already comes from user space, so
> there's no new information in the cache perturbation for the attacker.
>
> But that all implies that we shouldn't need the lfence for the
> "put_user()" case, only for the get_user() (where the value we read
> would then perhaps be used to do another access).
>
> So we want to add the lfence (or "and") to get_user(), but not
> necessarily put_user().

Yes, perhaps __uaccess_begin_get() and __uaccess_begin_put() to keep
things separate?

> Agreed?

I've been thinking the "and" is only suitable for the array bounds
check, for get_user() we're trying to block speculation past
access_ok() at which point we can only do the lfence?