Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

Dan Williams <dan.j.williams@xxxxxxxxx> · Sat, 6 Jan 2018 10:29:49 -0800

On Sat, Jan 6, 2018 at 10:13 AM, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
> On Sat, Jan 06, 2018 at 12:32:42PM +0000, Alan Cox wrote:
>> On Fri, 5 Jan 2018 18:52:07 -0800
>> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> > On Fri, Jan 5, 2018 at 5:10 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> > > From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>> > >
>> > > When access_ok fails we should always stop speculating.
>> > > Add the required barriers to the x86 access_ok macro.
>> >
>> > Honestly, this seems completely bogus.
>>
>> Also for x86-64 if we are trusting that an AND with a constant won't get
>> speculated into something else surely we can just AND the address with
>> ~(1 << 63) before copying from/to user space? The user will then just
>> speculatively steal their own memory.
>
> +1
>
> Any type of straight line code can address variant 1.
> Like changing:
>   array[index]
> into
>   array[index & mask]
> works even when 'mask' is a variable.
> To proceed with the speculative load from the array, the cpu has to
> speculatively load 'mask' from memory and speculatively do the '&' alu op.
> If the attacker cannot influence 'mask', its speculative value
> will bound 'index & mask' to be within the array limits.
>
> I think "lets sprinkle lfence everywhere" approach is going to
> cause serious performance degradation. Yet people pushing for lfence
> didn't present any numbers.
> Last time lfence was removed from the networking drivers via dma_rmb()
> packet-per-second metric jumped 10-30%. lfence forces all outstanding loads
> to complete. If any prior load is waiting on L3 or memory,
> lfence will cause 100+ ns stall and overall kernel performance will tank.

You are conflating dma_rmb() with the limited cases where
nospec_array_ptr() is used. I need help determining what the
performance impact of those limited places is.

> If kernel adopts this "lfence everywhere" approach it will be
> the end of the kernel as we know it. All high performance operations
> will move into user space. Networking and IO will be first.
> Since it will take years to design new cpus and even longer
> to upgrade all servers the industry will have no choice,
> but to move as much logic as possible from the kernel.
>
> kpti already made crossing user/kernel boundary slower, but
> kernel itself is still fast. If kernel will have lfence everywhere
> the kernel itself will be slow.
>
> In that sense retpolining the kernel is not as horrible as it sounds,
> since both user space and kernel have to be retpolined.

retpoline addresses variant-2; this patch series is about variant-1.
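
For readers following the quoted masking arguments, here is a rough
user-space sketch of the two ideas: masking an array index with a value
the attacker cannot influence, and clearing bit 63 of an address before
a user copy. All names (table, table_mask, load_masked, clear_top_bit)
are made up for illustration and are not kernel APIs. The sketch assumes
a power-of-two table size, and it has not been vetted as a real
mitigation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table; the size must be a power of two so that
 * (TABLE_SIZE - 1) is a valid index mask. */
#define TABLE_SIZE 16u

static int table[TABLE_SIZE];

/* Kept in a volatile variable so the compiler has to load it and cannot
 * fold the AND away after the bounds check. This mirrors the "mask is a
 * variable" case from the thread; it is an illustration only. */
static volatile size_t table_mask = TABLE_SIZE - 1u;

/* Bounds check plus mask. The AND sits on the data path of the load, so
 * even if the branch is mispredicted, the index actually used is bounded
 * by the mask. Returns -1 for an out-of-range index. */
static int load_masked(size_t index)
{
    if (index >= TABLE_SIZE)
        return -1;
    return table[index & table_mask];
}

/* The "AND with ~(1 << 63)" idea: clear the top address bit so that a
 * kernel-half address on x86-64 turns into a user-half one. A 64-bit
 * constant is used here because a plain int shifted by 63 is undefined
 * in C. */
static uint64_t clear_top_bit(uint64_t addr)
{
    return addr & ~(UINT64_C(1) << 63);
}
```

Whether a compiler or cpu preserves these properties in real kernel code
is exactly what is being debated in this thread, so treat the above as a
way to picture the proposals and nothing more.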