Re: [PATCH] x86/fpu: Correct pkru/xstate inconsistency

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 15, 2022 at 07:36:44AM -0800, Brian Geffon wrote:
> There are two issues with PKRU handling prior to 5.13. The first is that
> when eagerly switching PKRU we check that current is not a kernel
> thread as kernel threads will never use PKRU. It's possible that
> this_cpu_read_stable() on current_task (ie. get_current()) is returning
> an old cached value. By forcing the read with this_cpu_read() the
> correct task is used. Without this it's possible when switching from
> a kernel thread to a userspace thread that we'll still observe the
> PF_KTHREAD flag and never restore the PKRU. And as a result this
> issue only occurs when switching from a kernel thread to a userspace
> thread, switching from a non kernel thread works perfectly fine because
> all we consider in that situation is the flags from some other non
> kernel task and the next fpu is passed in to switch_fpu_finish().
> 
> Without reloading the value finish_fpu_load() after being inlined into
> __switch_to() uses a stale value of current:
> 
>   ba1:   8b 35 00 00 00 00       mov    0x0(%rip),%esi
>   ba7:   f0 41 80 4d 01 40       lock orb $0x40,0x1(%r13)
>   bad:   e9 00 00 00 00          jmp    bb2 <__switch_to+0x1eb>
>   bb2:   41 f6 45 3e 20          testb  $0x20,0x3e(%r13)
>   bb7:   75 1c                   jne    bd5 <__switch_to+0x20e>
> 
> By using this_cpu_read() and avoiding the cached value the compiler does
> insert an additional load instruction and observes the correct value now:
> 
>   ba1:   8b 35 00 00 00 00       mov    0x0(%rip),%esi
>   ba7:   f0 41 80 4d 01 40       lock orb $0x40,0x1(%r13)
>   bad:   e9 00 00 00 00          jmp    bb2 <__switch_to+0x1eb>
>   bb2:   65 48 8b 05 00 00 00    mov    %gs:0x0(%rip),%rax
>   bb9:   00
>   bba:   f6 40 3e 20             testb  $0x20,0x3e(%rax)
>   bbe:   75 1c                   jne    bdc <__switch_to+0x215>
> 
> The second issue is when using write_pkru() we only write to the
> xstate when the feature bit is set because get_xsave_addr() returns
> NULL when the feature bit is not set. This is problematic as the CPU
> is free to clear the feature bit when it observes the xstate in the
> init state, this behavior seems to be documented a few places throughout
> the kernel. If the bit was cleared then in write_pkru() we would happily
> write to PKRU without ever updating the xstate, and the FPU restore on
> return to userspace would load the old value agian.
> 
> Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
> Signed-off-by: Brian Geffon <bgeffon@xxxxxxxxxx>
> Signed-off-by: Willis Kung <williskung@xxxxxxxxxx>
> Tested-by: Willis Kung <williskung@xxxxxxxxxx>
> ---
>  arch/x86/include/asm/fpu/internal.h |  2 +-
>  arch/x86/include/asm/pgtable.h      | 14 ++++++++++----
>  2 files changed, 11 insertions(+), 5 deletions(-)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux