Re: [RFC v2 1/2] x86/fpu: Fix state corruption in __fpu__restore_sig()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/30/21 3:02 PM, Thomas Gleixner wrote:
> Andy,
> 
> On Sat, May 29 2021 at 22:12, Andy Lutomirski wrote:
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates")
>> Reported-by: syzbot+2067e764dbcd10721e2e@xxxxxxxxxxxxxxxxxxxxxxxxx
> 
> Debugged-by ...
> 
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> 
> ...
> 
>>  /*
>> - * Clear the FPU state back to init state.
>> - *
>> - * Called by sys_execve(), by the signal handler code and by various
>> - * error paths.
>> + * Reset current's user FPU states to the init states.  The caller promises
>> + * that current's supervisor states (in memory or CPU regs as appropriate)
>> + * as well as the XSAVE header in memory are intact.

^^^ The caller promises this

>>   */
>> -static void fpu__clear(struct fpu *fpu, bool user_only)
>> +void fpu__clear_user_states(struct fpu *fpu)
>>  {
>>  	WARN_ON_FPU(fpu != &current->thread.fpu);
>>  

...

>> +	/*
>> +	 * Reset user states in registers.
>> +	 */
>> +	copy_init_fpstate_to_fpregs(xfeatures_mask_user());
>> +
>> +	/*
>> +	 * Now all FPU registers have their desired values.  Inform the
>> +	 * FPU state machine that current's FPU registers are in the
>> +	 * hardware registers.
>> +	 */
>>  	fpregs_mark_activate();
>> +
>>  	fpregs_unlock();
> 
> This is as wrong as before. The corrupted task->fpu.state still
> survives.

See above.  This function is not intended to fix the header.

> 
> For f*cks sake, I gave you a reproducer and a working patch and I
> explained it in great length what's broken and what needs to be fixed.

This patch fixes your reproducer and my (to-be-sent) reproducer.  I
tested it on a machine that physically has the XRSTORS instruction but I
disabled it using virt.  Are you still seeing failures with this patch
applied?  I can try to test on a different CPU.

> 
> And of course you kept the bug which was in the offending commit,
> i.e. not wiping the task->fpu.state corruption which causes the next
> XRSTOR to fail:
> 
> [   34.095020] Bad FPU state detected at copy_kernel_to_fpregs+0x28/0x40, reinitializing FPU registers.
> [   34.095052] WARNING: CPU: 0 PID: 1364 at arch/x86/mm/extable.c:65 ex_handler_fprestore+0x5f/0x70
> ...
> [   34.153472]  switch_fpu_return+0x40/0xb0
> [   34.154196]  exit_to_user_mode_prepare+0x8f/0x180
> [   34.155060]  syscall_exit_to_user_mode+0x23/0x50
> [   34.155912]  do_syscall_64+0x4d/0xb0

I don't think that the code path that calls fpu__clear_user_states() is
subject to header corruption.  If it is, then both your patch and my
patch have issues -- trying to fish the supervisor state out of a
corrupt XSTATE buffer is asking for trouble.

> 
> IOW, this is exactly the same shit as we had before. So what is decent
> about this? Define decent...
> 
> Why the heck do you think I wasted a couple of days to:
> 
>  - Analyze the root cause
> 
>  - Destill a trivial C reproducer
> 
>  - Come up with a fully working and completely correct fix
> 
> Just because, right?
> 
> I'm fine with splitting up clear_all() and clear_user(), but what you
> provided is as much of a clusterfuck as the commit it pretends to fix.
> 
> Your's seriously grumpy
> 
>        Thomas
> 




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux