Re: [RFC][PATCH v2 4/5] x86/uaccess: Implement unsafe_try_cmpxchg_user()

On Thu, Jan 20, 2022, Peter Zijlstra wrote:
> Do try_cmpxchg() loops on userspace addresses.
> 
> Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
>  arch/x86/include/asm/uaccess.h |   67 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 67 insertions(+)
> 
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -342,6 +342,24 @@ do {									\
>  		     : [umem] "m" (__m(addr))				\
>  		     : : label)
>  
> +#define __try_cmpxchg_user_asm(itype, ltype, _ptr, _pold, _new, label)	({ \
> +	bool success;							\
> +	__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);		\
> +	__typeof__(*(_ptr)) __old = *_old;				\
> +	__typeof__(*(_ptr)) __new = (_new);				\
> +	asm_volatile_goto("\n"						\
> +		     "1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
> +		     _ASM_EXTABLE_UA(1b, %l[label])			\
> +		     : CC_OUT(z) (success),				\
> +		       [ptr] "+m" (*_ptr),				\
> +		       [old] "+a" (__old)				\
> +		     : [new] ltype (__new)				\
> +		     : "memory", "cc"					\

IIUC, the "cc" clobber is unnecessary as CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y implies
__GCC_ASM_FLAG_OUTPUTS__=y, i.e. CC_OUT() will resolve to "=@cc".

> +		     : label);						\
> +	if (unlikely(!success))						\
> +		*_old = __old;						\
> +	likely(success);					})
> +
>  #else // !CONFIG_CC_HAS_ASM_GOTO_OUTPUT

...

> +extern void __try_cmpxchg_user_wrong_size(void);
> +
> +#define unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label) ({		\
> +	__typeof__(*(_ptr)) __ret;					\

This should probably be a bool; the lower-level helpers return a bool that is
true if the exchange succeeded.  Declaring it with the type of the target
implies that the helpers return the raw result, which is confusing.
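
As a rough userspace sketch of why a bool is the natural type here: the
caller's retry loop only tests success or failure, while the observed value
comes back through the "old" pointer.  Both function names below are
hypothetical, and the compiler's __atomic builtin stands in for the kernel
helper:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a try_cmpxchg helper: bool result, *old refreshed on failure. */
static bool try_cmpxchg_model(uint32_t *ptr, uint32_t *old, uint32_t new_val)
{
	return __atomic_compare_exchange_n(ptr, old, new_val, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Atomically OR bits into *ptr and return the value that was there before. */
static uint32_t fetch_or_model(uint32_t *ptr, uint32_t bits)
{
	uint32_t old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

	/* On failure, old already holds the current value, so just retry. */
	while (!try_cmpxchg_model(ptr, &old, old | bits))
		;
	return old;
}
```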

> +	switch (sizeof(__ret)) {					\
> +	case 1:	__ret = __try_cmpxchg_user_asm("b", "q",		\
> +					       (_ptr), (_oldp),		\
> +					       (_nval), _label);	\
> +		break;							\
> +	case 2:	__ret = __try_cmpxchg_user_asm("w", "r",		\
> +					       (_ptr), (_oldp),		\
> +					       (_nval), _label);	\
> +		break;							\
> +	case 4:	__ret = __try_cmpxchg_user_asm("l", "r",		\
> +					       (_ptr), (_oldp),		\
> +					       (_nval), _label);	\
> +		break;							\
> +	case 8:	__ret = __try_cmpxchg_user_asm("q", "r",		\
> +					       (_ptr), (_oldp),		\
> +					       (_nval), _label);	\

Doh, I should have specified that KVM needs 8-byte CMPXCHG on 32-bit kernels due
to using it to atomically update guest PAE PTEs and LTR descriptors (yay).

Also, KVM's use case isn't a tight loop.  How gross would it be to add a slightly
less unsafe version that does __uaccess_begin_nospec()?  KVM pre-checks the address
way ahead of time, so the access_ok() check can be omitted.  Alternatively, KVM
could add its own macro, but that seems a little silly.  E.g. something like this,
though I don't think this is correct (something is getting inverted somewhere and
the assembly output is a nightmare):

/* "Returns" 0 on success, 1 on failure, -EFAULT if the access faults. */
#define ___try_cmpxchg_user(_ptr, _oldp, _nval, _label)	({		\
	int ____ret = -EFAULT;						\
	__uaccess_begin_nospec();					\
	____ret = !unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label);	\
_label:									\
	__uaccess_end();						\
	____ret;							\
						})
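
The intended return convention (0 on success, 1 on compare failure, -EFAULT
on a fault) can be modeled in userspace as below.  This is only a sketch of
the contract, not a fix for the macro above: cmpxchg_user_model() is a
hypothetical name, the NULL check is an assumed stand-in for the exception
table jump to the label, and there is no uaccess begin/end equivalent:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 0 on success, 1 on compare failure, -EFAULT on a simulated fault. */
static int cmpxchg_user_model(uint32_t *ptr, uint32_t *old, uint32_t new_val)
{
	int ret = -EFAULT;

	/* Assumed stand-in for a faulting access: NULL "faults" to the label. */
	if (ptr == NULL)
		goto fault;

	/* The helper reports success as true, so invert it to get 0 or 1. */
	ret = !__atomic_compare_exchange_n(ptr, old, new_val, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
fault:
	return ret;
}
```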

Lastly, assuming I get my crap working, mind if I post a variant (Cc'd to stable@) in
the context of a KVM series?  Turns out KVM has an ugly bug where it completely
botches the pfn calculation of memory it remaps and accesses[*]; the easiest fix
is to switch to __try_cmpxchg_user() and purge the nastiness.

[*] https://lore.kernel.org/all/20220124172633.103323-1-tadeusz.struk@xxxxxxxxxx



