Re: [PATCH 11/22] x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use feature number instead of mask

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 09, 2019 at 12:47:33PM +0100, Sebastian Andrzej Siewior wrote:
> After changing the argument of __raw_xsave_addr() from a mask to number
> Dave suggested to check if it makes sense to do the same for
> get_xsave_addr(). As it turns out it does. Only get_xsave_addr() needs
> the mask to check if the requested feature is part of what is
> support/saved and then uses the number again. The shift operation is
> cheaper compared to "find last bit set". Also, the feature number uses
> less opcode space compared to the mask :)
> 
> Make get_xsave_addr() argument a xfeature number instead of mask and fix
> up its callers.
> As part of this use xfeature_nr and xfeature_mask consistently.

Good!

> This results in changes to the kvm code as:
> 	feature -> xfeature_mask
> 	index -> xfeature_nr
> 
> Suggested-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> ---
>  arch/x86/include/asm/fpu/xstate.h |  4 ++--
>  arch/x86/kernel/fpu/xstate.c      | 23 +++++++++++------------
>  arch/x86/kernel/traps.c           |  2 +-
>  arch/x86/kvm/x86.c                | 28 ++++++++++++++--------------
>  arch/x86/mm/mpx.c                 |  6 +++---
>  5 files changed, 31 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> index 48581988d78c7..fbe41f808e5d8 100644
> --- a/arch/x86/include/asm/fpu/xstate.h
> +++ b/arch/x86/include/asm/fpu/xstate.h
> @@ -46,8 +46,8 @@ extern void __init update_regset_xstate_info(unsigned int size,
>  					     u64 xstate_mask);
>  
>  void fpu__xstate_clear_all_cpu_caps(void);
> -void *get_xsave_addr(struct xregs_state *xsave, int xstate);
> -const void *get_xsave_field_ptr(int xstate_field);
> +void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
> +const void *get_xsave_field_ptr(int xfeature_nr);
>  int using_compacted_format(void);
>  int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
>  int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 0e759a032c1c7..d288e4d271b71 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -830,15 +830,15 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
>   *
>   * Inputs:
>   *	xstate: the thread's storage area for all FPU data
> - *	xstate_feature: state which is defined in xsave.h (e.g.
> - *	XFEATURE_MASK_FP, XFEATURE_MASK_SSE, etc...)
> + *	xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
> + *	XFEATURE_SSE, etc...)
>   * Output:
>   *	address of the state in the xsave area, or NULL if the
>   *	field is not present in the xsave buffer.
>   */
> -void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
> +void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
>  {
> -	int xfeature_nr;
> +	u64 xfeature_mask = 1ULL << xfeature_nr;

You can paste directly BIT_ULL(xfeature_nr) where you need it in this
function...

>  	/*
>  	 * Do we even *have* xsave state?
>  	 */
> @@ -850,11 +850,11 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
>  	 * have not enabled.  Remember that pcntxt_mask is
>  	 * what we write to the XCR0 register.
>  	 */
> -	WARN_ONCE(!(xfeatures_mask & xstate_feature),
> +	WARN_ONCE(!(xfeatures_mask & xfeature_mask),

... and turn this into:

	WARN_ONCE(!(xfeatures_mask & BIT_ULL(xfeature_nr))

which is more readable than the AND of two variables which I had to
re-focus my eyes to see the difference. :)

Oh and this way, gcc generates better code by doing simply a BT
directly:

# arch/x86/kernel/fpu/xstate.c:852:     WARN_ONCE(!(xfeatures_mask & BIT_ULL(xfeature_nr)),
        .loc 1 852 2 view .LVU258
        movq    xfeatures_mask(%rip), %rax      # xfeatures_mask, tmp124
        btq     %rsi, %rax      # xfeature_nr, tmp124


without first computing the shift into xfeature_mask:

# arch/x86/kernel/fpu/xstate.c:841:     u64 xfeature_mask = 1ULL << xfeature_nr;
        .loc 1 841 6 view .LVU249
        movl    %esi, %ecx      # xfeature_nr, tmp120
        movl    $1, %ebp        #, tmp105
        salq    %cl, %rbp       # tmp120, xfeature_mask

and then testing it:

# arch/x86/kernel/fpu/xstate.c:853:     WARN_ONCE(!(xfeatures_mask & xfeature_mask),
        .loc 1 853 2 view .LVU256
        testq   %rbp, xfeatures_mask(%rip)      # xfeature_mask, xfeatures_mask
        movq    %rdi, %rbx      # xsave, xsave


Otherwise a nice cleanup!

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux