Re: [PATCH] x86: combine memmove FSRM and ERMS alternatives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 13, 2023 at 12:32:17PM -0800, Daniel Verkamp wrote:
> The x86-64 memmove code has two ALTERNATIVE statements in a row, one to
> handle FSRM ("Fast Short REP MOVSB"), and one to handle ERMS ("Enhanced
> REP MOVSB"). If either of these features is present, the goal is to jump
> directly to a REP MOVSB; otherwise, some setup code that handles short
> lengths is executed. The first comparison of a sequence of specific
> small sizes is included in the first ALTERNATIVE, so it will be replaced
> by NOPs if FSRM is set, and then (assuming ERMS is also set) execution
> will fall through to the JMP to a REP MOVSB in the next ALTERNATIVE.
> 
> The two ALTERNATIVE invocations can be combined into a single instance
> of ALTERNATIVE_2 to simplify and slightly shorten the code. If either
> FSRM or ERMS is set, the first instruction in the memmove_begin_forward
> path will be replaced with a jump to the REP MOVSB.
> 
> This also prevents a problem when FSRM is set but ERMS is not; in this
> case, the previous code would have replaced both ALTERNATIVEs with NOPs
> and skipped the first check for sizes less than 0x20 bytes. This
> combination of CPU features is arguably a firmware bug, but this patch
> makes the function robust against this badness.
> 
> Fixes: f444a5ff95dc ("x86/cpufeatures: Add support for fast short REP; MOVSB")
> Signed-off-by: Daniel Verkamp <dverkamp@xxxxxxxxxxxx>
> ---
>  arch/x86/lib/memmove_64.S | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
> index 724bbf83eb5b..1fc36dbd3bdc 100644
> --- a/arch/x86/lib/memmove_64.S
> +++ b/arch/x86/lib/memmove_64.S
> @@ -38,8 +38,10 @@ SYM_FUNC_START(__memmove)
>  
>  	/* FSRM implies ERMS => no length checks, do the copy directly */
>  .Lmemmove_begin_forward:
> -	ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
> -	ALTERNATIVE "", "jmp .Lmemmove_erms", X86_FEATURE_ERMS
> +	ALTERNATIVE_2 \
> +		"cmp $0x20, %rdx; jb 1f", \
> +		"jmp .Lmemmove_erms", X86_FEATURE_FSRM, \
> +		"jmp .Lmemmove_erms", X86_FEATURE_ERMS
>  
>  	/*
>  	 * movsq instruction have many startup latency
> -- 
> 2.39.0.314.g84b9a713c41-goog
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux