This is a note to let you know that I've just added the patch titled ARM: 8990/1: use VFP assembler mnemonics in register load/store macros to the 5.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: arm-8990-1-use-vfp-assembler-mnemonics-in-register-load-store-macros.patch and it can be found in the queue-5.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From foo@baz Thu Jun 30 03:27:07 PM CEST 2022 From: Florian Fainelli <f.fainelli@xxxxxxxxx> Date: Wed, 29 Jun 2022 11:02:18 -0700 Subject: ARM: 8990/1: use VFP assembler mnemonics in register load/store macros To: stable@xxxxxxxxxxxxxxx Cc: Stefan Agner <stefan@xxxxxxxx>, Russell King <rmk+kernel@xxxxxxxxxxxxxxx>, Florian Fainelli <f.fainelli@xxxxxxxxx>, Russell King <linux@xxxxxxxxxxxxxxx>, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, Tony Lindgren <tony@xxxxxxxxxxx>, Hans Ulli Kroll <ulli.kroll@xxxxxxxxxxxxxx>, Ard Biesheuvel <ardb@xxxxxxxxxx>, Nick Desaulniers <ndesaulniers@xxxxxxxxxx>, Nicolas Pitre <nico@xxxxxxxxxxx>, Andre Przywara <andre.przywara@xxxxxxx>, Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>, Catalin Marinas <catalin.marinas@xxxxxxx>, Jian Cai <caij2003@xxxxxxxxx>, linux-arm-kernel@xxxxxxxxxxxxxxxxxxx (moderated list:ARM PORT), linux-kernel@xxxxxxxxxxxxxxx (open list), linux-crypto@xxxxxxxxxxxxxxx (open list:CRYPTO API), linux-omap@xxxxxxxxxxxxxxx (open list:OMAP2+ SUPPORT), clang-built-linux@xxxxxxxxxxxxxxxx (open list:CLANG/LLVM BUILD SUPPORT), Sasha Levin <sashal@xxxxxxxxxx> Message-ID: <20220629180227.3408104-3-f.fainelli@xxxxxxxxx> From: Stefan Agner <stefan@xxxxxxxx> commit ee440336e5ef977c397afdb72cbf9c6b8effc8ea upstream The integrated assembler of Clang 10 and earlier do not allow to access the VFP registers through the coprocessor load/store instructions: <instantiation>:4:6: error: invalid operand for instruction LDC p11, cr0, [r10],#32*4 @ FLDMIAD r10!, {d0-d15} ^ This has been addressed with Clang 11 [0]. However, to support earlier versions of Clang and for better readability use of VFP assembler mnemonics still is preferred. Replace the coprocessor load/store instructions with explicit assembler mnemonics to accessing the floating point coprocessor registers. Use assembler directives to select the appropriate FPU version. This allows to build these macros with GNU assembler as well as with Clang's built-in assembler. [0] https://reviews.llvm.org/D59733 Link: https://github.com/ClangBuiltLinux/linux/issues/905 Signed-off-by: Stefan Agner <stefan@xxxxxxxx> Signed-off-by: Russell King <rmk+kernel@xxxxxxxxxxxxxxx> Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- arch/arm/include/asm/vfpmacros.h | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) --- a/arch/arm/include/asm/vfpmacros.h +++ b/arch/arm/include/asm/vfpmacros.h @@ -19,23 +19,25 @@ @ read all the working registers back into the VFP .macro VFPFLDMIA, base, tmp + .fpu vfpv2 #if __LINUX_ARM_ARCH__ < 6 - LDC p11, cr0, [\base],#33*4 @ FLDMIAX \base!, {d0-d15} + fldmiax \base!, {d0-d15} #else - LDC p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d0-d15} + vldmia \base!, {d0-d15} #endif #ifdef CONFIG_VFPv3 + .fpu vfpv3 #if __LINUX_ARM_ARCH__ <= 6 ldr \tmp, =elf_hwcap @ may not have MVFR regs ldr \tmp, [\tmp, #0] tst \tmp, #HWCAP_VFPD32 - ldclne p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31} + vldmiane \base!, {d16-d31} addeq \base, \base, #32*4 @ step over unused register space #else VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0 and \tmp, \tmp, #MVFR0_A_SIMD_MASK @ A_SIMD field cmp \tmp, #2 @ 32 x 64bit registers? - ldcleq p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31} + vldmiaeq \base!, {d16-d31} addne \base, \base, #32*4 @ step over unused register space #endif #endif @@ -44,22 +46,23 @@ @ write all the working registers out of the VFP .macro VFPFSTMIA, base, tmp #if __LINUX_ARM_ARCH__ < 6 - STC p11, cr0, [\base],#33*4 @ FSTMIAX \base!, {d0-d15} + fstmiax \base!, {d0-d15} #else - STC p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d0-d15} + vstmia \base!, {d0-d15} #endif #ifdef CONFIG_VFPv3 + .fpu vfpv3 #if __LINUX_ARM_ARCH__ <= 6 ldr \tmp, =elf_hwcap @ may not have MVFR regs ldr \tmp, [\tmp, #0] tst \tmp, #HWCAP_VFPD32 - stclne p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31} + vstmiane \base!, {d16-d31} addeq \base, \base, #32*4 @ step over unused register space #else VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0 and \tmp, \tmp, #MVFR0_A_SIMD_MASK @ A_SIMD field cmp \tmp, #2 @ 32 x 64bit registers? - stcleq p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31} + vstmiaeq \base!, {d16-d31} addne \base, \base, #32*4 @ step over unused register space #endif #endif Patches currently in stable-queue which might be from f.fainelli@xxxxxxxxx are queue-5.4/arm-8971-1-replace-the-sole-use-of-a-symbol-with-its-definition.patch queue-5.4/arm-omap2-drop-unnecessary-adrl.patch queue-5.4/arm-8933-1-replace-sun-solaris-style-flag-on-section-directive.patch queue-5.4/crypto-arm-sha256-neon-avoid-adrl-pseudo-instruction.patch queue-5.4/arm-9029-1-make-iwmmxt.s-support-clang-s-integrated-assembler.patch queue-5.4/net-mscc-ocelot-allow-unregistered-ip-multicast-flooding.patch queue-5.4/crypto-arm-sha512-neon-avoid-adrl-pseudo-instruction.patch queue-5.4/arm-8989-1-use-.fpu-assembler-directives-instead-of-assembler-arguments.patch queue-5.4/crypto-arm-ghash-ce-define-fpu-before-fpu-registers-are-referenced.patch queue-5.4/arm-8929-1-use-apsr_nzcv-instead-of-r15-as-mrc-operand.patch queue-5.4/crypto-arm-use-kconfig-based-compiler-checks-for-crypto-opcodes.patch queue-5.4/arm-8990-1-use-vfp-assembler-mnemonics-in-register-load-store-macros.patch