
The patch titled
     Subject: arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
has been added to the -mm mm-nonmm-unstable branch.  Its filename is
     arch-add-arch_has_kernel_fpu_support.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/arch-add-arch_has_kernel_fpu_support.patch

This patch will later appear in the mm-nonmm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Samuel Holland <samuel.holland@xxxxxxxxxx>
Subject: arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
Date: Fri, 29 Mar 2024 00:18:16 -0700

Several architectures provide an API to enable the FPU and run
floating-point SIMD code in kernel space.  However, the function names,
header locations, and semantics are inconsistent across architectures, and
FPU support may be gated behind other Kconfig options.

Provide a standard way for architectures to declare that kernel space FPU
support is available.  Architectures selecting this option must implement
what is currently the most common API (kernel_fpu_begin() and
kernel_fpu_end(), plus a new function kernel_fpu_available()) and provide
the appropriate CFLAGS for compiling floating-point C code.

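As a rough illustration of the calling pattern this API implies, the sketch
below models it in plain userspace C.  It is not kernel code: the three
kernel_fpu_*() functions are stubs standing in for whatever an architecture
would provide through linux/fpu.h, and fp_scale() and scale_value() are
hypothetical names chosen for this example only.

```c
/*
 * Userspace model of the calling pattern; NOT kernel code.
 * The kernel_fpu_*() stubs are assumptions standing in for the
 * architecture-provided implementations. fp_scale() and scale_value()
 * are hypothetical names.
 */
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the arch-provided API. */
static bool fpu_present = true;	/* pretend the platform has an FPU */
static int fpu_depth;			/* tracks critical-section nesting */

static bool kernel_fpu_available(void)
{
	return fpu_present;
}

static void kernel_fpu_begin(void)
{
	/* Critical sections are not required to be reentrant. */
	assert(fpu_depth == 0);
	fpu_depth++;
}

static void kernel_fpu_end(void)
{
	assert(fpu_depth == 1);
	fpu_depth--;
}

/*
 * In a real kernel this would live in its own source file, built with
 * $(CC_FLAGS_FPU), and would not include linux/fpu.h.
 */
static int fp_scale(int value)
{
	return (int)(value * 1.5);
}

/*
 * Caller: built with the normal flags. In a real kernel this is the
 * file that includes linux/fpu.h and defines the critical section.
 */
static int scale_value(int value, int *out)
{
	if (!kernel_fpu_available())
		return -1;	/* no FPU: the caller needs a fallback */

	kernel_fpu_begin();
	*out = fp_scale(value);	/* keep the critical section short */
	kernel_fpu_end();
	return 0;
}
```

The availability check is repeated on every call here only for brevity; a
real caller could check it once and cache the result.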

Link: https://lkml.kernel.org/r/20240329072441.591471-2-samuel.holland@xxxxxxxxxx
Signed-off-by: Samuel Holland <samuel.holland@xxxxxxxxxx>
Suggested-by: Christoph Hellwig <hch@xxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: Borislav Petkov (AMD) <bp@xxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Huacai Chen <chenhuacai@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Masahiro Yamada <masahiroy@xxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Nathan Chancellor <nathan@xxxxxxxxxx>
Cc: Nicolas Schier <nicolas@xxxxxxxxx>
Cc: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: WANG Xuerui <git@xxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/core-api/floating-point.rst |   78 ++++++++++++++++++++
 Documentation/core-api/index.rst          |    1
 Makefile                                  |    5 +
 arch/Kconfig                              |    6 +
 include/linux/fpu.h                       |   12 +++
 5 files changed, 102 insertions(+)

--- a/arch/Kconfig~arch-add-arch_has_kernel_fpu_support
+++ a/arch/Kconfig
@@ -1569,6 +1569,12 @@ config ARCH_HAS_NONLEAF_PMD_YOUNG
 	  address translations. Page table walkers that clear the accessed bit
 	  may use this capability to reduce their search space.
 
+config ARCH_HAS_KERNEL_FPU_SUPPORT
+	bool
+	help
+	  Architectures that select this option can run floating-point code in
+	  the kernel, as described in Documentation/core-api/floating-point.rst.
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
--- /dev/null
+++ a/Documentation/core-api/floating-point.rst
@@ -0,0 +1,78 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+Floating-point API
+==================
+
+Kernel code is normally prohibited from using floating-point (FP) registers or
+instructions, including the C float and double data types. This rule reduces
+system call overhead, because the kernel does not need to save and restore the
+userspace floating-point register state.
+
+However, occasionally drivers or library functions may need to include FP code.
+This is supported by isolating the functions containing FP code to a separate
+translation unit (a separate source file), and saving/restoring the FP register
+state around calls to those functions. This creates "critical sections" of
+floating-point usage.
+
+The reason for this isolation is to prevent the compiler from generating code
+touching the FP registers outside these critical sections. Compilers sometimes
+use FP registers to optimize inlined ``memcpy`` or variable assignment, as
+floating-point registers may be wider than general-purpose registers.
+
+Usability of floating-point code within the kernel is architecture-specific.
+Additionally, because a single kernel may be configured to support platforms
+both with and without a floating-point unit, FPU availability must be checked
+both at build time and at run time.
+
+Several architectures implement the generic kernel floating-point API from
+``linux/fpu.h``, as described below. Some other architectures implement their
+own unique APIs, which are documented separately.
+
+Build-time API
+--------------
+
+Floating-point code may be built if the option ``ARCH_HAS_KERNEL_FPU_SUPPORT``
+is enabled. For C code, such code must be placed in a separate file, and that
+file must have its compilation flags adjusted using the following pattern::
+
+    CFLAGS_foo.o += $(CC_FLAGS_FPU)
+    CFLAGS_REMOVE_foo.o += $(CC_FLAGS_NO_FPU)
+
+Architectures are expected to define one or both of these variables in their
+top-level Makefile as needed. For example::
+
+    CC_FLAGS_FPU := -mhard-float
+
+or::
+
+    CC_FLAGS_NO_FPU := -msoft-float
+
+Normal kernel code is assumed to use the equivalent of ``CC_FLAGS_NO_FPU``.
+
+Runtime API
+-----------
+
+The runtime API is provided in ``linux/fpu.h``. This header cannot be included
+from files implementing FP code (those with their compilation flags adjusted as
+above). Instead, it must be included when defining the FP critical sections.
+
+.. c:function:: bool kernel_fpu_available( void )
+
+        This function reports if floating-point code can be used on this CPU or
+        platform. The value returned by this function is not expected to change
+        at runtime, so it only needs to be called once, not before every
+        critical section.
+
+.. c:function:: void kernel_fpu_begin( void )
+                void kernel_fpu_end( void )
+
+        These functions create a floating-point critical section. It is only
+        valid to call ``kernel_fpu_begin()`` after a previous call to
+        ``kernel_fpu_available()`` returned ``true``. These functions are only
+        guaranteed to be callable from (preemptible or non-preemptible) process
+        context.
+
+        Preemption may be disabled inside critical sections, so their size
+        should be minimized. They are *not* required to be reentrant. If the
+        caller expects to nest critical sections, it must implement its own
+        reference counting.
--- a/Documentation/core-api/index.rst~arch-add-arch_has_kernel_fpu_support
+++ a/Documentation/core-api/index.rst
@@ -48,6 +48,7 @@ Library functionality that is used throu
    errseq
    wrappers/atomic_t
    wrappers/atomic_bitops
+   floating-point
 
 Low level entry and exit
 ========================
--- /dev/null
+++ a/include/linux/fpu.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_FPU_H
+#define _LINUX_FPU_H
+
+#ifdef _LINUX_FPU_COMPILATION_UNIT
+#error FP code must be compiled separately. See Documentation/core-api/floating-point.rst.
+#endif
+
+#include <asm/fpu.h>
+
+#endif
--- a/Makefile~arch-add-arch_has_kernel_fpu_support
+++ a/Makefile
@@ -964,6 +964,11 @@ KBUILD_CFLAGS += $(CC_FLAGS_CFI)
 export CC_FLAGS_CFI
 endif
 
+# Architectures can define flags to add/remove for floating-point support
+CC_FLAGS_FPU += -D_LINUX_FPU_COMPILATION_UNIT
+export CC_FLAGS_FPU
+export CC_FLAGS_NO_FPU
+
 ifneq ($(CONFIG_FUNCTION_ALIGNMENT),0)
 # Set the minimal function alignment. Use the newer GCC option
 # -fmin-function-alignment if it is available, or fall back to -falign-funtions.
_

Patches currently in -mm which might be from samuel.holland@xxxxxxxxxx are

x86-fpu-fix-asm-fpu-typesh-include-guard.patch
arch-add-arch_has_kernel_fpu_support.patch
arm-implement-arch_has_kernel_fpu_support.patch
arm-crypto-use-cc_flags_fpu-for-neon-cflags.patch
arm64-implement-arch_has_kernel_fpu_support.patch
arm64-crypto-use-cc_flags_fpu-for-neon-cflags.patch
lib-raid6-use-cc_flags_fpu-for-neon-cflags.patch
loongarch-implement-arch_has_kernel_fpu_support.patch
powerpc-implement-arch_has_kernel_fpu_support.patch
x86-implement-arch_has_kernel_fpu_support.patch
riscv-add-support-for-kernel-mode-fpu.patch
drm-amd-display-use-arch_has_kernel_fpu_support.patch
selftests-fpu-move-fp-code-to-a-separate-translation-unit.patch
selftests-fpu-allow-building-on-other-architectures.patch