Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/8/2024 7:40 AM, Dave Hansen wrote:

As I look closer, I'm not sure I think this is a good match for the two
other KFPU_* flags.  I don't think either KFPU_387 or KFPU_MXCSR causes
any XFEATURE to be tracked as init.  The SDM says that FNINIT "sets the
FPU control, status, tag, instruction pointer, and data pointer
registers to their default states."

Off the top of my head, I don't know how that maps to the XSAVE init
tracking but I do know that MXCSR has a very weird mapping to the first
two XSAVE features.

FNINIT does *not* set the component to be tracked as init. Indeed, the SSE component is somewhat convoluted. AMX state will be likely tracked as init. But, I believe this init tracking mechanism is implementation-specific, not architectural. Beyond this intricacy of init-tracking, KFPU_* flags all initialize each state:

/* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
#define KFPU_387	_BITUL(0)	/* 387 state will be initialized */
#define KFPU_MXCSR	_BITUL(1)	/* MXCSR will be initialized */

I really think it would be simplest to just have this whole thing do this:

	kernel_fpu_begin();

	// Zap AMX state

	kernel_fpu_end();

Where the zap is either an os_xrstor() or an explicit tile release
instruction.

If I understand correctly, the change could be something like this, although I'm not sure about this new API naming and prototype at this point:

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index a2be3aefff9f..906634402ea6 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -28,6 +28,7 @@

 extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
 extern void kernel_fpu_end(void);
+extern void kernel_fpu_reset(void);
 extern bool irq_fpu_usable(void);
 extern void fpregs_mark_activate(void);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1209c7aebb21..0351f9ee3e49 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -452,6 +452,15 @@ void kernel_fpu_end(void)
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_end);

+void kernel_fpu_reset(void)
+{
+       kernel_fpu_begin();
+       if (cpu_feature_enabled(X86_FEATURE_AMX_TILE))
+               tile_release();
+       kernel_fpu_end();
+}
+EXPORT_SYMBOL(kernel_fpu_reset);
+
 /*
  * Sync the FPU register state to current's memory register state when the
  * current task owns the FPU. The hardware register state is preserved.
diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
index 56b9f3e3cf76..5e7ba94b4054 100644
--- a/drivers/platform/x86/intel/ifs/ifs.h
+++ b/drivers/platform/x86/intel/ifs/ifs.h
@@ -129,6 +129,7 @@
  */
 #include <linux/device.h>
 #include <linux/miscdevice.h>
+#include <asm/fpu/api.h>

 #define MSR_ARRAY_BIST                         0x00000105
 #define MSR_COPY_SCAN_HASHES                   0x000002c2
diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
index 95b4b71fab53..33ef4962aeba 100644
--- a/drivers/platform/x86/intel/ifs/runtest.c
+++ b/drivers/platform/x86/intel/ifs/runtest.c
@@ -188,6 +188,8 @@ static int doscan(void *data)
        /* Only the first logical CPU on a core reports result */
        first = cpumask_first(cpu_smt_mask(cpu));

+       kernel_fpu_reset();
+
        wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);

        /*

It's just a matter of whether you want this to work like a regular old
kernel FPU user or you want to tie it to *only* being able to run in its
own kernel thread. -- Aside: I don't think I realized IFS had its own
thread earlier.

While both approaches aim for the same functionality, the code change seems to be smaller for the above. At the same time, it is notable that these new APIs, fpu_idle_fpregs() and the other, are *only* for AMX.

1. The diff state of this series' approach:
 arch/x86/include/asm/fpu/api.h           |  1 +
 arch/x86/kernel/fpu/core.c               |  3 +++
 drivers/platform/x86/intel/ifs/ifs.h     | 15 +++++++++++++++
 drivers/platform/x86/intel/ifs/runtest.c |  6 ++++++
 4 files changed, 25 insertions(+)

2. The above code changes:
 arch/x86/include/asm/fpu/api.h           | 1 +
 arch/x86/kernel/fpu/core.c               | 9 +++++++++
 drivers/platform/x86/intel/ifs/ifs.h     | 1 +
 drivers/platform/x86/intel/ifs/runtest.c | 2 ++
 4 files changed, 13 insertions(+)

Thanks,
Chang




[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux