Re: [PATCH v2 1/2] x86/fpu: Extend kernel_fpu_begin_mask() to initialize AMX state

Dave Hansen <dave.hansen@xxxxxxxxx> · Wed, 8 May 2024 12:11:02 -0700

On 5/8/24 11:03, Chang S. Bae wrote:
> On 5/8/2024 7:40 AM, Dave Hansen wrote:
>> On 5/7/24 16:53, Chang S. Bae wrote:
>>
>>> However, due to resource constraints in storage, AMX state is excluded
>>> from the scope of state recovery. Consequently, AMX state must be in its
>>> initialized state for the IFS test to run.
>>
>> This doesn't mention how this issue got introduced.  Are we all bad at
>> reading the SDM? :)
> 
> Ah, I'd rather zap out this SDM sentence.

My point is that this is fixing a bug.  Where did that bug come from?
What got screwed up here?

Hint: I don't think us software folks screwed up here.  More likely, the
folks who built the two hardware features (AMX and IFS) forgot to talk
to each other, or someone forgot to document the AMX clobbering aspect
of the architecture.

>>> When AMX workloads are running, an active user AMX state remains even
>>> after a context switch, optimizing to reduce the state reload cost. In
>>> such cases, the test cannot proceed if it is scheduled.
>>
>> This is a bit out of the blue.  What does scheduling have to do with IFS?
...
> So, the CPU stopper threads for <cpu#> and its sibling to execute
> doscan() are queued up with the highest priority.
...

But this is the IFS implementation *today*.  The explanation depends on
IFS being implemented with something that context switches.  It also
depends on folks expecting context switches to always switch FPU state.

I'd just say:

	The kernel generally runs with live user FPU state, including
	AMX. That state can prevent IFS tests from running.

That's _much_ simpler and more generic, and it also fully explains the
situation.  It also doesn't depend on today's stop_cpus_run()-based IFS
implementation, which could totally change tomorrow.

The underlying rule has zero to do with scheduling or context switching
optimizations.