Re: [PATCH v2 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

Andy Lutomirski <luto@xxxxxxxxxx> · Tue, 24 Nov 2020 15:41:16 -0800

> On Nov 24, 2020, at 10:51 AM, Len Brown <lenb@xxxxxxxxxx> wrote:
>
> On Fri, Nov 20, 2020 at 12:03 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>
>>> On Thu, Nov 19, 2020 at 3:37 PM Chang S. Bae <chang.seok.bae@xxxxxxxxx> wrote:
>>> "xstate.enable=0x60000" will enable AMX on a system that does NOT have AMX
>>> compiled into XFEATURE_MASK_USER_ENABLED (assuming the kernel is new enough
>>> to support this feature).
>>>
>>
>> What's the purpose of xstate.enable?  I can't really imagine it's
>> useful for AMX.  I suppose it could be useful for hypothetical
>> post-AMX features, but that sounds extremely dangerous.  Intel has
>> changed its strategy so many times on XSTATE extensibility that I find
>> it quite hard to believe that supporting unknown states is wise.
>
> Not hypothetical -- there are subsequent hardware features coming that
> will use the same exact XSTATE support that this series puts in place
> for AMX.
>
> We know that when those features ship in new hardware, there will be
> a set of customers who want to exercise those features immediately,
> but their kernel binary has not yet been re-compiled to see those
> features by-default.
>
> The purpose of "xstate.enable" is to empower those users to be able to
> explicitly enable support using their existing binary.
>
> You are right -- the feature isn't needed to enable AMX, unless somebody
> went to the trouble of building a kernel with the AMX source update, but
> chose to disable AMX-specific recognition, by-default.
>
>

We may want to taint the kernel if one of these flags is used because,
frankly, Intel's track record is poor. Suppose we get a new feature
with PKRU-like semantics -- switching it blindly using
XSAVE(C,S,OPT,whatever) would simply be incorrect. And XFD itself has
problems -- supposedly it's difficult or impossible to virtualize. It
wouldn't utterly shock me if Intel were to drop IA32_XFD_ERR and
replace it with a new mechanism that's less janky.

So this whole thing makes me quite nervous.
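
To make the taint idea concrete, here is a rough userspace sketch (not
kernel code, and not what the series does). It splits the mask passed to
xstate.enable into bits that are already in the default-enabled set and
bits that the parameter forces on, and asks for a taint whenever anything
is forced on. The struct, the helper name and the "taint on any forced
bit" policy are all made up for illustration; only the AMX bit positions
(XTILECFG = 17, XTILEDATA = 18, i.e. the 0x60000 quoted above) are real.

```c
/* Userspace illustration only; names below are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* AMX state components: XTILECFG is bit 17, XTILEDATA is bit 18. */
#define XFEATURE_MASK_XTILE_CFG   (UINT64_C(1) << 17)
#define XFEATURE_MASK_XTILE_DATA  (UINT64_C(1) << 18)
#define XFEATURE_MASK_XTILE \
	(XFEATURE_MASK_XTILE_CFG | XFEATURE_MASK_XTILE_DATA)

/* Hypothetical result of inspecting an xstate.enable= mask. */
struct xstate_enable_result {
	uint64_t redundant;   /* requested bits already enabled by default */
	uint64_t forced;      /* requested bits the parameter turns on */
	bool should_taint;    /* assumed policy: taint if anything is forced */
};

/*
 * Hypothetical helper: classify the requested mask against the set of
 * state components the kernel enables by default.
 */
static inline void classify_xstate_enable(uint64_t requested,
					  uint64_t default_enabled,
					  struct xstate_enable_result *out)
{
	assert(out != NULL);

	out->redundant = requested & default_enabled;
	out->forced = requested & ~default_enabled;
	out->should_taint = out->forced != 0;
}
```

With requested = 0x60000 and a default set that lacks AMX, both tile bits
land in "forced" and the sketch asks for a taint. If the default set
already includes them, nothing is forced and no taint is requested.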

(I'm not a virtualization expert, but AIUI IA32_XFD_ERR has some
issues.  If it's too late to fix those issues, Intel could probably
get away with completely dropping IA32_XFD_ERR from the spec -- OSes
can handle AMX just fine without it.  Then the next XFD-able feature
could introduce a new improved way of reporting which feature
triggered #NM.)