On Fri, Sep 15, 2023, Weijiang Yang wrote: > On 9/15/2023 1:40 AM, Dave Hansen wrote: > > On 9/13/23 23:33, Yang Weijiang wrote: > > > --- a/arch/x86/kernel/fpu/xstate.c > > > +++ b/arch/x86/kernel/fpu/xstate.c > > > @@ -1636,9 +1636,17 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest) > > > /* Calculate the resulting kernel state size */ > > > mask = permitted | requested; > > > - /* Take supervisor states into account on the host */ > > > + /* > > > + * Take supervisor states into account on the host. And add > > > + * kernel dynamic xfeatures to guest since guest kernel may > > > + * enable corresponding CPU feaures and the xstate registers > > > + * need to be saved/restored properly. > > > + */ > > > if (!guest) > > > mask |= xfeatures_mask_supervisor(); > > > + else > > > + mask |= fpu_kernel_dynamic_xfeatures; This looks wrong. Per commit 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor states in XSTATE permissions"), mask at this point only contains user features, which somewhat unintuitively doesn't include CET_USER (I get that they're MSRs and thus supervisor state, it's just the name that's odd). IIUC, the "dynamic" features contains CET_KERNEL, whereas xfeatures_mask_supervisor() conatins PASID, CET_USER, and CET_KERNEL. PASID isn't virtualized by KVM, but doesn't that mean CET_USER will get dropped/lost if userspace requests AMX/XTILE enabling? The existing code also seems odd, but I might be missing something. Won't the kernel drop PASID if the guest request AMX/XTILE? I'm not at all familiar with what PASID state is managed via XSAVE, so I've no idea if that's an actual problem or just an oddity. > > > ksize = xstate_calculate_size(mask, compacted); > > Heh, you changed the "guest" naming in "fpu_kernel_dynamic_xfeatures" > > but didn't change the logic. > > > > As it's coded at the moment *ALL* "fpu_kernel_dynamic_xfeatures" are > > guest xfeatures. So, they're different in name only. ... > > Would there ever be any reason for KVM to be on a system which supports a > > dynamic kernel feature but where it doesn't get enabled for guest use, or > > at least shouldn't have the FPU space allocated? > > I haven't heard of that kind of usage for other features so far, CET > supervisor xstate is the only dynamic kernel feature now, not sure whether > other CPU features having supervisor xstate would share the handling logic > like CET does one day. There are definitely scenarios where CET will not be exposed to KVM guests, but I don't see any reason to make the guest FPU space dynamically sized for CET. It's what, 40 bytes? I would much prefer to avoid the whole "dynamic" thing and instead make CET explicitly guest-only. E.g. fpu_kernel_guest_only_xfeatures? Or even better if it doesn't cause weirdness elsewhere, a dedicated fpu_guest_cfg. For me at least, a fpu_guest_cfg would make it easier to understand what all is going on.