On Thu, Jan 12, 2023, Chang S. Bae wrote:
> On 1/12/2023 11:17 AM, Mingwei Zhang wrote:
> >
> > But the permitted_xcr0 and supported_xcr0 seems never used
> > directly as the parameter of XSETBV, but only for size calculation.
>
> Yeah, I saw that too, and tried to improve it [1]. Maybe this is not a big
> deal in KVM.
>
> > One more question: I am very confused by the implementation of
> > xstate_get_guest_group_perm(). It seems fetching the xfeatures from the
> > host process (&current->group_leader->thread.fpu). Is this intentional?
> > Does that mean in order to enable AMX in the guest we have to enable it
> > on the host process first?
>
> Yes, it was designed that QEMU requests permissions via arch_prctl() before
> creating vCPU threads [2][3]. Granted, this feature capability will be
> advertised to the guest. Then, it will *enable* the feature, right?
>
> Thanks,
> Chang
>
> [1] https://lore.kernel.org/kvm/20220823231402.7839-2-chang.seok.bae@xxxxxxxxx/
> [2] https://lore.kernel.org/lkml/87wnmf66m5.ffs@tglx/
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=980fe2fddcff
>

Thanks for the clarification. That was not what I expected, since I had
assumed that enabling AMX in the guest would be orthogonal to enabling it
in the host. But since AMX requires a dynamically sized fp_state, host
awareness of the larger fp_state is clearly intentional. My only comment
is that this does not seem to follow the principle of least privilege, as
the host process (QEMU) may have no reason to do any matrix multiplication
itself. But this is a minor point.

Since this is enabled once per process, I am wondering: after
arch_prctl(2) is invoked, will all of the host threads have a larger
fp_state? If so, that might be a sizeable overhead, since host userspace
may have lots of threads doing various other things, i.e., they may not
be vCPU threads.
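
For reference, the permission request discussed above can be sketched from
userspace roughly as follows. This is only a minimal Python/ctypes
illustration, not what QEMU actually does. The option values and the
XTILE_DATA feature number (18) are as I understand them from the x86 uapi
header asm/prctl.h and the kernel's xstate documentation, the syscall number
158 is x86-64 only, and the helper names (arch_prctl, get_mask,
request_guest_amx) are made up for this example. I am also assuming that the
guest-permission variant, ARCH_REQ_XCOMP_GUEST_PERM, is the relevant one for
a VMM.

```python
import ctypes
import errno
import platform

# arch_prctl() options, assumed to match the x86 uapi header asm/prctl.h.
ARCH_GET_XCOMP_SUPP = 0x1021        # features the kernel supports
ARCH_GET_XCOMP_PERM = 0x1022        # features this process may use itself
ARCH_REQ_XCOMP_PERM = 0x1023        # request permission for the process
ARCH_GET_XCOMP_GUEST_PERM = 0x1024  # features the process may expose to guests
ARCH_REQ_XCOMP_GUEST_PERM = 0x1025  # request permission for guest use

XFEATURE_XTILE_DATA = 18            # AMX tile data component number
XFEATURE_MASK_XTILE_DATA = 1 << XFEATURE_XTILE_DATA

SYS_ARCH_PRCTL = 158                # syscall number on x86-64 only


def arch_prctl(option, arg):
    """Hypothetical helper: return 0 on success, a negative errno on failure."""
    is_x86_64_linux = (
        platform.system() == "Linux"
        and platform.machine() == "x86_64"
        and ctypes.sizeof(ctypes.c_void_p) == 8
    )
    if not is_x86_64_linux:
        return -errno.ENOSYS
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ctypes.set_errno(0)
    rc = libc.syscall(ctypes.c_long(SYS_ARCH_PRCTL), ctypes.c_long(option), arg)
    return 0 if rc == 0 else -(ctypes.get_errno() or errno.EIO)


def get_mask(option):
    """Read a 64-bit xfeature bitmask; None if the kernel rejects the query."""
    mask = ctypes.c_uint64(0)
    rc = arch_prctl(option, ctypes.byref(mask))
    return mask.value if rc == 0 else None


def request_guest_amx():
    """Ask for permission to expose AMX tile data to guests.

    The permission is per process, so a VMM would do this once,
    before creating any vCPU threads.
    """
    return arch_prctl(ARCH_REQ_XCOMP_GUEST_PERM,
                      ctypes.c_ulong(XFEATURE_XTILE_DATA))


if __name__ == "__main__":
    print("supported mask:  ", get_mask(ARCH_GET_XCOMP_SUPP))
    print("request returned:", request_guest_amx())
    print("guest perm mask: ", get_mask(ARCH_GET_XCOMP_GUEST_PERM))
```

On a kernel or CPU without AMX the request is expected to fail with a
negative errno, so a caller would probably want to treat that as "feature
unavailable" rather than as a fatal error.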