On Tue, Aug 23, 2022, Chang S. Bae wrote:
> == Background ==
>
> A set of architecture-specific prctl() options offer to control dynamic
> XSTATE components in VCPUs. Userspace VMMs may interact with the host using
> ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM.
>
> However, they are separated from the KVM API. KVM may select features that
> the host supports and advertise them through the KVM_X86_XCOMP_GUEST_SUPP
> attribute.
>
> == Problem ==
>
> QEMU [1] queries the features through the KVM API instead of using the x86
> arch_prctl() option. But it still needs to use arch_prctl() to request the
> permission. Then this step may become fragile because it does not guarantee
> to comply with the KVM policy.

But backdooring through KVM doesn't prevent userspace from walking in through
the front door (arch_prctl()), i.e. this doesn't protect the kernel in any way.
KVM needs to ensure that _KVM_ doesn't screw up and let userspace use features
that KVM doesn't support. The kernel's restrictions on using features go on
top, i.e. KVM must behave correctly irrespective of kernel restrictions.

If QEMU wants to assert that it didn't misconfigure itself, it can assert on
the config in any number of ways, e.g. assert that ARCH_GET_XCOMP_GUEST_PERM is
a subset of KVM_X86_XCOMP_GUEST_SUPP at the end of
kvm_request_xsave_components().
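
A rough sketch of what such a subset check could look like in userspace, not
actual QEMU code. The helper names get_guest_xcomp_perm() and
xcomp_perm_is_subset() are made up for illustration, and the sketch assumes
the KVM_X86_XCOMP_GUEST_SUPP bitmap has already been read through the KVM API
and is passed in as "supp":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#include <unistd.h>

/* Value from the kernel's uapi asm/prctl.h; defined here in case the
 * installed headers predate it. */
#ifndef ARCH_GET_XCOMP_GUEST_PERM
#define ARCH_GET_XCOMP_GUEST_PERM 0x1024
#endif

/*
 * Hypothetical helper: read the guest XSTATE permission bitmap of the
 * calling process via arch_prctl().  Returns 0 on success, -1 on failure
 * with errno set by the syscall.
 */
static inline int get_guest_xcomp_perm(uint64_t *perm)
{
	return (int)syscall(SYS_arch_prctl, ARCH_GET_XCOMP_GUEST_PERM, perm);
}
#endif

/*
 * Hypothetical helper: true if every feature bit set in @perm is also set
 * in @supp, i.e. the permissions granted by the kernel do not exceed what
 * KVM advertises via KVM_X86_XCOMP_GUEST_SUPP.
 */
static inline bool xcomp_perm_is_subset(uint64_t perm, uint64_t supp)
{
	return (perm & ~supp) == 0;
}
```

A caller would fetch "perm" with get_guest_xcomp_perm() after requesting
permissions and then do assert(xcomp_perm_is_subset(perm, supp)). This only
catches userspace misconfiguring itself; it is not a substitute for KVM
enforcing its own policy.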