On Fri, Jul 07, 2023, Anish Moorthy wrote: > > Hmm, well not having to modify the vendor code would be nice... but > > I'll have to look more at __gfn_to_pfn_memslot()'s callers (and > > probably send more questions your way :). Hopefully it works out more > > like what you suggest. > > I took a look of my own, and I don't think moving the nowait query > into __gfn_to_pfn_memslot() would work. At issue is the actual > behavior of KVM_CAP_NOWAIT_ON_FAULT, which I documented as follows: > > > The presence of this capability indicates that userspace may pass the > > KVM_MEM_NOWAIT_ON_FAULT flag to KVM_SET_USER_MEMORY_REGION to cause KVM_RUN > > to fail (-EFAULT) in response to page faults for which resolution would require > > the faulting thread to sleep. Well, that description is wrong for other reasons. As mentioned in my reply (got snipped), the behavior is not tied to sleeping or waiting on I/O. > Moving the nowait check out of __kvm_faultin_pfn()/user_mem_abort() > and into __gfn_to_pfn_memslot() means that, obviously, other callers > will start to see behavior changes. Some of that is probably actually > necessary for that documentation to be accurate (since any usages of > __gfn_to_pfn_memslot() under KVM_RUN should respect the memslot flag), > but I think there are consumers of __gfn_to_pfn_memslot() from outside > KVM_RUN. Yeah, replace "in response to page faults" with something along the lines of "if an access in guest context ..." > Anyways, after some searching on my end: I think the only caller of > __gfn_to_pfn_memslot() in core kvm/x86/arm64 where moving the "nowait" > check into the function actually changes anything is gfn_to_pfn(). But > that function gets called from vmx_vcpu_create() (through > kvm_alloc_apic_access_page()), and *that* certainly doesn't look like > something KVM_RUN does or would ever call. Correct, but that particular gfn_to_pfn() works on a KVM-internal memslot, i.e. will never have the "fast-only" flag set. hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, <=== APIC_DEFAULT_PHYS_BASE, PAGE_SIZE); if (IS_ERR(hva)) { ret = PTR_ERR(hva); goto out; } page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT); if (is_error_page(page)) { ret = -EFAULT; goto out; } On x86, there should not be any other usages of user memslots outside of KVM_RUN. arm64 is unfortunately a different story (see this thread[*]), but we may be able to solve that with a documentation update. I *think* the accesses are limited to the sub-ioctl KVM_DEV_ARM_VGIC_GRP_CTRL, and more precisely the sub-sub-ioctls KVM_DEV_ARM_ITS_{SAVE,RESTORE}_TABLES and KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES. [*] https://lore.kernel.org/all/Y1ghIKrAsRFwSFsO@xxxxxxxxxx