On Tue, Jul 11, 2023 at 8:29 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> Well, that description is wrong for other reasons. As mentioned in my reply
> (got snipped), the behavior is not tied to sleeping or waiting on I/O.
>
> > Moving the nowait check out of __kvm_faultin_pfn()/user_mem_abort()
> > and into __gfn_to_pfn_memslot() means that, obviously, other callers
> > will start to see behavior changes. Some of that is probably actually
> > necessary for that documentation to be accurate (since any usages of
> > __gfn_to_pfn_memslot() under KVM_RUN should respect the memslot flag),
> > but I think there are consumers of __gfn_to_pfn_memslot() from outside
> > KVM_RUN.
>
> Yeah, replace "in response to page faults" with something along the lines of "if
> an access in guest context ..."

Alright, how about

+ KVM_MEM_NO_USERFAULT_ON_GUEST_ACCESS
+ The presence of this capability indicates that userspace may pass the
+ KVM_MEM_NO_USERFAULT_ON_GUEST_ACCESS flag to
+ KVM_SET_USER_MEMORY_REGION. Said flag will cause KVM_RUN to fail (-EFAULT)
+ in response to guest-context memory accesses which would require KVM
+ to page fault on the userspace mapping.

Although, as Wang mentioned, USERFAULT seems to suggest something related to
userfaultfd, which is a liiiiitle too specific. Perhaps we should use
USERSPACE_FAULT (*cries*) instead?

On Wed, Jun 14, 2023 at 2:20 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> However, do we actually need to force vendor code to query nowait? At a glance,
> the only external (relative to kvm_main.c) users of __gfn_to_pfn_memslot() are
> in flows that play nice with nowait or that don't support it at all. So I *think*
> we can do this?

On Wed, Jun 14, 2023 at 2:23 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> Gah, got turned around and forgot to account for @atomic. So this?
> 	if (!atomic && memslot_is_nowait_on_fault(slot)) {
> 		atomic = true;
> 		if (async) {
> 			*async = false;
> 			async = NULL;
> 		}
> 	}
>
> > +
> > 	return hva_to_pfn(addr, atomic, interruptible, async, write_fault,
> > 			  writable);
> > }

Took another look at this and I *think* it works too (I added my notes at
the end here, so anyone who wants to double-check can). But there are a
couple of quirks:

1. The memslot flag can cause new __gfn_to_pfn_memslot() failures in
   kvm_vcpu_map() (good thing!), but those failures result in an EINVAL
   (bad!). It kinda looks like the existing is_error_noslot_pfn() check in
   that function should be returning EFAULT anyway, though. Any opinions?

2. kvm_vm_ioctl_mte_copy_tags() will see new failures. This function has
   come up before (a), and it doesn't seem like an access in guest
   context. Is this something to just be documented away?

3. I don't think I've caught the parts of the who-calls tree that live in
   drivers/. The one part I know about is the gfn_to_pfn() call in
   is_2MB_gtt_possible() (drivers/gpu/drm/i915/gvt/gtt.c), and I'm not
   sure what to do about it. Again, it doesn't look like a guest-context
   access.

(a) https://lore.kernel.org/kvm/ZIoiLGotFsDDvRN3@xxxxxxxxxx/T/#u

---------------------------------------------------
Notes: Tracing the usages of __gfn_to_pfn_memslot()

"shove" = "moving the nowait check inside of __gfn_to_pfn_memslot()"

* [x86] __gfn_to_pfn_memslot() has 5 callers
** DONE kvm_faultin_pfn()
   Accounts for two calls; shove will cause a bail (as intended) after the
   first.
** DONE __gfn_to_pfn_prot(): no usages on x86
** DONE __gfn_to_pfn_memslot_atomic(): already requires atomic access :)
** gfn_to_pfn_memslot() has two callers
*** DONE kvm_vcpu_gfn_to_pfn(): no callers
*** gfn_to_pfn() has two callers
**** TODO kvm_vcpu_map()
     When the memslot flag trips, it will get KVM_PFN_ERR_FAULT; the error
     is handled, HOWEVER it will return -EINVAL, which is kinda...
     not right.
**** DONE gfn_to_page()
     Has two callers, but both operate on the APIC_DEFAULT_PHYS_BASE addr.
** Ok so that's "done," as long as my LSP is reliable.
* [arm64] __gfn_to_pfn_memslot() has 4 callers
** DONE user_mem_abort()
   Has one, accounted for by the subsequent is_error_noslot_pfn().
** DONE __gfn_to_pfn_memslot_atomic(): fine, as above
** TODO gfn_to_pfn_prot()
   One caller in kvm_vm_ioctl_mte_copy_tags(). There's an
   is_error_noslot_pfn() to catch the failure, but there's no vCPU
   floating around to annotate a fault in!
** gfn_to_pfn_memslot() has two callers
*** DONE kvm_vcpu_gfn_to_pfn(): no callers
*** gfn_to_pfn() has two callers
**** kvm_vcpu_map(): as above
**** DONE gfn_to_page(): no callers
* TODO Weird driver code reference I discovered only via ripgrep?