Re: [PATCH v3 00/22] Improve scalability of KVM + userfaultfd live migration via annotated memory faults.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 10, 2023 at 4:44 PM Anish Moorthy <amoorthy@xxxxxxxxxx> wrote:
>
> On Wed, May 10, 2023 at 3:35 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > Yeah, when I speed read the series, several of the conversions stood out as being
> > "wrong".  My (potentially unstated) idea was that KVM would only signal
> > KVM_EXIT_MEMORY_FAULT when the -EFAULT could be traced back to a user access,
> > i.e. when the fault _might_ be resolvable by userspace.
>
> Sean, besides direct_map which other patches did you notice as needing
> to be dropped/marked as unrecoverable errors?

I tried going through on my own to try and identify the incorrect
annotations: here's my read.

Correct (or can easily be corrected)
-----------------------------------------------
- user_mem_abort
  Incorrect as is: the annotations in patch 19 are incorrect, as they
cover an error-on-no-slot case and one more I don't fully understand:
the one in patch 20 should be good though.

- kvm_vcpu_read/write_guest_page:
  Incorrect as-is, but can fixed: the current annotations cover
gpa_to_hva_memslot(_prot) failures, which can happen when "gpa" is not
converted by a memslot. However we can leave these as bare efaults and
just annotate the copy_to/from_user failures, which userspace should
be able to resolve by checking/changing the slot permissions.

- kvm_handle_error_pfn
  Correct: at the annotation point, the fault must be either a (a)
read/write to a writable memslot or (b) read from a readable one.
hva_to_pfn must have returned KVM_PFN_ERR_FAULT, which userspace can
attempt to resolve using a MADV

Flatly Incorrect (will drop in next version)
-----------------------------------------------
- kvm_handle_page_fault
  efault corresponds to a kernel bug not resolvable by userspace

- direct_map
  Same as above

- kvm_mmu_page_fault
  Not a "leaf" return of efault, Also, the
check-for-efault-and-annotate here catches efaults which userspace can
do nothing about: such as the one from direct_map [1]

Unsure (Switch kvm_read/write_guest to kvm_vcpu_read/write_guest?)
-----------------------------------------------

- setup_vmgexit_scratch and kvm_pv_clock_pairing
  These efault on errors from kvm_read/write_guest, and theoretically
it does seem to make sense to annotate them. However, the annotations
are incorrect as is for the same reason that the
kvm_vcpu_read/write_guest_page need to be corrected.

In fact, the kvm_read/write_guest calls are of the form
"kvm_read_guest(vcpu->kvm, ...)": if we switched these calls to
kvm_vcpu_read/write_guest instead, then it seems like we'd get correct
annotations for free. Would it be correct to make this switch? If not,
then perhaps an optional kvm_vcpu* parameter for the "non-vcpu"
read/write functions strictly for annotation purposes? That seems
rather ugly though...

Unsure (Similar-ish to above)
-----------------------------------------------

- kvm_hv_get_assist_page
  Incorrect as-is. The existing annotation would cover some efaults
which it doesn't seem likely that userspace can resolve [2]. Right
after those though, there's a copy_from_user which it could make sense
to annotate.

The efault here comes from failures of
kvm_read_guest_cached/kvm_read_guest_offset_cached, for which all of
the calls are again of the form "f(vcpu->kvm, ...)". Again, we'll need
either an (optional) vcpu parameter or to refactor these to just take
a "kvm_vcpu" instead if we want to annotate just the failing
uaccesses.

PS: I plan to add a couple of flags to the memory fault exit to
identify whether the failed access was a read/write/exec


[1] https://github.com/torvalds/linux/blob/v6.3/arch/x86/kvm/mmu/mmu.c#L3196
[2] https://github.com/torvalds/linux/blob/v6.3/virt/kvm/kvm_main.c#L3261-L3270




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux