Re: [WIP Patch v2 04/14] KVM: x86: Add KVM_CAP_X86_MEMORY_FAULT_EXIT and associated kvm_run field

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 20, 2023 at 8:53 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Mar 17, 2023, Anish Moorthy wrote:
> > On Fri, Mar 17, 2023 at 2:50 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > I wonder if we can get away with returning -EFAULT, but still filling vcpu->run
> > > with KVM_EXIT_MEMORY_FAULT and all the other metadata.  That would likely simplify
> > > the implementation greatly, and would let KVM fill vcpu->run unconditonally.  KVM
> > > would still need a capability to advertise support to userspace, but userspace
> > > wouldn't need to opt in.  I think this may have been my very original though, and
> > > I just never actually wrote it down...
> >
> > Oh, good to know that's actually an option. I thought of that too, but
> > assumed that returning a negative error code was a no-go for a proper
> > vCPU exit. But if that's not true then I think it's the obvious
> > solution because it precludes any uncaught behavior-change bugs.
> >
> > A couple of notes
> > 1. Since we'll likely miss some -EFAULT returns, we'll need to make
> > sure that the user can check for / doesn't see a stale
> > kvm_run::memory_fault field when a missed -EFAULT makes it to
> > userspace. It's a small and easy-to-fix detail, but I thought I'd
> > point it out.
>
> Ya, this is the main concern for me as well.  I'm not as confident that it's
> easy-to-fix/avoid though.
>
> > 2. I don't think this would simplify the series that much, since we
> > still need to find the call sites returning -EFAULT to userspace and
> > populate memory_fault only in those spots to avoid populating it for
> > -EFAULTs which don't make it to userspace.
>
> Filling kvm_run::memory_fault even if KVM never exits to userspace is perfectly
> ok.  It's not ideal, but it's ok.
>
> > We *could* relax that condition and just document that memory_fault should be
> > ignored when KVM_RUN does not return -EFAULT... but I don't think that's a
> > good solution from a coder/maintainer perspective.
>
> You've got things backward.  memory_fault _must_ be ignored if KVM doesn't return
> the associated "magic combo", where the magic value is either "0+KVM_EXIT_MEMORY_FAULT"
> or "-EFAULT+KVM_EXIT_MEMORY_FAULT".
>
> Filling kvm_run::memory_fault but not exiting to userspace is ok because userspace
> never sees the data, i.e. userspace is completely unaware.  This behavior is not
> ideal from a KVM perspective as allowing KVM to fill the kvm_run union without
> exiting to userspace can lead to other bugs, e.g. effective corruption of the
> kvm_run union, but at least from a uABI perspective, the behavior is acceptable.

Actually, I don't think the idea of filling in kvm_run.memory_fault
for -EFAULTs which don't make it to userspace works at all. Consider
the direct_map function, which bubbles its -EFAULT to
kvm_mmu_do_page_fault. kvm_mmu_do_page_fault is called from both
kvm_arch_async_page_ready (which ignores the return value), and by
kvm_mmu_page_fault (where the return value does make it to userspace).
Populating kvm_run.memory_fault anywhere in or under
kvm_mmu_do_page_fault seems an immediate no-go, because a wayward
kvm_arch_async_page_ready could (presumably) overwrite/corrupt an
already-set kvm_run.memory_fault / other kvm_run field.

That in turn looks problematic for the
memory-fault-exit-on-fast-gup-failure part of this series, because
there are at least a couple of cases for which kvm_mmu_do_page_fault
will -EFAULT. One is the early-efault-on-fast-gup-failure case which
was the original purpose of this series. Another is a -EFAULT from
FNAME(fetch) (passed up through FNAME(page_fault)). There might be
other cases as well. But unless userspace can/should resolve *all*
such -EFAULTs in the same manner, a kvm_run.memory_fault populated in
"kvm_mmu_page_fault" wouldn't be actionable. At least, not without a
whole lot of plumbing code to make it so.

Sean, am I missing anything here?




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux