Re: [RFC PATCH v12 07/33] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace

David Matlack <dmatlack@xxxxxxxxxx> · Tue, 10 Oct 2023 15:21:37 -0700

On Thu, Oct 5, 2023 at 3:46 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Oct 05, 2023, Anish Moorthy wrote:
> > On Tue, Oct 3, 2023 at 4:46 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > >
> > > The only way a KVM_EXIT_MEMORY_FAULT that actually reaches userspace could be
> > > "unreliable" is if something other than a memory_fault exit clobbered the union,
> > > but didn't signal its KVM_EXIT_* reason.  And that would be an egregious bug that
> > > isn't unique to KVM_EXIT_MEMORY_FAULT, i.e. the same data corruption would affect
> > > each and every other KVM_EXIT_* reason.
> >
> > Keep in mind the case where an "unreliable" annotation sets up a
> > KVM_EXIT_MEMORY_FAULT, KVM_RUN ends up continuing, then something
> > unrelated comes up and causes KVM_RUN to EFAULT. Although this at
> > least is a case of "outdated" information rather than blatant
> > corruption.
>
> Drat, I managed to forget about that.
>
> > IIRC the last time this came up we said that there's minimal harm in
> > userspace acting on the outdated info, but it seems like another good
> > argument for just restricting the annotations to paths we know are
> > reliable. What if the second EFAULT above is fatal (as I understand
> > all are today) and sets up subsequent KVM_RUNs to crash and burn
> > somehow? Seems like that'd be a safety issue.
>
> For your series, let's omit
>
>   KVM: Annotate -EFAULTs from kvm_vcpu_read/write_guest_page
>
> and just fill memory_fault for the page fault paths.  That will be easier to
> document too since we can simply say that if the exit reason is KVM_EXIT_MEMORY_FAULT,
> then run->memory_fault is valid and fresh.

+1

And from a performance perspective, I don't think we care about
kvm_vcpu_read/write_guest_page(). Our (Google) KVM Demand Paging
implementation just sends any kvm_vcpu_read/write_guest_page()
requests through the netlink socket, which is just a poor man's
userfaultfd. So I think we'll be fine sending these callsites through
uffd instead of exiting out to userspace.

And with that out of the way, is there any reason to keep tying
KVM_EXIT_MEMORY_FAULT to -EFAULT? As mentioned in the patch at the top
of this thread, -EFAULT is just a hack to allow the emulator paths to
return out to userspace. But that's no longer necessary. I just find
it odd that some KVM_EXIT_* correspond with KVM_RUN returning an error
and others don't. The exit_reason is sufficient to tell userspace
what's going on and has a firm contract, unlike -EFAULT which anything
KVM calls into can return.

>
> Adding a flag or whatever to mark the data as trustworthy would be the alternative,
> but that's effectively adding ABI that says "KVM is buggy, sorry".
>
> My dream of having KVM always return useful information for -EFAULT will have to
> wait for another day.