On Tue, Oct 10, 2023, David Matlack wrote:
> On Thu, Oct 5, 2023 at 3:46 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Thu, Oct 05, 2023, Anish Moorthy wrote:
> > > On Tue, Oct 3, 2023 at 4:46 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > >
> > > > The only way a KVM_EXIT_MEMORY_FAULT that actually reaches userspace could be
> > > > "unreliable" is if something other than a memory_fault exit clobbered the union,
> > > > but didn't signal its KVM_EXIT_* reason.  And that would be an egregious bug that
> > > > isn't unique to KVM_EXIT_MEMORY_FAULT, i.e. the same data corruption would affect
> > > > each and every other KVM_EXIT_* reason.
> > >
> > > Keep in mind the case where an "unreliable" annotation sets up a
> > > KVM_EXIT_MEMORY_FAULT, KVM_RUN ends up continuing, then something
> > > unrelated comes up and causes KVM_RUN to EFAULT. Although this at
> > > least is a case of "outdated" information rather than blatant
> > > corruption.
> >
> > Drat, I managed to forget about that.
> >
> > > IIRC the last time this came up we said that there's minimal harm in
> > > userspace acting on the outdated info, but it seems like another good
> > > argument for just restricting the annotations to paths we know are
> > > reliable. What if the second EFAULT above is fatal (as I understand
> > > all are today) and sets up subsequent KVM_RUNs to crash and burn
> > > somehow? Seems like that'd be a safety issue.
> >
> > For your series, let's omit
> >
> >   KVM: Annotate -EFAULTs from kvm_vcpu_read/write_guest_page
> >
> > and just fill memory_fault for the page fault paths.  That will be easier to
> > document too since we can simply say that if the exit reason is KVM_EXIT_MEMORY_FAULT,
> > then run->memory_fault is valid and fresh.
>
> +1
>
> And from a performance perspective, I don't think we care about
> kvm_vcpu_read/write_guest_page().
> Our (Google) KVM Demand Paging
> implementation just sends any kvm_vcpu_read/write_guest_page()
> requests through the netlink socket, which is just a poor man's
> userfaultfd. So I think we'll be fine sending these callsites through
> uffd instead of exiting out to userspace.
>
> And with that out of the way, is there any reason to keep tying
> KVM_EXIT_MEMORY_FAULT to -EFAULT? As mentioned in the patch at the top
> of this thread, -EFAULT is just a hack to allow the emulator paths to
> return out to userspace. But that's no longer necessary.

Not forcing '0' makes handling other error codes simpler, e.g. if the memory is
poisoned, KVM can simply return -EHWPOISON instead of having to add a flag to
run->memory_fault[*].

KVM would also have to make returning '0' instead of -EFAULT conditional based on
a capability being enabled.

And again, committing to returning '0' will make it all but impossible to extend
KVM_EXIT_MEMORY_FAULT beyond the page fault handlers.  Well, I suppose we could
have the top level kvm_arch_vcpu_ioctl_run() do

	if (r == -EFAULT && vcpu->kvm->enable_memory_fault_exits &&
	    kvm_run->exit_reason == KVM_EXIT_MEMORY_FAULT)
		r = 0;

but that's quite gross IMO.

> I just find it odd that some KVM_EXIT_* correspond with KVM_RUN returning an
> error and others don't.

FWIW, there is already precedent for run->exit_reason being valid with a non-zero
error code.  E.g. KVM selftests relies on run->exit_reason being preserved when
forcing an immediate exit, which returns -EINTR, not '0'.

	if (kvm_run->immediate_exit) {
		r = -EINTR;
		goto out;
	}

And pre-immediate_exit code that relies on signalling vCPUs is even more explicit
in setting exit_reason with a non-zero errno:

	if (signal_pending(current)) {
		r = -EINTR;
		kvm_run->exit_reason = KVM_EXIT_INTR;
		++vcpu->stat.signal_exits;
	}

I agree that -EFAULT with KVM_EXIT_MEMORY_FAULT *looks* a little odd, but IMO the
existing KVM behavior of returning '0' is actually what's truly odd.  E.g.
returning '0' + KVM_EXIT_MMIO if the guest accesses non-existent memory is
downright weird.  KVM_RUN should arguably never return '0', because it can never
actually completely succeed.

> The exit_reason is sufficient to tell userspace what's going on and has a
> firm contract, unlike -EFAULT which anything KVM calls into can return.

Eh, I don't think it lessens the contract in a meaningful way.  KVM is still
contractually obligated to fill run->exit_reason when KVM returns '0', and
userspace will still likely terminate the VM on an undocumented EFAULT/EHWPOISON.

E.g. if KVM has a bug and doesn't return KVM_EXIT_MEMORY_FAULT when handling a
page fault, then odds are very good that the bug would result in KVM returning a
"bare" -EFAULT regardless of whether KVM_EXIT_MEMORY_FAULT is paired with '0' or
-EFAULT.

[*] https://lore.kernel.org/all/ZQHzVOIsesTTysgf@xxxxxxxxxx
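
To make the userspace side of the argument above concrete, here is a rough
sketch of how a VMM run loop might classify the result of KVM_RUN if
KVM_EXIT_MEMORY_FAULT stays paired with -EFAULT.  Everything named demo_* is
hypothetical and only stands in for struct kvm_run and the KVM_EXIT_* values;
none of it is the real uAPI.  It also assumes the annotation is limited to the
page fault paths, so that exit_reason is fresh whenever it reads as a memory
fault.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real KVM_EXIT_* values; illustration only. */
enum demo_exit_reason {
	DEMO_EXIT_UNKNOWN = 0,
	DEMO_EXIT_MMIO,
	DEMO_EXIT_INTR,
	DEMO_EXIT_MEMORY_FAULT,
};

/* Hypothetical, trimmed-down view of the one kvm_run field used here. */
struct demo_run {
	enum demo_exit_reason exit_reason;
};

enum demo_action {
	DEMO_HANDLE_EXIT,	/* KVM_RUN returned 0: dispatch on exit_reason */
	DEMO_RESOLVE_FAULT,	/* annotated -EFAULT: fix up memory, re-enter */
	DEMO_RETRY,		/* -EINTR: handle the signal, re-enter */
	DEMO_FATAL,		/* bare or unknown error: terminate the VM */
};

/*
 * ret is the ioctl() return value, err is errno sampled right after the call.
 * A "bare" -EFAULT (exit_reason is not a memory fault) stays fatal, which
 * matches what userspace would likely do today.
 */
enum demo_action demo_classify(int ret, int err, const struct demo_run *run)
{
	assert(run != NULL);

	if (ret == 0)
		return DEMO_HANDLE_EXIT;

	switch (err) {
	case EINTR:
		return DEMO_RETRY;
	case EFAULT:
		if (run->exit_reason == DEMO_EXIT_MEMORY_FAULT)
			return DEMO_RESOLVE_FAULT;
		return DEMO_FATAL;
	default:
		return DEMO_FATAL;
	}
}
```

In this sketch the errno and exit_reason are checked together, so a KVM bug that
drops the annotation degrades to the same fatal path as a bare -EFAULT.  A
further errno such as -EHWPOISON could be given its own case in the switch
without touching the exit_reason handling.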