On Mon, Mar 20, 2023 at 8:53 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Mar 17, 2023, Anish Moorthy wrote:
> > On Fri, Mar 17, 2023 at 2:50 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > I wonder if we can get away with returning -EFAULT, but still filling vcpu->run
> > > with KVM_EXIT_MEMORY_FAULT and all the other metadata.  That would likely simplify
> > > the implementation greatly, and would let KVM fill vcpu->run unconditionally.  KVM
> > > would still need a capability to advertise support to userspace, but userspace
> > > wouldn't need to opt in.  I think this may have been my very original thought, and
> > > I just never actually wrote it down...
> >
> > Oh, good to know that's actually an option. I thought of that too, but
> > assumed that returning a negative error code was a no-go for a proper
> > vCPU exit. But if that's not true then I think it's the obvious
> > solution because it precludes any uncaught behavior-change bugs.
> >
> > A couple of notes:
> > 1. Since we'll likely miss some -EFAULT returns, we'll need to make
> > sure that the user can check for / doesn't see a stale
> > kvm_run::memory_fault field when a missed -EFAULT makes it to
> > userspace. It's a small and easy-to-fix detail, but I thought I'd
> > point it out.
>
> Ya, this is the main concern for me as well.  I'm not as confident that it's
> easy-to-fix/avoid though.
>
> > 2. I don't think this would simplify the series that much, since we
> > still need to find the call sites returning -EFAULT to userspace and
> > populate memory_fault only in those spots to avoid populating it for
> > -EFAULTs which don't make it to userspace.
>
> Filling kvm_run::memory_fault even if KVM never exits to userspace is perfectly
> ok.  It's not ideal, but it's ok.

Right, I was just pointing out that doing so could mislead readers of
the code if they assume that kvm_run::memory_fault is populated iff it
is going to be associated w/ an exit to userspace, which I know I
would.

> > We *could* relax that condition and just document that memory_fault should be
> > ignored when KVM_RUN does not return -EFAULT... but I don't think that's a
> > good solution from a coder/maintainer perspective.
>
> You've got things backward.  memory_fault _must_ be ignored if KVM doesn't return
> the associated "magic combo", where the magic value is either "0+KVM_EXIT_MEMORY_FAULT"
> or "-EFAULT+KVM_EXIT_MEMORY_FAULT".

I think we're saying the same thing; I was using "should" to mean "must."

> Filling kvm_run::memory_fault but not exiting to userspace is ok because userspace
> never sees the data, i.e. userspace is completely unaware.  This behavior is not
> ideal from a KVM perspective as allowing KVM to fill the kvm_run union without
> exiting to userspace can lead to other bugs, e.g. effective corruption of the
> kvm_run union

Ooh, I didn't think of the corruption issue here: thanks for pointing it out.

> but at least from a uABI perspective, the behavior is acceptable.

This does complicate things for the KVM implementation though, right? In
particular, we'd have to make sure that KVM_RUN never conditionally
modifies its return value/exit reason based on reads from kvm_run:
that seems like a slightly weird thing to do, but I don't want to
assume anything here.

Anyways, unless that's not (and never will be) a problem, allowing
corruption of kvm_run seems very risky.

> The reverse, userspace consuming kvm_run::memory_fault without being explicitly
> told the data is valid, is not ok/safe.  KVM's contract is that fields contained
> in kvm_run's big union are valid if and only if KVM returns '0' and the associated
> exit reason is set in kvm_run::exit_reason.
>
> From an ABI perspective, I don't see anything fundamentally wrong with bending
> that rule slightly by saying that kvm_run::memory_fault is valid if KVM returns
> -EFAULT+KVM_EXIT_MEMORY_FAULT.
> It won't break existing userspace that is unaware
> of KVM_EXIT_MEMORY_FAULT, and userspace can precisely check for the combination.
>
> My big concern with piggybacking -EFAULT is that userspace will be fed stale data if
> KVM exits with -EFAULT in a path that _doesn't_ fill kvm_run::memory_fault.
> Returning a negative error code isn't hazardous in and of itself, e.g. KVM has
> had bugs in the past where KVM returns '0' but doesn't fill kvm_run::exit_reason.
> The big danger is that KVM has existing paths that return -EFAULT, i.e. we can
> introduce bugs simply by doing nothing, whereas returning '0' would largely be
> limited to new code.
>
> The counter-argument is that propagating '0' correctly up the stack carries its
> own risk due to plenty of code correctly treating '0' as "success" and not "exit
> to userspace".
>
> And we can mitigate the risk of using -EFAULT.  E.g. fill in kvm_run::memory_fault
> even if we are 99.9999% confident the -EFAULT can't get out to userspace in the
> context of KVM_RUN, and set kvm_run::exit_reason to some arbitrary value at the
> start of KVM_RUN to prevent reusing memory_fault from a previous userspace exit.

Right, this is what I had in mind when I called this "small and easy-to-fix."

Piggybacking -EFAULT seems like the right thing to do to me, but I'm
still uneasy about possibly corrupting kvm_run for masked -EFAULTs.
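
To make the contract discussed above concrete, here is a minimal sketch
in plain C. It is a model only, not KVM code: struct sketch_run, the
SKETCH_* constants, the gpa/len fields and all three helpers are
hypothetical stand-ins rather than the real uAPI, and return values use
the kernel-style "0 or -errno" convention instead of ioctl()'s -1/errno.
It shows the "magic combo" check plus the mitigation of clobbering
exit_reason on entry to KVM_RUN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the definitions from <linux/kvm.h>. */
#define SKETCH_EXIT_UNKNOWN      0u
#define SKETCH_EXIT_MEMORY_FAULT 1000u /* placeholder value */

/* Simplified model of the shared run structure (assumed layout). */
struct sketch_run {
	uint32_t exit_reason;
	struct {
		uint64_t gpa; /* assumed field: faulting guest address */
		uint64_t len; /* assumed field: length of the access */
	} memory_fault;
};

/*
 * Mitigation from the thread: reset exit_reason at the start of each
 * run so that a later, unannotated -EFAULT cannot be paired with
 * memory_fault data left over from a previous exit.
 */
void sketch_run_enter(struct sketch_run *run)
{
	assert(run != NULL);
	run->exit_reason = SKETCH_EXIT_UNKNOWN;
}

/* An annotated failure path: fill the metadata, then return -EFAULT. */
int sketch_fail_with_fault(struct sketch_run *run, uint64_t gpa, uint64_t len)
{
	assert(run != NULL);
	run->exit_reason = SKETCH_EXIT_MEMORY_FAULT;
	run->memory_fault.gpa = gpa;
	run->memory_fault.len = len;
	return -EFAULT;
}

/*
 * Userspace-side check: memory_fault may be consumed only for the
 * combinations "0 + MEMORY_FAULT" or "-EFAULT + MEMORY_FAULT".
 */
bool sketch_memory_fault_valid(int ret, const struct sketch_run *run)
{
	assert(run != NULL);
	if (run->exit_reason != SKETCH_EXIT_MEMORY_FAULT)
		return false;
	return ret == 0 || ret == -EFAULT;
}
```

In this model, a path that returns -EFAULT without calling
sketch_fail_with_fault() leaves exit_reason at the reset value, so the
validity check fails and userspace falls back to treating it as a bare
-EFAULT. It does not address the separate worry about the union being
overwritten when the -EFAULT is masked and never reaches userspace.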