On Wed, Aug 16, 2023, Anish Moorthy wrote: > > but the names just need to be unique, e.g. the below compiles just fine. So unless > > someone has a better idea, using a separate union for exits that might be clobbered > > seems like the way to go. > > Agreed. By the way, what was the reason why you wanted to avoid the > exit reason canary being ABI? Because it doesn't need to be exposed to usersepace, and it would be quite unfortunate if we ever need/want to drop the canary, but can't because it's exposed to userspace. Though I have no idea why I suggested it be placed in kvm_run, the canary can simply go in kvm_vcpu. I'm guessing I was going for code locality, but abusing an #ifdef to achieve that is silly. > On Wed, Jun 14, 2023 at 10:35 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > And rather than use kvm_run.exit_reason as the canary, we should carve out a > > kernel-only, i.e. non-ABI, field from the union. That would allow setting the > > canary in common KVM code, which can't be done for kvm_run.exit_reason because > > some architectures, e.g. s390 (and x86 IIRC), consume the exit_reason early on > > in KVM_RUN. > > > > E.g. something like this (the #ifdefs are heinous, it might be better to let > > userspace see the exit_canary, but make it abundantly clear that it's not ABI). > > > > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > > index 143abb334f56..233702124e0a 100644 > > --- a/include/uapi/linux/kvm.h > > +++ b/include/uapi/linux/kvm.h > > @@ -511,16 +511,43 @@ struct kvm_run { > > + /* > > + * This second KVM_EXIT_* union holds structures for exits that may be > > + * triggered after KVM has already initiated a different exit, and/or > > + * may be filled speculatively by KVM. E.g. because of limitations in > > + * KVM's uAPI, a memory fault can be encountered after an MMIO exit is > > + * initiated and kvm_run.mmio is filled. Isolating these structures > > + * from the primary KVM_EXIT_* union ensures that KVM won't clobber > > + * information for the original exit. > > + */ > > + union { > > /* KVM_EXIT_MEMORY_FAULT */ > > blahblahblah > > +#endif > > }; > > > > +#ifdef __KERNEL__ > > + /* > > + * Non-ABI, kernel-only field that KVM uses to detect bugs related to > > + * filling exit_reason and the exit unions, e.g. to guard against > > + * clobbering a previous exit. > > + */ > > + __u64 exit_canary; > > +#endif > > + > > We can't set exit_reason in the kvm_handle_guest_uaccess_fault() > helper if we're to handle case A (the setup vcpu exit -> fail guest > memory access -> return to userspace) correctly. But then userspace > needs some other way to check whether an efault is annotated, and > might as well check the canary, so something like > > > + __u32 speculative_exit_reason; No need for a full 32-bit value, or even a separate field, we can use kvm_run.flags. Ugh, but of course flags' usage is arch specific. *sigh* We can either defines a flags2 (blech), or grab the upper byte of flags for arch agnostic flags (slightly less blech). Regarding the canary, if we want to use it for WARN_ON_ONCE(), it can't be exposed to userspace. Either that or we need to guard the WARN in some way. > > + union { > > + /* KVM_SPEC_EXIT_MEMORY_FAULT */ Definitely just KVM_EXIT_MEMORY_FAULT, the vast, vast majority of exits to userspace will not be speculative in any way. > > + struct { > > + __u64 flags; > > + __u64 gpa; > > + __u64 len; > > + } memory_fault; > > + /* Fix the size of the union. */ > > + char speculative_padding[256]; > > + }; > > With the condition for an annotated efault being "if kvm_run returns > -EFAULT and speculative_exit_reason is..."