On Thu, Feb 16, 2023 at 10:53 AM Anish Moorthy <amoorthy@xxxxxxxxxx> wrote:
>
> On Wed, Feb 15, 2023 at 12:59 AM Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
> >
> > How is userspace expected to differentiate the gup_fast() failed exit
> > from the guest-private memory exit? I don't think flags are a good idea
> > for this, as it comes with the illusion that both events can happen on a
> > single exit. In reality, these are mutually exclusive.
> >
> > A fault type/code would be better here, with the option to add flags at
> > a later date that could be used to further describe the exit (if
> > needed).
>
> Agreed. Something like this, then?
>
> + struct {
> +         __u32 fault_code;
> +         __u64 reserved;
> +         __u64 gpa;
> +         __u64 size;
> + } memory_fault;
>
> The "reserved" field is meant to be the placeholder for a future "flags" field.
> Let me know if there's a better/more conventional way to achieve this.

On Thu, Feb 16, 2023 at 10:53 AM Anish Moorthy <amoorthy@xxxxxxxxxx> wrote:
>
> 1. As Oliver touches on earlier, we'll probably want to use this same field for
>    different classes of memory fault in the future (such as the ones which Chao
>    is introducing in [1]): so it does make sense to add "code" and "flags"
>    fields which can be used to communicate more information to the user (and
>    which can just be set to MEM_FAULT_NOWAIT/0 in this series).

Let me walk back my responses here: I took a closer look at Chao's series, and
it doesn't seem that I should be trying to share KVM_EXIT_MEMORY_FAULT with it
in the first place. As far as I can understand (not that much, to be clear :)
we're signaling unrelated things, so it makes more sense to use different exits
(i.e., rename mine to KVM_EXIT_MEMORY_FAULT_NOWAIT). That would prevent any
potential confusion about mutual exclusivity.

That also removes the need for a "fault_code" field in "memory_fault", which I
could rename to something more general like "guest_memory_range". As for the
"reserved" field, we could keep it around if we care about reusing
"guest_memory_range" between exits, or discard it if we don't.

On Thu, Feb 16, 2023 at 1:38 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Feb 16, 2023, Anish Moorthy wrote:
> > On Wed, Feb 15, 2023 at 12:59 AM Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
> > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > index 109b18e2789c4..9352e7f8480fb 100644
> > > > > --- a/include/linux/kvm_host.h
> > > > > +++ b/include/linux/kvm_host.h
> > > > > @@ -801,6 +801,9 @@ struct kvm {
> > > > >         bool vm_bugged;
> > > > >         bool vm_dead;
> > > > >
> > > > > +       rwlock_t mem_fault_nowait_lock;
> > > > > +       bool mem_fault_nowait;
> > > >
> > > > A full-fat rwlock to protect a single bool? What benefits do you
> > > > expect from a rwlock? Why is it preferable to an atomic access, or a
> > > > simple bitop?
> > >
> > > There's no need to have any kind of dedicated atomicity. The only readers are
> > > in vCPU context, just disallow KVM_CAP_MEM_FAULT_NOWAIT after vCPUs are created.
> >
> > I think we do need atomicity here.
>
> Atomicity, yes. Mutual exclusivity, no. AFAICT, nothing will break if userspace
> has multiple in-flight calls to toggle the flag. And if we do want to guarantee
> there's only one writer, then kvm->lock or kvm->slots_lock will suffice.
>
> > Since we want to support this without having to pause vCPUs, there's an
> > atomicity requirement.
>
> Ensuring that vCPUs "see" the new value and not corrupting memory are two very
> different things. Making the flag an atomic, wrapping it with a rwlock, etc. do
> nothing to ensure vCPUs observe the new value. And for non-crazy usage of bools,
> they're not even necessary to avoid memory corruption...
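[Illustrative aside, not code from this series: a minimal sketch of the point
above. Only the mem_fault_nowait field comes from the quoted diff; the helper
names and the use of kvm->lock are assumptions. Annotating the plain bool with
WRITE_ONCE()/READ_ONCE() is enough to avoid data races on the flag without any
rwlock, but it does nothing to control when a running vCPU observes the new
value.]

static void kvm_vm_set_mem_fault_nowait(struct kvm *kvm, bool enable)
{
        /* Serializes writers only; not needed for the flag's atomicity. */
        mutex_lock(&kvm->lock);
        WRITE_ONCE(kvm->mem_fault_nowait, enable);
        mutex_unlock(&kvm->lock);

        /*
         * Nothing here guarantees that vCPUs already running (or mid-fault)
         * observe the new value before the enabling ioctl returns to
         * userspace.
         */
}

/* vCPU-context reader: a plain annotated load suffices. */
static bool kvm_mem_fault_nowait(struct kvm *kvm)
{
        return READ_ONCE(kvm->mem_fault_nowait);
}
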
Oh, that's news to me. I've learned to treat any unprotected concurrent access
to memory as undefined behavior, so I guess there's always more to learn.

> I think what you really want to achieve is that vCPUs observe the NOWAIT flag
> before KVM returns to userspace. There are a variety of ways to make that happen,
> but since this is all about accessing guest memory, the simplest is likely to
> "protect" the flag with kvm->srcu, i.e. require SRCU be held by readers and then
> do a synchronize_srcu() to ensure all vCPUs have picked up the new value.
>
> Speaking of SRCU (which protects memslots), why not make this a memslot flag? If
> the goal is to be able to turn the behavior on/off dynamically, wouldn't it be
> beneficial to turn off the NOWAIT behavior when a memslot is fully transferred?

Thanks for the suggestions: I never actually considered making this a memslot
flag, so I'll go look into that.
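[Illustrative aside, not code from this series: a rough sketch of the kvm->srcu
suggestion above. The helper name and the choice of kvm->slots_lock are
assumptions; kvm->srcu, synchronize_srcu(), and the mem_fault_nowait field come
from the discussion. vCPUs access guest memory under srcu_read_lock(&kvm->srcu),
so the writer publishes the flag and then waits out the in-flight read-side
sections before returning to userspace.]

static int kvm_vm_toggle_mem_fault_nowait(struct kvm *kvm, bool enable)
{
        /* kvm->slots_lock (or kvm->lock) only to serialize writers. */
        mutex_lock(&kvm->slots_lock);
        WRITE_ONCE(kvm->mem_fault_nowait, enable);

        /*
         * Per the suggestion above: wait for SRCU read-side critical
         * sections that may still be using the old value, so that vCPUs
         * touching guest memory under kvm->srcu have picked up the new
         * value by the time this returns.
         */
        synchronize_srcu(&kvm->srcu);
        mutex_unlock(&kvm->slots_lock);

        return 0;
}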