On Fri, Jan 31, 2020 at 12:36:22PM -0800, Sean Christopherson wrote:
> On Fri, Jan 31, 2020 at 03:28:24PM -0500, Peter Xu wrote:
> > On Fri, Jan 31, 2020 at 11:33:01AM -0800, Sean Christopherson wrote:
> > > For the same reason we don't take mmap_sem, it gains us nothing, i.e. KVM
> > > still has to use copy_{to,from}_user().
> > >
> > > In the proposed __x86_set_memory_region() refactor, vmx_set_tss_addr()
> > > would be provided the hva of the memory region.  Since slots_lock and SRCU
> > > only protect gfn->hva, why would KVM take slots_lock since it already has
> > > the hva?
> >
> > OK so you're suggesting to unlock the lock earlier to not cover
> > init_rmode_tss() rather than dropping the whole lock...  Yes it looks
> > good to me.  I think that's the major confusion I got.
>
> Ya.  And I missed where the -EEXIST was coming from.  I think we're on the
> same page.

Good to know.

Btw, I would still prefer to keep the unlock after the __copy_to_user()s,
because "the HVA is valid without the lock" is only true for these private
memslots.  After all, this is a super slow path, so I wouldn't mind holding
the lock a bit longer.  Otherwise, if you really prefer the unlock() to
happen earlier, I can add a comment above it:

	/*
	 * We can unlock before using the HVA only because this KVM-private
	 * memory slot will never change until the end of the VM's lifetime.
	 */

> > > Returning -EEXIST is an ABI change, e.g. userspace can currently call
> > > KVM_SET_TSS_ADDR any number of times, it just needs to ensure proper
> > > serialization between calls.
> > >
> > > If you want to change the ABI, then submit a patch to do exactly that.
> > > But don't bury an ABI change under the pretense that it's a bug fix.
> >
> > Could you explain what you mean by "ABI change"?
> >
> > I was talking about the original code, not after applying the
> > patchset.  To be explicit, I mean [a] below:
> >
> > int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size,
> > 			    unsigned long *uaddr)
> > {
> > 	int i, r;
> > 	unsigned long hva;
> > 	struct kvm_memslots *slots = kvm_memslots(kvm);
> > 	struct kvm_memory_slot *slot, old;
> >
> > 	/* Called with kvm->slots_lock held.  */
> > 	if (WARN_ON(id >= KVM_MEM_SLOTS_NUM))
> > 		return -EINVAL;
> >
> > 	slot = id_to_memslot(slots, id);
> > 	if (size) {
> > 		if (slot->npages)
> > 			return -EEXIST;  <------------------------ [a]
> > 	}
> > 	...
> > }
>
> Doh, I completely forgot that the second __x86_set_memory_region() would
> fail.  Sorry :-(
>
> > > > Yes, but as I mentioned, I don't think it's an issue to be considered
> > > > by KVM, otherwise we should have the same issue all over the places
> > > > when we fetch the cached userspace_addr from any user slots.
> > >
> > > Huh?  Of course it's an issue that needs to be considered by KVM, e.g.
> > > kvm_{read,write}_guest_cached() aren't using __copy_{to,}from_user() for
> > > giggles.
> >
> > The cache is for the GPA->HVA translation (struct gfn_to_hva_cache);
> > we still use __copy_{to,}from_user() upon the HVAs, no?
>
> I'm still lost on this one.  I'm pretty sure I'm incorrectly interpreting:
>
>   I don't think it's an issue to be considered by KVM, otherwise we should
>   have the same issue all over the places when we fetch the cached
>   userspace_addr from any user slots.
>
> What is the issue to which you are referring?

The issue I was referring to is that the HVA can be unmapped by userspace
without KVM noticing.
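To illustrate my understanding, here's a rough sketch of the cached read
path (simplified from my reading of virt/kvm/kvm_main.c; the struct layout
and the helper's internals are approximations from memory, not the verbatim
upstream code):

	/* The cache memoizes only the GPA->HVA translation. */
	struct gfn_to_hva_cache {
		u64 generation;        /* memslot generation when cached */
		gpa_t gpa;             /* guest physical address */
		unsigned long hva;     /* cached host virtual address */
		unsigned long len;
		struct kvm_memory_slot *memslot;
	};

	/* Sketch of kvm_read_guest_cached(); simplified, not verbatim. */
	static int read_guest_cached_sketch(struct kvm *kvm,
					    struct gfn_to_hva_cache *ghc,
					    void *data, unsigned long len)
	{
		struct kvm_memslots *slots = kvm_memslots(kvm);

		/* Redo the GPA->HVA lookup if the memslots changed. */
		if (slots->generation != ghc->generation)
			kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa, ghc->len);

		if (kvm_is_error_hva(ghc->hva))
			return -EFAULT;

		/*
		 * The HVA is still accessed as an ordinary user pointer,
		 * so if userspace unmapped it behind our back the copy
		 * simply faults and we return -EFAULT rather than crash.
		 */
		if (__copy_from_user(data, (void __user *)ghc->hva, len))
			return -EFAULT;

		return 0;
	}

In other words, the point of the cache is to skip the memslot lookup, not
to skip the user-memory accessors.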
I think we're actually on the same page here too; my follow-up is purely a
question about the statement "kvm_{read,write}_guest_cached() aren't using
__copy_{to,}from_user()" above, because that goes against my understanding.

-- 
Peter Xu