On Fri, Jan 31, 2020 at 12:36:22PM -0800, Sean Christopherson wrote:
> On Fri, Jan 31, 2020 at 03:28:24PM -0500, Peter Xu wrote:
> > On Fri, Jan 31, 2020 at 11:33:01AM -0800, Sean Christopherson wrote:
> > > For the same reason we don't take mmap_sem, it gains us nothing, i.e. KVM
> > > still has to use copy_{to,from}_user().
> > >
> > > In the proposed __x86_set_memory_region() refactor, vmx_set_tss_addr()
> > > would be provided the hva of the memory region.  Since slots_lock and SRCU
> > > only protect gfn->hva, why would KVM take slots_lock since it already has
> > > the hva?
> >
> > OK so you're suggesting to unlock the lock earlier to not cover
> > init_rmode_tss() rather than dropping the whole lock...  Yes it looks
> > good to me.  I think that's the major confusion I got.
>
> Ya.  And I missed where the -EEXIST was coming from.  I think we're on the
> same page.

Good to know.

Btw, I would still prefer to keep the unlock after the __copy_to_user()s,
because "the HVA is valid without the lock" is only true for these private
memslots.  After all, this is a super slow path, so I wouldn't mind holding
the lock a bit longer.  Otherwise, if you really prefer the unlock() to
happen earlier, I can add a comment above it:

	/*
	 * We can unlock before using the HVA only because this KVM-private
	 * memory slot will never change until the end of the VM's lifetime.
	 */

> > > Returning -EEXIST is an ABI change, e.g. userspace can currently call
> > > KVM_SET_TSS_ADDR any number of times, it just needs to ensure proper
> > > serialization between calls.
> > >
> > > If you want to change the ABI, then submit a patch to do exactly that.
> > > But don't bury an ABI change under the pretense that it's a bug fix.
> >
> > Could you explain what you mean by "ABI change"?
> >
> > I was talking about the original code, not after applying the
> > patchset.  To be explicit, I mean [a] below:
> >
> > int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size,
> > 			    unsigned long *uaddr)
> > {
> > 	int i, r;
> > 	unsigned long hva;
> > 	struct kvm_memslots *slots = kvm_memslots(kvm);
> > 	struct kvm_memory_slot *slot, old;
> >
> > 	/* Called with kvm->slots_lock held.  */
> > 	if (WARN_ON(id >= KVM_MEM_SLOTS_NUM))
> > 		return -EINVAL;
> >
> > 	slot = id_to_memslot(slots, id);
> > 	if (size) {
> > 		if (slot->npages)
> > 			return -EEXIST;  <------------------------ [a]
> > 	}
> > 	...
> > }
>
> Doh, I completely forgot that the second __x86_set_memory_region() would
> fail.  Sorry :-(
>
> > > > Yes, but as I mentioned, I don't think it's an issue to be considered
> > > > by KVM, otherwise we should have the same issue all over the places
> > > > when we fetch the cached userspace_addr from any user slots.
> > >
> > > Huh?  Of course it's an issue that needs to be considered by KVM, e.g.
> > > kvm_{read,write}_guest_cached() aren't using __copy_{to,}from_user() for
> > > giggles.
> >
> > The cache is for the GPA->HVA translation (struct gfn_to_hva_cache);
> > we still use __copy_{to,}from_user() upon the HVAs, no?
>
> I'm still lost on this one.  I'm pretty sure I'm incorrectly interpreting:
>
>   I don't think it's an issue to be considered by KVM, otherwise we should
>   have the same issue all over the places when we fetch the cached
>   userspace_addr from any user slots.
>
> What is the issue to which you are referring?

The issue I was referring to is that the HVA can be unmapped by userspace
without KVM noticing.
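To illustrate my understanding, here's a rough sketch of the cached read
path (simplified from my reading of virt/kvm/kvm_main.c; the struct layout
and the helper's internals are approximations from memory, not the verbatim
upstream code):

	/* The cache memoizes only the GPA->HVA translation. */
	struct gfn_to_hva_cache {
		u64 generation;        /* memslot generation when cached */
		gpa_t gpa;             /* guest physical address */
		unsigned long hva;     /* cached host virtual address */
		unsigned long len;
		struct kvm_memory_slot *memslot;
	};

	/* Sketch of kvm_read_guest_cached(); simplified, not verbatim. */
	static int read_guest_cached_sketch(struct kvm *kvm,
					    struct gfn_to_hva_cache *ghc,
					    void *data, unsigned long len)
	{
		struct kvm_memslots *slots = kvm_memslots(kvm);

		/* Redo the GPA->HVA lookup if the memslots changed. */
		if (slots->generation != ghc->generation)
			kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa, ghc->len);

		if (kvm_is_error_hva(ghc->hva))
			return -EFAULT;

		/*
		 * The HVA is still accessed as an ordinary user pointer,
		 * so if userspace unmapped it behind our back the copy
		 * simply faults and we return -EFAULT rather than crash.
		 */
		if (__copy_from_user(data, (void __user *)ghc->hva, len))
			return -EFAULT;

		return 0;
	}

In other words, the point of the cache is to skip the memslot lookup, not
to skip the user-memory accessors.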
I think we're actually on the same page here too; my follow-up is purely a
question about the statement "kvm_{read,write}_guest_cached() aren't using
__copy_{to,}from_user()" above, because that goes against my understanding.

-- 
Peter Xu