On Thu, 2020-06-11 at 17:27 +0200, Paolo Bonzini wrote: > On 11/06/20 16:44, Maxim Levitsky wrote: > > On Mon, 2020-06-01 at 04:21 -0400, Paolo Bonzini wrote: > > > The userspace_addr alignment and range checks are not performed for private > > > memory slots that are prepared by KVM itself. This is unnecessary and makes > > > it questionable to use __*_user functions to access memory later on. We also > > > rely on the userspace address being aligned since we have an entire family > > > of functions to map gfn to pfn. > > > > > > Fortunately skipping the check is completely unnecessary. Only x86 uses > > > private memslots and their userspace_addr is obtained from vm_mmap, > > > therefore it must be below PAGE_OFFSET. In fact, any attempt to pass > > > an address above PAGE_OFFSET would have failed because such an address > > > would return true for kvm_is_error_hva. > > > > > > Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > > > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > > > I bisected this patch to break a VM on my AMD system (3970X) > > > > The reason it happens, is because I have avic enabled (which uses > > a private KVM memslot), but it is permanently disabled for that VM, > > since I enabled nesting for that VM (+svm) and that triggers the code > > in __x86_set_memory_region to set userspace_addr of the disabled > > memslot to non canonical address (0xdeadull << 48) which is later rejected in __kvm_set_memory_region > > after that patch, and that makes it silently not disable the memslot, which hangs the guest. > > > > The call is from avic_update_access_page, which is called from svm_pre_update_apicv_exec_ctrl > > which discards the return value. > > > > > > I think that the fix for this would be to either make access_ok always return > > true for size==0, or __kvm_set_memory_region should treat size==0 specially > > and skip that check for it. > > Or just set hva to 0. Deletion goes through kvm_delete_memslot so that > dummy hva is not used anywhere. If we really want to poison the hva of > deleted memslots we should not do it specially in > __x86_set_memory_region. I'll send a patch. After checking exactly what access_ok does, I mostly agree with this. There is still an implicit assumption that address 0 is a valid userspace address. It is fair to assume that on x86 though. Best regards, Maxim Levitsky > > Paolo >