On Sat, May 30, 2020 at 10:57:24AM -0700, Linus Torvalds wrote: > So no. I disagree. There is absolutely nothing "obviously ok" about > any of that kvm code. Quite the reverse. > > I'd argue that it's very much obviously *NOT* ok, even while it might > just happen to work. Actually, it's somewhat less brittle than you think (on non-mips, at least) and not due to those long-ago access_ok(). > That double underscore needs to go away. It's either actively buggy > right now and I see no proof it isn't, or it's a bug just waiting to > happen in the future. FWIW, the kvm side of things (vhost is yet another pile of fun) is [x86] kvm_hv_set_msr_pw(): arch/x86/kvm/hyperv.c:1027: if (__copy_to_user((void __user *)addr, instructions, 4)) HV_X64_MSR_HYPERCALL arch/x86/kvm/hyperv.c:1132: if (__clear_user((void __user *)addr, sizeof(u32))) HV_X64_MSR_VP_ASSIST_PAGE in both cases addr comes from gfn = data >> HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT; addr = kvm_vcpu_gfn_to_hva(vcpu, gfn); if (kvm_is_error_hva(addr)) return 1; [x86] FNAME(walk_addr_generic), very hot: arch/x86/kvm/mmu/paging_tmpl.h:403: if (unlikely(__get_user(pte, ptep_user))) index = PT_INDEX(addr, walker->level); ... offset = index * sizeof(pt_element_t); ... host_addr = kvm_vcpu_gfn_to_hva_prot(vcpu, real_gfn, &walker->pte_writable[walker->level - 1]); if (unlikely(kvm_is_error_hva(host_addr))) goto error; ptep_user = (pt_element_t __user *)((void *)host_addr + offset); __kvm_read_guest_page(): virt/kvm/kvm_main.c:2252: r = __copy_from_user(data, (void __user *)addr + offset, len); addr = gfn_to_hva_memslot_prot(slot, gfn, NULL); if (kvm_is_error_hva(addr)) return -EFAULT; __kvm_read_guest_atomic(): virt/kvm/kvm_main.c:2326: r = __copy_from_user_inatomic(data, (void __user *)addr + offset, len); addr = gfn_to_hva_memslot_prot(slot, gfn, NULL); if (kvm_is_error_hva(addr)) return -EFAULT; __kvm_write_guest_page(): virt/kvm/kvm_main.c:2353: r = __copy_to_user((void __user *)addr + offset, data, len); addr = gfn_to_hva_memslot(memslot, gfn); if (kvm_is_error_hva(addr)) return -EFAULT; kvm_write_guest_offset_cached(): virt/kvm/kvm_main.c:2490: r = __copy_to_user((void __user *)ghc->hva + offset, data, len); if (kvm_is_error_hva(ghc->hva)) return -EFAULT; kvm_read_guest_cached(): virt/kvm/kvm_main.c:2525: r = __copy_from_user(data, (void __user *)ghc->hva, len); if (kvm_is_error_hva(ghc->hva)) return -EFAULT; default kvm_is_error_hva() is addr >= PAGE_OFFSET; however, on mips and s390 it's IS_ERR_VALUE(). Sure, we can use non-__ variants, but is access_ok() the right primitive here? We want userland memory, set_fs() be damned.