On Tue, Jun 16, 2020 at 12:23:50PM +0200, Christoph Hellwig wrote:
> On Tue, Jun 16, 2020 at 12:18:07PM +0200, Peter Zijlstra wrote:
> > > It does. But it also means every other user of PAGE_KERNEL_EXEC
> > > should trigger this, of which there are a few (kexec, tboot, hibernate,
> > > early xen pv mapping, early SEV identity mapping)
> >
> > There are only 3 users in the entire tree afaict:
> >
> >   arch/arm64/kernel/probes/kprobes.c: page = vmalloc_exec(PAGE_SIZE);
> >   arch/x86/hyperv/hv_init.c: hv_hypercall_pg = vmalloc_exec(PAGE_SIZE);
> >   kernel/module.c: return vmalloc_exec(size);
> >
> > And that last one is a weak function that any arch that has STRICT_RWX
> > ought to override.
> >
> > > We really shouldn't create mappings like this by default. Either we
> > > need to flip PAGE_KERNEL_EXEC itself based on the needs of the above
> > > users, or add another define to overload vmalloc_exec as there is no
> > > other user of that for x86.
> >
> > We really should get rid of the two !module users of this though; both
> > x86 and arm64 have STRICT_RWX and sufficient primitives to DTRT.
> >
> > What is HV even trying to do with that page? AFAICT it never actually
> > writes to it, it seems to give the physical address to an MSR (which I
> > suspect then writes crud into the page for us from host context).
> >
> > Suggesting the page really only needs to be RX.
> >
> > On top of that, vmalloc_exec() gets us a page from the entire vmalloc
> > range, which can be outside of the 2G executable range, which seems to
> > suggest vmalloc_exec() is wrong too and all this works by accident.
> >
> > How about something like this:
> >
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index a54c6a401581..82a3a4a9481f 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -375,12 +375,15 @@ void __init hyperv_init(void)
> >  	guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0);
> >  	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
> >
> > -	hv_hypercall_pg = vmalloc_exec(PAGE_SIZE);
> > +	hv_hypercall_pg = module_alloc(PAGE_SIZE);
> >  	if (hv_hypercall_pg == NULL) {
> >  		wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
> >  		goto remove_cpuhp_state;
> >  	}
> >
> > +	set_memory_ro((unsigned long)hv_hypercall_pg, 1);
> > +	set_memory_x((unsigned long)hv_hypercall_pg, 1);
>
> The changing of the permissions sucks. I thought about adding
> a module_alloc_prot with an explicit pgprot_t argument. On x86
> alone at least ftrace would also benefit from that.

The above is also missing a set_vm_flush_reset_perms.