On Tue, Sep 15, 2020 at 01:23:50PM +0200, Vitaly Kuznetsov wrote: > Wei Liu <wei.liu@xxxxxxxxxx> writes: > > > On Tue, Sep 15, 2020 at 01:02:08PM +0200, Vitaly Kuznetsov wrote: > >> Wei Liu <wei.liu@xxxxxxxxxx> writes: > >> > >> > On Tue, Sep 15, 2020 at 12:32:29PM +0200, Vitaly Kuznetsov wrote: > >> >> Wei Liu <wei.liu@xxxxxxxxxx> writes: > >> >> > >> >> > When Linux is running as the root partition, the hypercall page will > >> >> > have already been setup by Hyper-V. Copy the content over to the > >> >> > allocated page. > >> >> > >> >> And we can't setup a new hypercall page by writing something different > >> >> to HV_X64_MSR_HYPERCALL, right? > >> >> > >> > > >> > My understanding is that we can't, but Sunil can maybe correct me. > >> > > >> >> > > >> >> > The suspend, resume and cleanup paths remain untouched because they are > >> >> > not supported in this setup yet. > >> >> > > >> >> > Signed-off-by: Lillian Grassin-Drake <ligrassi@xxxxxxxxxxxxx> > >> >> > Signed-off-by: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx> > >> >> > Signed-off-by: Nuno Das Neves <nudasnev@xxxxxxxxxxxxx> > >> >> > Co-Developed-by: Lillian Grassin-Drake <ligrassi@xxxxxxxxxxxxx> > >> >> > Co-Developed-by: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx> > >> >> > Co-Developed-by: Nuno Das Neves <nudasnev@xxxxxxxxxxxxx> > >> >> > Signed-off-by: Wei Liu <wei.liu@xxxxxxxxxx> > >> >> > --- > >> >> > arch/x86/hyperv/hv_init.c | 26 ++++++++++++++++++++++++-- > >> >> > 1 file changed, 24 insertions(+), 2 deletions(-) > >> >> > > >> >> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c > >> >> > index 0eec1ed32023..26233aebc86c 100644 > >> >> > --- a/arch/x86/hyperv/hv_init.c > >> >> > +++ b/arch/x86/hyperv/hv_init.c > >> >> > @@ -25,6 +25,7 @@ > >> >> > #include <linux/cpuhotplug.h> > >> >> > #include <linux/syscore_ops.h> > >> >> > #include <clocksource/hyperv_timer.h> > >> >> > +#include <linux/highmem.h> > >> >> > > >> >> > /* Is Linux running as the root partition? */ > >> >> > bool hv_root_partition; > >> >> > @@ -448,8 +449,29 @@ void __init hyperv_init(void) > >> >> > > >> >> > rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); > >> >> > hypercall_msr.enable = 1; > >> >> > - hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg); > >> >> > - wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); > >> >> > + > >> >> > + if (hv_root_partition) { > >> >> > + struct page *pg; > >> >> > + void *src, *dst; > >> >> > + > >> >> > + /* > >> >> > + * Order is important here. We must enable the hypercall page > >> >> > + * so it is populated with code, then copy the code to an > >> >> > + * executable page. > >> >> > + */ > >> >> > + wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); > >> >> > + > >> >> > + pg = vmalloc_to_page(hv_hypercall_pg); > >> >> > + dst = kmap(pg); > >> >> > + src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE, > >> >> > + MEMREMAP_WB); > >> >> > >> >> memremap() can fail... > >> > > >> > And we don't care here, if it fails, we would rather it panic or oops. > >> > > >> > I was relying on the fact that copying from / to a NULL pointer will > >> > cause the kernel to crash. But of course it wouldn't hurt to explicitly > >> > panic here. > >> > > >> >> > >> >> > + memcpy(dst, src, PAGE_SIZE); > >> >> > + memunmap(src); > >> >> > + kunmap(pg); > >> >> > + } else { > >> >> > + hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg); > >> >> > + wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); > >> >> > + } > >> >> > >> >> Why can't we do wrmsrl() for both cases here? > >> >> > >> > > >> > Because the hypercall page has already been set up when Linux is the > >> > root. > >> > >> But you already do wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64) > >> in 'if (hv_root_partition)' case above, that's why I asked. > >> > > > > You mean extracting wrmsrl to this point? The ordering matters. See the > > comment in the root branch -- we have to enable the page before copying > > the content. > > > > What can be done is: > > > > if (!root) { > > /* some stuff */ > > } > > > > wrmsrl(...) > > > > if (root) { > > /* some stuff */ > > } > > > > This is not looking any better than the existing code. > > > > Oh, I missed the comment indeed. So Hypervisor already picked a page for > us, however, it didn't enable it and it's not populated? Seems to be the case. > How can we be > sure that we didn't use it for something else already? It is a page deposited to the root partition and it is not going to be used elsewhere, nor it is visible from the root. This is my understanding. I will let Sunil correct me if I'm wrong. > Maybe we can > still give a different known-to-be-empty page? > That's the thing I said I didn't bother trying earlier. Something to check when I have some spare cycles. Wei. > -- > Vitaly >