From: Stanislav Kinsburskii <skinsburskii@xxxxxxxxxxxxxxxxxxx> Sent: Thursday, November 3, 2022 10:01 AM > > O Wed, Nov 02, 2022 at 11:36:45PM +0000, Michael Kelley (LINUX) wrote: > > From: Stanislav Kinsburskii <skinsburskii@xxxxxxxxxxxxxxxxxxx> Sent: Wednesday, > November 2, 2022 3:08 PM > > > > > > Microsoft Hypervisor root partition has to map the TSC page specified > > > by the hypervisor, instead of providing the page to the hypervisor like > > > it's done in the guest partitions. > > > > > > However, it's too early to map the page when the clock is initialized, so, the > > > actual mapping is happening later. > > > > > > Signed-off-by: Stanislav Kinsburskiy <stanislav.kinsburskiy@xxxxxxxxx> > > > CC: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx> > > > CC: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> > > > CC: Wei Liu <wei.liu@xxxxxxxxxx> > > > CC: Dexuan Cui <decui@xxxxxxxxxxxxx> > > > CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > > > CC: Ingo Molnar <mingo@xxxxxxxxxx> > > > CC: Borislav Petkov <bp@xxxxxxxxx> > > > CC: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> > > > CC: x86@xxxxxxxxxx > > > CC: "H. Peter Anvin" <hpa@xxxxxxxxx> > > > CC: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> > > > CC: linux-hyperv@xxxxxxxxxxxxxxx > > > CC: linux-kernel@xxxxxxxxxxxxxxx > > > --- > > > arch/x86/hyperv/hv_init.c | 2 ++ > > > drivers/clocksource/hyperv_timer.c | 37 +++++++++++++++++++++++++++------ > --- > > > include/clocksource/hyperv_timer.h | 1 + > > > 3 files changed, 31 insertions(+), 9 deletions(-) > > > > > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c > > > index f49bc3ec76e6..89954490af93 100644 > > > --- a/arch/x86/hyperv/hv_init.c > > > +++ b/arch/x86/hyperv/hv_init.c > > > @@ -464,6 +464,8 @@ void __init hyperv_init(void) > > > BUG_ON(!src); > > > memcpy_to_page(pg, 0, src, HV_HYP_PAGE_SIZE); > > > memunmap(src); > > > + > > > + hv_remap_tsc_clocksource(); > > > } else { > > > hypercall_msr.guest_physical_address = > > > vmalloc_to_pfn(hv_hypercall_pg); > > > wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); > > > diff --git a/drivers/clocksource/hyperv_timer.c > b/drivers/clocksource/hyperv_timer.c > > > index 635c14c1e3bf..ec76303b2a76 100644 > > > --- a/drivers/clocksource/hyperv_timer.c > > > +++ b/drivers/clocksource/hyperv_timer.c > > > @@ -508,9 +508,6 @@ static bool __init hv_init_tsc_clocksource(void) > > > if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) > > > return false; > > > > > > - if (hv_root_partition) > > > - return false; > > > - > > > /* > > > * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly > > > * handles frequency and offset changes due to live migration, > > > @@ -528,16 +525,22 @@ static bool __init hv_init_tsc_clocksource(void) > > > } > > > > > > hv_read_reference_counter = read_hv_clock_tsc; > > > - tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page)); > > > > > > /* > > > - * The Hyper-V TLFS specifies to preserve the value of reserved > > > - * bits in registers. So read the existing value, preserve the > > > - * low order 12 bits, and add in the guest physical address > > > - * (which already has at least the low 12 bits set to zero since > > > - * it is page aligned). Also set the "enable" bit, which is bit 0. > > > + * TSC page mapping works differently in root and guest partitions. > > > + * - In guest partition the guest PFN has to be passed to the > > > + * hypervisor. > > > + * - In root partition it's other way around: it has to map the PFN > > > + * provided by the hypervisor. > > > + * But it can't be mapped right here as it's too early and MMU isn't > > > + * ready yet. So, we only set the enable bit here and will remap the > > > + * page later in hv_remap_tsc_clocksource(). > > > */ > > > tsc_msr.as_uint64 = hv_get_register(HV_REGISTER_REFERENCE_TSC); > > > + if (hv_root_partition) > > > + tsc_pfn = tsc_msr.pfn; > > > + else > > > + tsc_pfn = __phys_to_pfn(virt_to_phys(tsc_page)); > > > > Same problem here with setting tsc_pfn to a guest PFN, which may be > > Different from what Hyper-V is expecting as a PFN two lines below. I know > > the above line was just carried over from Anirudh's previous patch set, > > but I was thinking you would fix this issue. :-) > > > > Fair call. I guess Anirudh has addressed it himself, so I'm going to > rebase on his fix. Sounds good. > > > > tsc_msr.enable = 1; > > > tsc_msr.pfn = tsc_pfn; > > > hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr.as_uint64); > > > @@ -572,3 +575,19 @@ void __init hv_init_clocksource(void) > > > hv_sched_clock_offset = hv_read_reference_counter(); > > > hv_setup_sched_clock(read_hv_sched_clock_msr); > > > } > > > + > > > +void __init hv_remap_tsc_clocksource(void) > > > +{ > > > + if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE)) > > > + return; > > > + > > > + if (!hv_root_partition) { > > > + WARN(1, "%s: attempt to remap TSC page in guest partition\n", > > > + __func__); > > > + return; > > > + } > > > + > > > + tsc_page = memremap(__pfn_to_phys(tsc_pfn), sizeof(tsc_pg), MEMREMAP_WB); > > > > Note that use of __pfn_to_phys() is at risk of being wrong depending on whether > > you decide to keep a guest PFN or a Hyper-V PFN in tsc_pfn. > > > > It's Hyperv-V PFN that is stored in the variable (to match the MSR value for the root > partition). I guess this approach will workd regardless of the guest page size. I agree that it's best for tsc_pfn to contain a Hyper-V PFN. But if that's the case, the above use of __pfn_to_phys() won't work since it is expecting a guest PFN as input. Michael