> From: Michael Kelley <mikelley@xxxxxxxxxxxxx> > Sent: Sunday, October 27, 2019 7:10 PM > ... > Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx> Should we add the 2 lines: Cc: stable@xxxxxxxxxxxxxxx Fixes: fd1fea6834d0 ("clocksource/drivers: Make Hyper-V clocksource ISA agnostic") fd1fea6834d0() removes the clockevents_unbind_device() call and this patch adds it back. > --- a/arch/x86/hyperv/hv_init.c > +++ b/arch/x86/hyperv/hv_init.c > @@ -323,8 +323,15 @@ void __init hyperv_init(void) > > x86_init.pci.arch_init = hv_pci_init; > > + if (hv_stimer_alloc()) > + goto remove_hypercall_page; > + The error handling is imperfect here: when I force hv_stimer_alloc() to return -ENOMEM, I get a panic in hv_apic_eoi_write(). It looks it is because we have cleared the pointer 'hv_vp_assist_page' to NULL, but hv_apic_eoi_write() is still in-use. In case hv_stimer_alloc() fails, can we set 'direct_mode_enabled' to false and go on with the legacy Hyper-V timer or LAPIC timer? If not, maybe we can use a BUG_ON() to explicitly panic? > return; > > +remove_hypercall_page: > + hypercall_msr.as_uint64 = 0; > + wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); > + hv_hypercall_pg = NULL; > remove_cpuhp_state: > cpuhp_remove_state(cpuhp); > free_vp_assist_page: > -void hv_stimer_cleanup(unsigned int cpu) > +static int hv_stimer_cleanup(unsigned int cpu) > { > struct clock_event_device *ce; > > /* Turn off clockevent device */ > if (ms_hyperv.features & HV_MSR_SYNTIMER_AVAILABLE) { > ce = per_cpu_ptr(hv_clock_event, cpu); > + > + /* > + * In the legacy case where Direct Mode is not enabled > + * (which can only be on x86/64), stimer cleanup happens > + * relatively early in the CPU offlining process. We > + * must unbind the stimer-based clockevent device so > + * that the LAPIC timer can take over until clockevents > + * are no longer needed in the offlining process. The > + * unbind should not be done when Direct Mode is enabled > + * because we may be on an architecture where there are > + * no other clockevents devices to fallback to. > + */ > + if (!direct_mode_enabled) > + clockevents_unbind_device(ce, cpu); > hv_ce_shutdown(ce); In the legacy stimer0 mode, IMO this hv_ce_shutdown() is unnecessary, because "clockevents_unbind_device(ce, cpu)" automatically calls ce->set_state_shutdown(), if ce is active: clockevents_unbind __clockevents_unbind clockevents_replace tick_install_replacement clockevents_exchange_device clockevents_switch_state(old, CLOCK_EVT_STATE_DETACHED) __clockevents_switch_state And, in both modes (legacy mode and direct mode), it looks incorrect to call hv_ce_shutdown() if the current processid id != 'cpu', because hv_ce_shutdown() -> hv_init_timer() can only access the current CPU's MSR. Maybe we should use an IPI to run hv_ce_shutdown() on the target CPU in direct mode? > -int hv_stimer_alloc(int sint) > +int hv_stimer_alloc(void) > ... > + ret = cpuhp_setup_state(CPUHP_AP_HYPERV_TIMER_STARTING, > + "clockevents/hyperv/stimer:starting", > + hv_stimer_init, hv_stimer_cleanup); > + if (ret < 0) > + goto free_stimer0_irq; > + stimer0_cpuhp = ret; > } > + return ret; stimer0_cpuhp is 0 when the call is successful, so IMO the logic in hv_stimer_free() is incorrect. Please see below. > void hv_stimer_free(void) > { > - if (direct_mode_enabled && (stimer0_irq != 0)) { > - hv_remove_stimer0_irq(stimer0_irq); > - stimer0_irq = 0; > + if (direct_mode_enabled) { > + if (stimer0_cpuhp) { > + cpuhp_remove_state(stimer0_cpuhp); > + stimer0_cpuhp = 0; > + } > + if (stimer0_irq) { > + hv_remove_stimer0_irq(stimer0_irq); > + stimer0_irq = 0; > + } > } IMO this should be if (direct_mode_enabled) { if (stimer0_cpuhp == 0) cpuhp_remove_state(CPUHP_AP_HYPERV_TIMER_STARTING); if (stimer0_irq) { hv_remove_stimer0_irq(stimer0_irq); stimer0_irq = 0; } } BTW, the default value of 'stimer0_cpuhp' is 0, which means success. Should we change the default value to a non-zero value, e.g. -1 ? > free_percpu(hv_clock_event); > hv_clock_event = NULL; > @@ -190,14 +274,11 @@ void hv_stimer_free(void) > void hv_stimer_global_cleanup(void) > { > int cpu; > - struct clock_event_device *ce; > > - if (ms_hyperv.features & HV_MSR_SYNTIMER_AVAILABLE) { > - for_each_present_cpu(cpu) { > - ce = per_cpu_ptr(hv_clock_event, cpu); > - clockevents_unbind_device(ce, cpu); > - } > + for_each_present_cpu(cpu) { > + hv_stimer_cleanup(cpu); hv_stimer_cleanup() -> hv_ce_shutdown() -> hv_init_timer() can not access a remote CPU's MSR. > @@ -2310,7 +2305,6 @@ static void hv_crash_handler(struct pt_regs *regs) > */ > vmbus_connection.conn_state = DISCONNECTED; > cpu = smp_processor_id(); > - hv_stimer_cleanup(cpu); > hv_synic_cleanup(cpu); > hyperv_cleanup(); Why should we remove the line hv_stimer_cleanup() in the crash handler? In the crash handler, IMO we'd better disable the timer before we proceed. Thanks, -- Dexuan