Hi , Ingo On Fri, 2016-07-08 at 09:38 +0200, Ingo Molnar wrote: > * Eric W. Biederman <ebiederm at xmission.com> wrote: > > > Sigh. Can we please just do the work to rip out the apic shutdown code from the > > kexec on panic code path? > > > > I forgetting details but the only reason we have do any apic shutdown is bugs in > > older kernels that could not initialize a system properly if we did not shut > > down the apics. > > > > I certainly don't see an issue with goofy cases like notsc not working on a > > crash capture kernel if we are not initializing the hardware properly. > > > > The strategy really needs to be to only do the absolutely essential hardware > > shutdown in the crashing kernel, every adintional line of code we execute in the > > crashing kernel increases our chances of hitting a bug. > > Fully agreed. > > > Under that policy things like requring we don't pass boot options that inhibit > > the dump catpure kernel from initializing the hardware from a random state are > > reasonable requirements. AKA I don't see any justification in this as to why we > > would even want to support notsc on the dump capture kernel. Especially when > > things clearly work when that option is not specified. > > So at least on the surface it appears 'surprising' that the 'notsc' option (which, > supposedly, disables TSC handling) interferes with being able to fully boot. Even > if 'notsc' is specified we are still using the local APIC, right? In most case, It's no problem that using local APIC while notsc is specified. But not for kdump. We can get evidence, Especially from "Spurious LAPIC timer interrupt on cpu 0". ###serial log, [ 0.000000] NR_IRQS:524544 nr_irqs:256 16 [ 0.000000] Spurious LAPIC timer interrupt on cpu 0 [ 0.000000] Console: colour dummy device 80x25 [ 0.000000] console [tty0] enabled [ 0.000000] console [ttyS0] enabled [ 0.000000] tsc: Fast TSC calibration using PIT [ 0.000000] tsc: Detected 2099.947 MHz processor [ 0.000000] Calibrating delay loop... Due to the local apic and local apic timer hasn't been setup and enabled fully, The event_handler of clock event is NULL. ###codes, static void local_apic_timer_interrupt(void) { int cpu = smp_processor_id(); struct clock_event_device *evt = &per_cpu(lapic_events, cpu); /* * Normally we should not be here till LAPIC has been initialized but * in some cases like kdump, its possible that there is a pending LAPIC * timer interrupt from previous kernel's context and is delivered in * new kernel the moment interrupts are enabled. * * Interrupts are enabled early and LAPIC is setup much later, hence * its possible that when we get here evt->event_handler is NULL. * Check for event_handler being NULL and discard the interrupt as * spurious. */ if (!evt->event_handler) { pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu); /* Switch it off */ lapic_timer_shutdown(evt); return; } ............. } IMHO, If we specify notsc, the dump-capture kernel waits for jiffies being updated early and LAPIC and timer are setup much later, which causes no timer interrupts is passed to BSP. as following, start_kernel --> 1)-> calibrate_delay() -> calibrate_delay_converge() # hang and wait for jiffies changed 2)-> rest_init() -> kernel_init() -> .... -> apic_bsp_setup() -> setup_local_APIC() -> setup_percpu_clockev(). the setup_percpu_clockev points setup_boot_APIC_clock() which used to setup the boot APIC and timer. > So it might be a good idea to find the root cause of this bootup fragility even if > 'notsc' is specified. And I fully agree that it should be fixed in the bootup path > of the dump kernel, not the crash kernel reboot path. Because the lapic and timer are not ready when dump-capture waits them to update the jiffies value. so I suggest to put APIC in legacy mode in local_apic_timer_interrupt() temporarily, which in the bootup path of dump kernel. diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index dcb52850a28f..af3be93997ed 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -879,6 +879,7 @@ static void local_apic_timer_interrupt(void) pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu); /* Switch it off */ lapic_timer_shutdown(evt); + disable_IO_APIC(); return; } And the new solution can fix the problem. What?s your opinion about it? Thanks, wei > > Thanks, > > Ingo > >