PowerPC External interrupts are not triggered in secondary kernel

Sunil Kumar <sukumar@xxxxxxxxxx> · Tue, 3 Apr 2018 13:57:44 +0530

Hi All,

I came across a very interesting issue while working on kexec with the
3.10 kernel on NXP's t1042ds reference board.
Whenever I crash the primary kernel using the "echo c >
/proc/sysrq-trigger" command, I get "mmc0: Timeout waiting for
hardware interrupt." during bootup of the secondary (crash) kernel. The
mmc timeout occurs because the sdhci driver initialization code sends
a set of commands to the esdhc host controller and expects an
interrupt back from the SD card (i.e. esdhc) host controller. When the
interrupt is not raised within 10ms, sdhci reports that the
interrupt was not received from the SD card controller by throwing the
mmc timeout error.

To debug it further I added a print in the do_IRQ routine to check
whether external interrupts are arriving or not, but as soon as
interrupts are enabled in the secondary kernel (inside start_kernel()),
I get a spurious interrupt (i.e. irq=0 ~ NO_IRQ) and a doorbell
exception.

I am not sure why there is a doorbell exception, as SMP support is not
enabled in my secondary kernel.
If I am not wrong, doorbell exceptions are mainly used for IPI
communication across different CPU cores.

Whenever this spurious interrupt occurs during boot, the secondary
kernel fails to boot due to the mmc timeout error.
I debugged it further by printing all the irqs which are pending in the
primary kernel before jumping to the secondary kernel and found that
one of the irqs is not processed in the primary kernel. It looks like
whenever there are pending interrupts from the primary kernel, the
secondary kernel treats them as spurious interrupts.

Ideally there should not be any pending interrupts from the primary
kernel, as we send EOI to all the interrupts and mask them in the
machine_kexec_mask_interrupts() routine. As per my understanding, even
if a spurious interrupt comes it should not cause any harm and the
secondary kernel should still boot.
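
To make the suspected mechanism concrete, here is a minimal userspace
sketch in C. It is only a toy model, not MPIC or kernel code: struct
toy_pic and all toy_* functions are hypothetical names invented for
illustration, and it assumes that masking a source blocks delivery
without clearing its pending state. Under that assumption, an interrupt
raised while CPU interrupts are off survives the handover unless
interrupts are briefly enabled first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: NOT the MPIC programming model and not kernel code. */
struct toy_pic {
    uint32_t pending;     /* one bit per source: raised, not yet handled */
    uint32_t masked;      /* one bit per source: delivery blocked */
    bool     cpu_irqs_on; /* stands in for the CPU's interrupt-enable bit */
};

/* A device raises interrupt source 'irq' (0..31). */
static void toy_raise(struct toy_pic *pic, unsigned int irq)
{
    pic->pending |= 1u << irq;
}

/* Handle every pending, unmasked source; returns how many were handled. */
static unsigned int toy_deliver(struct toy_pic *pic)
{
    unsigned int handled = 0;

    if (!pic->cpu_irqs_on)
        return 0;

    for (unsigned int irq = 0; irq < 32; irq++) {
        uint32_t bit = 1u << irq;

        if ((pic->pending & bit) && !(pic->masked & bit)) {
            pic->pending &= ~bit; /* handler ran, source is no longer pending */
            handled++;
        }
    }
    return handled;
}

/* Assumption: masking blocks delivery but leaves pending bits untouched. */
static void toy_mask_all(struct toy_pic *pic)
{
    pic->masked = UINT32_MAX;
}

/*
 * Models the jump to the next kernel. Returns the pending bits that the
 * next kernel would inherit. 'drain_first' mimics briefly enabling
 * interrupts before the final disable.
 */
static uint32_t toy_handover(struct toy_pic *pic, bool drain_first)
{
    if (drain_first) {
        pic->cpu_irqs_on = true;  /* analogous to local_irq_enable() */
        toy_deliver(pic);
    }
    pic->cpu_irqs_on = false;     /* analogous to local_irq_disable() */
    toy_mask_all(pic);
    return pic->pending;
}
```

In this model, raising a source and calling toy_handover() with
drain_first set to false returns that source's bit still set, while
calling it with drain_first set to true returns 0.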

The problem is fixed when all the pending interrupts are processed in
the primary kernel just before jumping to the secondary kernel. Below
are the changes which fix the issue:
==============================================================================

diff --git a/arch/powerpc/kernel/machine_kexec_32.c b/arch/powerpc/kernel/machine_kexec_32.c
index affe5dc..b2256d7 100644
--- a/arch/powerpc/kernel/machine_kexec_32.c
+++ b/arch/powerpc/kernel/machine_kexec_32.c
@@ -36,6 +36,13 @@ void default_machine_kexec(struct kimage *image)
        unsigned long reboot_code_buffer, reboot_code_buffer_phys;
        relocate_new_kernel_t rnk;

+       /*
+        * Allow pending interrupts to execute before jumping to the
+        * secondary kernel if kexec is called from atomic context.
+        */
+       if (irqs_disabled())
+               local_irq_enable();
+
        /* Interrupts aren't acceptable while we reboot */
        local_irq_disable();
==============================================================================

A similar kind of solution has been merged to linux-stable for the CPU
hotplug feature, where the kernel processes all pending interrupts on
a CPU before taking it offline.

Below is the reference:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=687b8f24f14db842c8c3f8cb8b24c9a29b691db8

Is it a good idea to process all the pending interrupts before jumping
to the secondary kernel?
Please provide your suggestions.

Regards
Sunil Kumar

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec