Hi All,

I came across a very interesting issue while working on kexec with the 3.10 kernel on NXP's t1042ds reference board. Whenever I crash the primary kernel using the "echo c > /proc/sysrq-trigger" command, I get "mmc0: Timeout waiting for hardware interrupt." during bootup of the secondary (crash) kernel.

The mmc timeout occurs because the sdhci driver initialization code sends a set of commands to the esdhc host controller and expects an interrupt back from the SD card (i.e. esdhc) host controller. When the interrupt is not raised within 10 ms, sdhci reports that no interrupt was received from the controller by throwing the mmc timeout error.

To debug it further, I added a print in the do_IRQ() routine to check whether external interrupts are arriving. As soon as interrupts are enabled in the secondary kernel (inside start_kernel()), I get a spurious interrupt (i.e. irq=0 ~ NO_IRQ) and a doorbell exception. I am not sure why there is a doorbell exception, as SMP support is not enabled in my secondary kernel. If I am not wrong, doorbell exceptions are mainly used for IPI communication across CPU cores. Whenever this spurious interrupt occurs during boot, the secondary kernel fails to boot because of the mmc timeout error.

I debugged it further by printing all the IRQs that are pending in the primary kernel before jumping to the secondary kernel, and found that one IRQ is not processed in the primary kernel. It looks like whenever there are pending interrupts from the primary kernel, the secondary kernel treats them as spurious interrupts. Ideally there should not be any pending interrupts from the primary kernel, as we send EOI to all the interrupts and mask them in the machine_kexec_mask_interrupts() routine. As per my understanding, even if a spurious interrupt comes, it should not cause any harm and the secondary kernel should still boot.

The problem is fixed when all pending interrupts are processed in the primary kernel just before jumping to the secondary kernel. Below are the changes that fix the issue:

==============================================================================
diff --git a/arch/powerpc/kernel/machine_kexec_32.c b/arch/powerpc/kernel/machine_kexec_32.c
index affe5dc..b2256d7 100644
--- a/arch/powerpc/kernel/machine_kexec_32.c
+++ b/arch/powerpc/kernel/machine_kexec_32.c
@@ -36,6 +36,13 @@ void default_machine_kexec(struct kimage *image)
 	unsigned long reboot_code_buffer, reboot_code_buffer_phys;
 	relocate_new_kernel_t rnk;
 
+	/*
+	   Allow pending interrupts to execute before jumping to
+	   secondary kernel if kexec called from atomic context.
+	*/
+	if (irqs_disabled())
+		local_irq_enable();
+
 	/* Interrupts aren't acceptable while we reboot */
 	local_irq_disable();
 
==============================================================================

A similar solution has been merged into linux-stable for the CPU hotplug feature, where the kernel processes all pending interrupts before taking a CPU offline. Below is the reference:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=687b8f24f14db842c8c3f8cb8b24c9a29b691db8

Is it a good idea to process all pending interrupts before jumping to the secondary kernel? Please provide your suggestions.

Regards,
Sunil Kumar