INIT on AP going to be offline have a problem. Since psr.mc is cleared when bits in psr are set to SAL_PSR_BITS_TO_SET in ia64_jump_to_sal(), so there is a small window that the cpu can receive INIT even if the cpu enter there via INIT handler. In this window we do restore of registers for SAL, so INIT asserted here will not work properly. It is hard to remove this window by masking INIT (i.e. setting psr.mc) because we have to unmask it later in OS, because we have to use branch instruction (br.ret, not rfi) to return SAL, due to OS_BOOT_RENDEZ to SAL return convention. I suppose this window will not be a real problem on cpu offline if we can educate people not to push INIT button during hotplug operation. However only exception is a race in kdump and INIT. Now kdump returns APs to SAL before processing dump, but the kernel might receive INIT at that point in time. Such INIT might be asserted by kdump itself if an AP doesn't react IPI soon and kdump decided to use INIT to stop the AP. Such panic+INIT or INIT+INIT cases should be rare, but it will be happy if we can retrieve crashdump even in such cases. So it will be better to stop returning APs to SAL by kdump. I confirmed that the kdump sometime hangs by concurrent INITs (another INIT after an INIT), and it doesn't hang after applying this patch. Signed-off-by: Hidetoshi Seto <seto.hidetoshi at jp.fujitsu.com> Cc: Vivek Goyal <vgoyal at redhat.com> Cc: Haren Myneni <hbabu at us.ibm.com> Cc: kexec at lists.infradead.org --- arch/ia64/kernel/crash.c | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/arch/ia64/kernel/crash.c b/arch/ia64/kernel/crash.c index 48b69fd..eacedfc 100644 --- a/arch/ia64/kernel/crash.c +++ b/arch/ia64/kernel/crash.c @@ -142,10 +142,6 @@ kdump_cpu_freeze(struct unw_frame_info *info, void *arg) atomic_inc(&kdump_cpu_frozen); kdump_status[cpuid] = 1; mb(); -#ifdef CONFIG_HOTPLUG_CPU - if (cpuid != 0) - ia64_jump_to_sal(&sal_boot_rendez_state[cpuid]); -#endif for (;;) cpu_relax(); } -- 1.6.0