On Fri 2022-06-17 12:52:05, Valentin Schneider wrote: > Hi Tao, > > On 17/06/22 18:42, Tao Zhou wrote: > > Hi Valentin, > > > > On Thu, Jun 16, 2022 at 01:37:09PM +0100, Valentin Schneider wrote: > >> @@ -964,24 +966,31 @@ late_initcall(kexec_core_sysctl_init); > >> */ > >> void __noclone __crash_kexec(struct pt_regs *regs) > >> { > >> - /* Take the kexec_mutex here to prevent sys_kexec_load > >> - * running on one cpu from replacing the crash kernel > >> - * we are using after a panic on a different cpu. > >> + /* > >> + * This should be taking kexec_mutex before doing anything with the > >> + * kexec_crash_image, but this code can be run in NMI context which > >> + * means we can't even trylock. > >> * > >> - * If the crash kernel was not located in a fixed area > >> - * of memory the xchg(&kexec_crash_image) would be > >> - * sufficient. But since I reuse the memory... > >> + * Pairs with smp_mb() in do_kexec_load() and sys_kexec_file_load() > >> */ > >> - if (mutex_trylock(&kexec_mutex)) { > >> - if (kexec_crash_image) { > >> - struct pt_regs fixed_regs; > >> - > >> - crash_setup_regs(&fixed_regs, regs); > >> - crash_save_vmcoreinfo(); > >> - machine_crash_shutdown(&fixed_regs); > >> - machine_kexec(kexec_crash_image); > >> - } > >> - mutex_unlock(&kexec_mutex); > >> + WRITE_ONCE(panic_wants_kexec, true); > >> + smp_mb(); > >> + /* > >> + * If we're panic'ing while someone else is messing with the crash > >> + * kernel, this isn't going to end well. > >> + */ > >> + if (READ_ONCE(kexec_loading)) { > >> + WRITE_ONCE(panic_wants_kexec, false); > >> + return; > >> + } > > > > So this is from NMI. The mutex guarantee that kexec_file_load() or > > do_kexec_load() just one of them beat on cpu. NMI can happen on more > > than one cpu. That means that here be cumulativity here IMHO. > > > > If you look at __crash_kexec() in isolation yes, but if you look at panic() > and nmi_panic() only a single NMI can get in there. I think that is also > true for invocations via crash_kexec(). It is true that panic() could be called only once, see this code in panic(): * Only one CPU is allowed to execute the panic code from here. For this_cpu = raw_smp_processor_id(); old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu); if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu) panic_smp_self_stop(); One the other hand, the aproach with two variables is strage and brings exactly these questions. If a trylock is enough that the mutex can be replaced by two simple atomic operations. The mutex would be needed only when a user really would need to wait for another one. atomic_t crash_kexec_lock; /* trylock part */ if (atomic_cmpxchg_acquire(&crash_kexec_lock, 0, 1) != 0) return -EBUSY; /* do anything guarded by crash_kexec_lock */ /* release lock */ atomic_set_release(&crash_kexec_lock, 0); The _acquire, _release variants will do the barriers correctly. Best Regards, Petr