On Fri, Feb 21, 2025 at 11:01:09PM +0000, Michael Kelley wrote:
> From: Hamza Mahfooz <hamzamahfooz@xxxxxxxxxxxxxxxxxxx> Sent: Friday, February 21, 2025 1:31 PM
> >
> > Since, the panic handlers may require certain cpus to be online to panic
> > gracefully, we should call them before turning off SMP. Without this
> > re-ordering, on Hyper-V hv_panic_vmbus_unload() times out, because the
> > vmbus channel is bound to VMBUS_CONNECT_CPU and unless the crashing cpu
> > is the same as VMBUS_CONNECT_CPU, VMBUS_CONNECT_CPU will be offlined by
> > crash_smp_send_stop() before the vmbus channel can be deconstructed.
>
> Hamza -- what specifically is the problem with the way vmbus_wait_for_unload()
> works today? That code is aware of the problem that the unload response comes
> only on the VMBUS_CONNECT_CPU, and that cpu may not be able to handle
> the interrupt. So the code polls the message page of each CPU to try to get the
> unload response message. Is there a scenario where that approach isn't working?
>

It doesn't work on arm64 if the crashing cpu isn't VMBUS_CONNECT_CPU; it
always ends up at "VMBus UNLOAD did not complete" without fail. It seems
like arm64's crash_smp_send_stop() is more aggressive than x86's.

> Note also that Hyper-V itself can take a long time (10's of seconds) to respond
> to the unload request. See the comments in vmbus_wait_for_unload() about
> flushing the Azure host disk cache. I worked on this code and did the
> measurements, so I have some familiarity with the problems. :-)
>
> Michael
>
> >
> > Signed-off-by: Hamza Mahfooz <hamzamahfooz@xxxxxxxxxxxxxxxxxxx>
> > ---
> > v2: keep printk_legacy_allow_panic_sync() after
> >     panic_other_cpus_shutdown().
> > ---
> >  kernel/panic.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index fbc59b3b64d0..433cf651e213 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -372,16 +372,16 @@ void panic(const char *fmt, ...)
> >  	if (!_crash_kexec_post_notifiers)
> >  		__crash_kexec(NULL);
> >
> > -	panic_other_cpus_shutdown(_crash_kexec_post_notifiers);
> > -
> > -	printk_legacy_allow_panic_sync();
> > -
> >  	/*
> >  	 * Run any panic handlers, including those that might need to
> >  	 * add information to the kmsg dump output.
> >  	 */
> >  	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
> >
> > +	panic_other_cpus_shutdown(_crash_kexec_post_notifiers);
> > +
> > +	printk_legacy_allow_panic_sync();
> > +
> >  	panic_print_sys_info(false);
> >
> >  	kmsg_dump_desc(KMSG_DUMP_PANIC, buf);
> > --
> > 2.47.1
> >
>
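
For illustration only, the ordering argument in the commit message can be
modelled in user space. This is a toy sketch, not kernel code: every name
below (toy_system, toy_stop_other_cpus(), toy_panic_handler(), CONNECT_CPU)
is hypothetical, and it assumes the simplest possible case, where a handler
succeeds only if the connect cpu is still online when it runs. It
deliberately ignores the per-cpu message page polling that
vmbus_wait_for_unload() does, so it says nothing about why that polling
fails on arm64.

```c
#include <stdbool.h>

#define NR_TOY_CPUS 4
#define CONNECT_CPU 0	/* hypothetical stand-in for VMBUS_CONNECT_CPU */

/* Toy state, not a kernel structure: which cpus are still online. */
struct toy_system {
	bool online[NR_TOY_CPUS];
	bool unload_done;
};

static void toy_init(struct toy_system *s)
{
	for (int i = 0; i < NR_TOY_CPUS; i++)
		s->online[i] = true;
	s->unload_done = false;
}

/* Stand-in for stopping every cpu except the crashing one. */
static void toy_stop_other_cpus(struct toy_system *s, int crashing_cpu)
{
	for (int i = 0; i < NR_TOY_CPUS; i++)
		if (i != crashing_cpu)
			s->online[i] = false;
}

/* Assumed handler: completes only while CONNECT_CPU is online. */
static void toy_panic_handler(struct toy_system *s)
{
	s->unload_done = s->online[CONNECT_CPU];
}

/* Old ordering: stop the other cpus first, then run the handler. */
static bool toy_panic_stop_first(int crashing_cpu)
{
	struct toy_system s;

	toy_init(&s);
	toy_stop_other_cpus(&s, crashing_cpu);
	toy_panic_handler(&s);
	return s.unload_done;
}

/* Patched ordering: run the handler first, then stop the other cpus. */
static bool toy_panic_handlers_first(int crashing_cpu)
{
	struct toy_system s;

	toy_init(&s);
	toy_panic_handler(&s);
	toy_stop_other_cpus(&s, crashing_cpu);
	return s.unload_done;
}
```

In this model toy_panic_stop_first() succeeds only when the crashing cpu is
CONNECT_CPU, while toy_panic_handlers_first() succeeds for any crashing cpu,
which is the behaviour the re-ordering is meant to achieve.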