> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
> Sent: Wednesday, November 30, 2016 9:55 AM
> To: x86@xxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Thomas Gleixner
> <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; H. Peter Anvin
> <hpa@xxxxxxxxx>
> Subject: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when
> unknown_nmi_panic
>
> There is a feature in Hyper-V (Debug-VM --InjectNonMaskableInterrupt) which
> injects NMI to the guest. Prior to WS2016 the NMI is injected to all CPUs
> of the guest and WS2016 injects it to CPU0 only. When unknown_nmi_panic is
> enabled and we'd like to do kdump we need to perform some minimal cleanup
> so the kdump kernel will be able to initialize VMBus devices, this cleanup
> includes sending CHANNELMSG_UNLOAD to the host and waiting for
> CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends the response
> to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on VMBus module
> load and not to the CPU which is sending CHANNELMSG_UNLOAD. As we can't do
> any cross-CPU work reliably on crash we have vmbus_wait_for_unload()
> function which tries to read CHANNELMSG_UNLOAD_RESPONSE on all CPUs message
> pages and this sometimes works. It was discovered that in case the host
> wants to send more than one message to a secondary CPU (not the CPU running
> vmbus_wait_for_unload()) we're unable to get it as after reading the first
> message we're supposed to do EOMing by doing wrmsrl(HV_X64_MSR_EOM, 0) but
> this is per-CPU. I have a feeling that this was working some time ago when
> I implemented vmbus_wait_for_unload(), the host was re-trying to deliver a
> message even without wrmsrl() but apparently this doesn't work any more.
> Unfortunately there is not that much we can do when all CPUs get NMI as
> all but the first one are getting blocked with interrupts disabled. What we
> can do is limit processing unknown interrupts to the first CPU which gets
> it in case we're about to crash.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>

Thanks Vitaly.

Acked-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>

> ---
>  arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 8f44c5a..6e4181ff 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -31,6 +31,7 @@
>  #include <asm/apic.h>
>  #include <asm/timer.h>
>  #include <asm/reboot.h>
> +#include <asm/nmi.h>
>
>  struct ms_hyperv_info ms_hyperv;
>  EXPORT_SYMBOL_GPL(ms_hyperv);
> @@ -158,6 +159,24 @@ static unsigned char hv_get_nmi_reason(void)
>  	return 0;
>  }
>
> +/*
> + * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
> + * it difficult to process CHANNELMSG_UNLOAD in case of crash. Handle
> + * unknown NMI on the first CPU which gets it.
> + */
> +static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> +{
> +	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> +
> +	if (!unknown_nmi_panic)
> +		return NMI_DONE;
> +
> +	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> +		return NMI_HANDLED;
> +
> +	return NMI_DONE;
> +}
> +
>  static void __init ms_hyperv_init_platform(void)
>  {
>  	/*
> @@ -204,6 +223,9 @@ static void __init ms_hyperv_init_platform(void)
>  	 */
>  	if (efi_enabled(EFI_BOOT))
>  		x86_platform.get_nmi_reason = hv_get_nmi_reason;
> +
> +	register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST,
> +			     "hv_nmi_unknown");
>  }
>
>  const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
> --
> 2.9.3
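
For readers following the thread, the "first CPU wins" logic in
hv_nmi_unknown() can be tried outside the kernel. The sketch below is a
userspace illustration only, not the patch itself: it swaps the kernel's
atomic_cmpxchg() for C11 atomic_compare_exchange_strong(), passes the CPU
number in as a parameter instead of calling raw_smp_processor_id(), and
the names nmi_election, nmi_election_init(), sketch_nmi_unknown() and the
SKETCH_NMI_* values are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for NMI_DONE / NMI_HANDLED; values are illustrative. */
enum { SKETCH_NMI_DONE = 0, SKETCH_NMI_HANDLED = 1 };

/* owner == -1 means no CPU has claimed the unknown NMI yet. */
struct nmi_election {
	atomic_int owner;
};

static void nmi_election_init(struct nmi_election *e)
{
	atomic_init(&e->owner, -1);
}

/*
 * Mirrors the shape of the quoted handler: when panic-on-unknown-NMI is
 * off, do nothing special. Otherwise the first caller claims ownership
 * and returns DONE so normal processing (and the panic) continues on
 * that CPU; every later caller returns HANDLED and is swallowed.
 */
static int sketch_nmi_unknown(struct nmi_election *e, int panic_enabled, int cpu)
{
	int expected = -1;

	assert(cpu >= 0);

	if (!panic_enabled)
		return SKETCH_NMI_DONE;

	/* Fails (returns false) if some CPU already replaced -1. */
	if (!atomic_compare_exchange_strong(&e->owner, &expected, cpu))
		return SKETCH_NMI_HANDLED;

	return SKETCH_NMI_DONE;
}
```

Note that, as in the quoted handler, a second unknown NMI on the CPU that
already won the election is also reported as handled, since the stored
owner is no longer -1.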