Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:

> Vitaly,
>
> On Thu, 1 Dec 2016, Vitaly Kuznetsov wrote:
>
>> There is a feature in Hyper-V (Debug-VM --InjectNonMaskableInterrupt) which
>> injects NMI to the guest. Prior to WS2016 the NMI is injected to all CPUs
>> of the guest and WS2016 injects it to CPU0 only. When unknown_nmi_panic is
>> enabled and we'd like to do kdump we need to perform some minimal cleanup
>> so the kdump kernel will be able to initialize VMBus devices, this cleanup
>> includes sending CHANNELMSG_UNLOAD to the host waiting for
>> CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends the response
>> to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on VMBus module
>> load and not to the CPU which is sending CHANNELMSG_UNLOAD. As we can't do
>> any cross-CPU work reliably on crash we have vmbus_wait_for_unload()
>> function which tries to read CHANNELMSG_UNLOAD_RESPONSE on all CPUs message
>> pages and this sometimes works. It was discovered that in case the host
>> wants to send more than one message to a secondary CPU (not the CPU running
>> vmbus_wait_for_unload()) we're unable to get it as after reading the first
>> message we're supposed to do EOMing by doing wrmsrl(HV_X64_MSR_EOM, 0) but
>> this is per-CPU. I have a feeling that this was working some time ago when
>> I implemented vmbus_wait_for_unload(), the host was re-trying to deliver a
>> message even without wrmsrl() but apparently this doesn't work any more.
>> Unfortunately there is not that much we can do when all CPUs get NMI as
>> all but the first one are getting blocked with interrupts disabled. What we
>> can do is limit processing unknown interrupts to the first CPU which gets
>> it in case we're about to crash.
>
> This is completely unreadable and I really tried hard to make sense of it.
>
> Please structure it in a way that people who are not familiar with the
> inner workings of hyperv can at least understand the problem you are trying
> to solve and the concept of the solution w/o needing to figure out what all
> the acronyms and constants actually mean.
>
> Also visual structuring in paragraphs helps readability a lot.
>

Got it, I'll do my best to make it readable.

> AFAICT this tries to deal with different problems of different hypervisor
> versions, but even that is unclear as you talk about version WS2016,
> versions prior to WS2016 and then about WS2012R2 in particular.
>
> Another issue I have with this is:
>
> ".... I have a feeling that this was working ...."
>
> Changes like this are not about feelings. We want to have changes based on
> facts.
>

The thing is that Hyper-V is proprietary software which gets updates, and
I don't remember which particular updates were installed when I was
implementing vmbus_wait_for_unload(). As far as I remember, it was always
working on WS2012R2. Now I observe a different behavior ...

-- 
  Vitaly
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel