On 21/09/2017 17:00, Alex Williamson wrote:
>>
>> We hit a hard lockup bug in our tests while starting a VM with a passthrough NIC.
>> Unfortunately, we didn't get a usable vmcore to dig into. It is also quite difficult
>> to reproduce; we have only hit this problem once.
>> Has anyone hit such a problem? Any ideas?
>> (For the complete log, please see the attachment).
> This is not an upstream or a RHEL kernel, so I don't have the sources
> to do much analysis. Based on the kernel version number, I'm guessing
> this is some derivative of a RHEL-7.2 kernel. Can it be reproduced on
> upstream (this is an upstream list)? The ghes functions in the dump
> might indicate a hardware error was triggered and the
> firmware logs might provide more indication of the problem. Thanks,

I don't recall anything with passthrough, but it looks like this was
caused by perf running at the same time as KVM: there is a
perf_event_overflow in the second backtrace, and the first backtrace's
code hexdump contains "cd 02", i.e. "int $2", which raises the NMI vector.

There have been some perf bugs related to nested virtualization in
RHEL7.2 and RHEL7.3 (start L1, do "perf top" in another terminal, boom
when you start L2), and they seem to be fixed in RHEL7.4. However, we
never tried bisecting for the fix.

Thanks,

Paolo