On 14/10/2019 11:10, Thomas Gleixner wrote:
> On Mon, 14 Oct 2019, Guilherme G. Piccoli wrote:
>> Modules linked in: <...>
>> CPU: 40 PID: 78274 Comm: qemu-system-x86 Tainted: P W OE
>
> Tainted: P - Proprietary module loaded ...
>
> Try again without that module

Thanks Thomas, for the prompt response. This is some ScaleIO stuff; I
guess it's part of the customer setup, and I agree it would be better not
to have this kind of module loaded. Anyway, the analysis of the oops
shows a quite odd situation, and we'd like to have at least a strong clue
before saying the ScaleIO stuff is the culprit.

> Tainted: W - Warning issued before
>
> Are you sure that that warning is harmless and unrelated?
>

Sorry I didn't mention that before, the warning is:

[5946866.593060] WARNING: CPU: 42 PID: 173056 at /build/linux-lts-xenial-80t3lB/linux-lts-xenial-4.4.0/arch/x86/events/intel/core.c:1868 intel_pmu_handle_irq+0x2d4/0x470()
[5946866.593061] perfevents: irq loop stuck!

It happened ~700 days before the oops (yeah, the uptime is quite large,
about 900 days when the oops happened, heh).

>> 4.4.0-45-generic #66~14.04.1-Ubuntu
>
> Does the same problem happen with a not so dead kernel? CR2 handling got
> quite some updates/fixes since then.

Unfortunately we don't have a way to test that for now, but your comment
is quite interesting - we can take a look at the CR2 fixes since v4.4.
But what do you think about getting a #PF when the instruction pointed to
in the oops Code section (and by the RIP address) is not a memory-related
insn?

Thanks,


Guilherme

>
> Thanks,
>
> tglx
>