On 2021/9/15 12:16 AM, Dave Hansen wrote:
> On 9/14/21 12:23 AM, 王贇 wrote:
>>
>> On 2021/9/14 11:02 AM, 王贇 wrote:
>> [snip]
>>> [ 44.133509][ C0] traps: PANIC: double fault, error_code: 0x0
>>> [ 44.133519][ C0] double fault: 0000 [#1] SMP PTI
>>> [ 44.133526][ C0] CPU: 0 PID: 743 Comm: a.out Not tainted 5.14.0-next-20210913 #469
>>> [ 44.133532][ C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [ 44.133536][ C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
>>> [ 44.133549][ C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
>>> [ 44.133556][ C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
>> Another information is that I have printed '__this_cpu_ist_bottom_va(NMI)'
>> on cpu0, which is just the RSP fffffe000000b000, does this imply
>> we got an overflowed NMI stack?
>
> Yep. I have the feeling some of your sanitizer and other debugging is
> eating the stack:

Could be. In another thread we have confirmed that the exception stack
overflowed.

>
>> [ 44.134987][ C0] ? __sanitizer_cov_trace_pc+0x7/0x60
>> [ 44.135005][ C0] ? kcov_common_handle+0x30/0x30
>
> Just turning off tracing for the page fault handler is papering over the
> problem. It'll just come back later with a slightly different form.
>

Cool~ please let me know when you have the proper approach.

Regards,
Michael Wang