On Mon, 27 Feb 2023 08:15:26 -0500
Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:

> >> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >> RIP: 0010:default_idle+0xf/0x20
> >> Code: 89 07 49 c7 c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 76 ff ff ff cc cc cc cc f3 0f 1e fa eb 07 0f 00 2d e3 8a 34 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 65
> >> RSP: 0018:ffffc9000017fe00 EFLAGS: 00000202
> >> RAX: 0000000000dfbea1 RBX: dffffc0000000000 RCX: ffffffff89b1da9c
> >> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
> >> RBP: 0000000000000007 R08: 0000000000000001 R09: ffff888119fb6c23
> >> R10: ffffed10233f6d84 R11: dffffc0000000000 R12: 0000000000000003
> >> R13: ffff888100833900 R14: ffffffff8e112850 R15: 0000000000000000
> >> default_idle_call+0x67/0xa0
> >> do_idle+0x361/0x440
> >> cpu_startup_entry+0x18/0x20
> >> start_secondary+0x256/0x300
> >> secondary_startup_64_no_verify+0xce/0xdb
> >> </TASK>
> >> Modules linked in:
> >> CR2: 0000000000000000
> >> ---[ end trace 0000000000000000 ]---
> >> RIP: 0010:0x0
> >> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
>
> I have seen this exact signature when the processor tries to execute a function that has a NULL address. That causes IP to goto 0 and the exception. Sounds like something corrupted rcu_head (Just a guess).

[ Joel, you need to line wrap your emails ;-) ]

This looks like a call_rcu() was called on something that later got
freed or reused. That is, the bug is not with RCU but with something
using RCU.

OR it could be a bug with RCU if the synchronize_rcu() ended before the
grace periods have finished.

-- Steve
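
For illustration only, here is a minimal userspace sketch of the first
scenario described above: an object is handed to a call_rcu()-style
queue and then reused (zeroed) before its callback runs, so the stored
function pointer reads back as NULL. None of this is kernel code. The
sim_* names, struct layout and queue are assumptions made up for the
example, and the NULL check exists only so the sketch can report the
condition instead of crashing.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for a callback head (hypothetical, not the kernel's). */
struct sim_rcu_head {
    struct sim_rcu_head *next;
    void (*func)(struct sim_rcu_head *head);
};

/* Hypothetical object that embeds the callback head, as RCU users do. */
struct sim_obj {
    int payload;
    struct sim_rcu_head rcu;
};

static struct sim_rcu_head *sim_pending; /* callbacks awaiting a "grace period" */
static int sim_callbacks_run;            /* how many callbacks actually ran */

/* Analogue of call_rcu(): store the callback and queue the head. */
static void sim_call_rcu(struct sim_rcu_head *head,
                         void (*func)(struct sim_rcu_head *head))
{
    head->func = func;
    head->next = sim_pending;
    sim_pending = head;
}

/* Example callback: just counts that it was invoked. */
static void sim_obj_reclaim(struct sim_rcu_head *head)
{
    (void)head;
    sim_callbacks_run++;
}

/*
 * Analogue of invoking callbacks once the grace period has ended.
 * Returns the number of queued heads whose func pointer was NULL.
 * Code that calls head->func(head) without such a check would jump
 * to address 0 here, which matches the "RIP: 0010:0x0" signature.
 */
static int sim_run_callbacks(void)
{
    int null_funcs = 0;
    struct sim_rcu_head *head = sim_pending;

    sim_pending = NULL;
    while (head) {
        struct sim_rcu_head *next = head->next; /* read before the callback runs */

        if (head->func)
            head->func(head);
        else
            null_funcs++;
        head = next;
    }
    return null_funcs;
}

/* Buggy user: the object is reused while its head is still queued. */
static int sim_buggy_reuse(void)
{
    static struct sim_obj obj;

    sim_call_rcu(&obj.rcu, sim_obj_reclaim);
    memset(&obj, 0, sizeof(obj)); /* simulates free + zeroing reallocation */
    return sim_run_callbacks();
}

/* Correct user: the object is left alone until its callback has run. */
static int sim_correct_use(void)
{
    static struct sim_obj obj;

    sim_call_rcu(&obj.rcu, sim_obj_reclaim);
    return sim_run_callbacks();
}
```

In this sketch sim_buggy_reuse() reports one NULL callback, while
sim_correct_use() reports none and runs its callback once. The second
scenario (a grace period ending too early) is not modelled here.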