> On Thu, Jun 09, 2022 at 11:33:18AM +0800, Mark-PK Tsai wrote: > > Always clear OS lock before enable debug event. > > > > The OS lock is clear in cpuhp ops in recent kernel, > > but when the debug exception happened before it > > kernel might crash because debug event enable didn't > > take effect when OS lock is hold. > > > > Below is the use case that having this problem: > > > > Register kprobe in console_unlock and kernel will > > panic at secondary_start_kernel on secondary core. > > > > CPU: 1 PID: 0 Comm: swapper/1 Tainted: P > > ... > > pstate: 004001c5 (nzcv dAIF +PAN -UAO) > > pc : do_undefinstr+0x5c/0x60 > > lr : do_undefinstr+0x2c/0x60 > > sp : ffffffc01338bc50 > > pmr_save: 000000f0 > > x29: ffffffc01338bc50 x28: ffffff8115e95a00 T > > x27: ffffffc01258e000 x26: ffffff8115e95a00 > > x25: 00000000ffffffff x24: 0000000000000000 > > x23: 00000000604001c5 x22: ffffffc014015008 > > x21: 000000002232f000 x20: 00000000000000f0 j > > x19: ffffffc01338bc70 x18: ffffffc0132ed040 > > x17: ffffffc01258eb48 x16: 0000000000000403 L& > > x15: 0000000000016480 x14: ffffffc01258e000 i/ > > x13: 0000000000000006 x12: 0000000000006985 > > x11: 00000000d5300000 x10: 0000000000000000 > > x9 : 9f6c79217a8a0400 x8 : 00000000000000c5 > > x7 : 0000000000000000 x6 : ffffffc01338bc08 2T > > x5 : ffffffc01338bc08 x4 : 0000000000000002 > > x3 : 0000000000000000 x2 : 0000000000000004 > > x1 : 0000000000000000 x0 : 0000000000000001 *q > > Call trace: > > do_undefinstr+0x5c/0x60 > > el1_undef+0x10/0xb4 > > 0xffffffc014015008 > > vprintk_func+0x210/0x290 > > printk+0x64/0x90 > > cpuinfo_detect_icache_policy+0x80/0xe0 > > __cpuinfo_store_cpu+0x150/0x160 > > secondary_start_kernel+0x154/0x440 > > > > The root cause is that OS_LSR_EL1.OSLK is reset > > to 1 on a cold reset[1] and the firmware didn't > > unlock it by default. > > So the core didn't go to el1_dbg as expected after > > kernel_enable_single_step and eret. > > Hmm, I thought we didn't use hardware single-step for kprobes after > 7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing > instructions out-of-line"). What is triggering this exception? > > Will You're right. Actually this issue happend in 5.4 LTS, and the commit you mentioned can avoid the kernel panic by not using hardware single-step. I think 5.4 LTS should apply this commit. 7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing instructions out-of-line") Cc: stable@xxxxxxxxxxxxxxx And I'm not sure if there is other use case may have problem if the kernel don't clear OS lock in enable_debug_monitors everytime. So should we do this to prevent someone face the similar issue?