On Tue, Dec 31, 2024 at 03:16:25PM +0800, Z qiang wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed "BUG:unable_to_handle_page_fault_for_address" on:
> >
> > commit: 9216c28c6a927fd20f116feed55bba025f18f401 ("srcu: Make SRCU readers use ->srcu_ctrs for counter selection")
> > https://github.com/paulmckrcu/linux dev.2024.12.24a
> >
> > in testcase: rcutorture
> > version:
> > with following parameters:
> >
> >	runtime: 300s
> >	test: default
> >	torture_type: srcu
> >
> >
> > config: i386-randconfig-005-20241230
> > compiler: gcc-12
> > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> >
> >
> > +------------------------------------------------+------------+------------+
> > |                                                | 2add2e88ea | 9216c28c6a |
> > +------------------------------------------------+------------+------------+
> > | BUG:unable_to_handle_page_fault_for_address    | 0          | 6          |
> > | Oops                                           | 0          | 6          |
> > | EIP:__srcu_read_lock                           | 0          | 6          |
> > | Kernel_panic-not_syncing:Fatal_exception       | 0          | 6          |
> > +------------------------------------------------+------------+------------+
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> > | Closes: https://lore.kernel.org/oe-lkp/202412311203.ca7bddba-lkp@xxxxxxxxx
> >
>
> Please try the following modifications:
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index e85db7d5b364..7c7304dee645 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1999,6 +1999,7 @@ static int srcu_module_coming(struct module *mod)
>  	for (i = 0; i < mod->num_srcu_structs; i++) {
>  		ssp = *(sspp++);
>  		ssp->sda = alloc_percpu(struct srcu_data);
> +		ssp->srcu_ctrp = &ssp->sda->srcu_ctrs[0];

This does look quite promising, so thank you for digging into this!!!
Looking forward to seeing if it fixes the problem.  ;-)

							Thanx, Paul

>  		if (WARN_ON_ONCE(!ssp->sda))
>  			return -ENOMEM;
>  	}
>
>
> Thanks
> Zqiang
>
>
> >
> > [ 168.973150][ T628] BUG: unable to handle page fault for address: 2367a000
> > [ 168.973700][ T628] #PF: supervisor write access in kernel mode
> > [ 168.974809][ T628] #PF: error_code(0x0002) - not-present page
> > [ 168.975761][ T628] *pde = 00000000
> > [ 168.976236][ T628] Oops: Oops: 0002 [#1] PREEMPT SMP
> > [ 168.977052][ T628] CPU: 0 UID: 0 PID: 628 Comm: rcu_torture_wri Tainted: G T 6.13.0-rc2-00067-g9216c28c6a92 #1
> > [ 168.978867][ T628] Tainted: [T]=RANDSTRUCT
> > [ 168.979429][ T628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > [ 168.980862][ T628] EIP: __srcu_read_lock (kernel/rcu/srcutree.c:749)
> > [ 168.981213][ T628] Code: 85 ff 74 0c e8 45 59 00 00 83 3b 00 74 02 0f 0b 5b 5e 5f 5d c3 8b 00 f0 83 44 24 fc 00 83 c0 07 83 e0 fc c3 55 89 e5 8b 50 04 <64> ff 02 f0 83 44 24 fc 00 2b 50 08 5d 89 d0 c1 f8 03 c3 55 89 e5
> > All code
> > ========
> >    0:   85 ff                   test   %edi,%edi
> >    2:   74 0c                   je     0x10
> >    4:   e8 45 59 00 00          call   0x594e
> >    9:   83 3b 00                cmpl   $0x0,(%rbx)
> >    c:   74 02                   je     0x10
> >    e:   0f 0b                   ud2
> >   10:   5b                      pop    %rbx
> >   11:   5e                      pop    %rsi
> >   12:   5f                      pop    %rdi
> >   13:   5d                      pop    %rbp
> >   14:   c3                      ret
> >   15:   8b 00                   mov    (%rax),%eax
> >   17:   f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
> >   1d:   83 c0 07                add    $0x7,%eax
> >   20:   83 e0 fc                and    $0xfffffffc,%eax
> >   23:   c3                      ret
> >   24:   55                      push   %rbp
> >   25:   89 e5                   mov    %esp,%ebp
> >   27:   8b 50 04                mov    0x4(%rax),%edx
> >   2a:*  64 ff 02                incl   %fs:(%rdx)          <-- trapping instruction
> >   2d:   f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
> >   33:   2b 50 08                sub    0x8(%rax),%edx
> >   36:   5d                      pop    %rbp
> >   37:   89 d0                   mov    %edx,%eax
> >   39:   c1 f8 03                sar    $0x3,%eax
> >   3c:   c3                      ret
> >   3d:   55                      push   %rbp
> >   3e:   89 e5                   mov    %esp,%ebp
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   64 ff 02                incl   %fs:(%rdx)
> >    3:   f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
> >    9:   2b 50 08                sub    0x8(%rax),%edx
> >    c:   5d                      pop    %rbp
> >    d:   89 d0                   mov    %edx,%eax
> >    f:   c1 f8 03                sar    $0x3,%eax
> >   12:   c3                      ret
> >   13:   55                      push   %rbp
> >   14:   89 e5                   mov    %esp,%ebp
> > [ 168.982540][ T628] EAX: ef0c8420 EBX: ef0c8420 ECX: e5e1e840 EDX: 00000000
> > [ 168.983022][ T628] ESI: ef0c919c EDI: 00000000 EBP: c75e9ee8 ESP: c75e9ee8
> > [ 168.983503][ T628] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010246
> > [ 168.984024][ T628] CR0: 80050033 CR2: 2367a000 CR3: 075f5000 CR4: 00040690
> > [ 168.984518][ T628] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > [ 168.985008][ T628] DR6: fffe0ff0 DR7: 00000400
> > [ 168.985329][ T628] Call Trace:
> > [ 168.985571][ T628] ? show_regs (arch/x86/kernel/dumpstack.c:479 arch/x86/kernel/dumpstack.c:465)
> > [ 168.985877][ T628] ? __die_body (arch/x86/kernel/dumpstack.c:421)
> > [ 168.986185][ T628] ? __die (arch/x86/kernel/dumpstack.c:435)
> > [ 168.986466][ T628] ? page_fault_oops (arch/x86/mm/fault.c:715)
> > [ 168.986811][ T628] ? kernelmode_fixup_or_oops+0x50/0x58
> > [ 168.987273][ T628] ? __bad_area_nosemaphore+0x37/0x1d5
> > [ 168.987726][ T628] ? validate_chain (kernel/locking/lockdep.c:3819 kernel/locking/lockdep.c:3872)
> > [ 168.988058][ T628] ? bad_area_nosemaphore (arch/x86/mm/fault.c:835)
> > [ 168.988406][ T628] ? do_user_addr_fault (arch/x86/mm/fault.c:1280 (discriminator 1))
> > [ 168.988763][ T628] ? exc_page_fault (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:87 arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
> > [ 168.989110][ T628] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1494)
> > [ 168.989472][ T628] ? handle_exception (arch/x86/entry/entry_32.S:1048)
> > [ 168.989800][ T628] ? siphash_4u64 (lib/siphash.c:203)
> > [ 168.990123][ T628] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1494)
> > [ 168.990539][ T628] ? __srcu_read_lock (kernel/rcu/srcutree.c:749)
> > [ 168.990858][ T628] ? rcu_torture_barrier_init (kernel/rcu/rcutorture.c:3381) rcutorture
> > [ 168.991319][ T628] ? siphash_4u64 (lib/siphash.c:203)
> > [ 168.991618][ T628] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1494)
> > [ 168.992021][ T628] ? __srcu_read_lock (kernel/rcu/srcutree.c:749)
> > [ 168.992340][ T628] srcu_read_lock (include/linux/srcu.h:165 include/linux/srcu.h:257) rcutorture
> > [ 168.992735][ T628] srcu_torture_read_lock (kernel/rcu/rcutorture.c:693) rcutorture
> > [ 168.993184][ T628] rcu_torture_writer (kernel/rcu/rcutorture.c:1528) rcutorture
> > [ 168.993615][ T628] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:87 arch/x86/include/asm/irqflags.h:147 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
> > [ 168.994020][ T628] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:80 (discriminator 13))
> > [ 168.994369][ T628] kthread (kernel/kthread.c:391)
> > [ 168.994647][ T628] ? rcu_torture_pipe_update (kernel/rcu/rcutorture.c:1447) rcutorture
> > [ 168.995108][ T628] ? list_del_init (include/linux/lockdep.h:248)
> > [ 168.995428][ T628] ret_from_fork (arch/x86/kernel/process.c:153)
> > [ 168.995735][ T628] ? list_del_init (include/linux/lockdep.h:248)
> > [ 168.996053][ T628] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> > [ 168.996380][ T628] entry_INT80_32 (arch/x86/entry/entry_32.S:942)
> > [ 168.996692][ T628] Modules linked in: rcutorture(+) torture intel_rapl_msr intel_rapl_common iosf_mbi crc32c_intel aesni_intel input_leds led_class fuse
> > [ 168.997654][ T628] CR2: 000000002367a000
> > [ 168.997945][ T628] ---[ end trace 0000000000000000 ]---
> >
> >
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20241231/202412311203.ca7bddba-lkp@xxxxxxxxx
> >
> >
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> >
> >