Hello, kernel test robot noticed "WARNING:lock_held_when_returning_to_user_space" on: commit: e85c0d9e725529a5ed68ad0b6abc62b332654156 ("proc: wean of off f_version") https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git work.f_version.wip.v1 in testcase: trinity version: trinity-i386-abe9de86-1_20230429 with following parameters: runtime: 300s group: group-01 nr_groups: 5 compiler: clang-18 test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G (please refer to attached dmesg/kmsg for entire log/backtrace) in our tests, the issue doesn't always happen, about 47 times out of 200 runs as below. but the parent keeps clean. 0aa860d0152c07db e85c0d9e725529a5ed68ad0b6ab ---------------- --------------------------- fail:runs %reproduction fail:runs | | | :200 24% 47:200 dmesg.BUG:scheduling_while_atomic :200 24% 47:200 dmesg.WARNING:lock_held_when_returning_to_user_space :200 24% 47:200 dmesg.is_leaving_the_kernel_with_locks_still_held If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx> | Closes: https://lore.kernel.org/oe-lkp/202409032134.c262dced-lkp@xxxxxxxxx [ 244.794661][ T3669] WARNING: lock held when returning to user space! [ 244.795232][ T3669] 6.11.0-rc4-00021-ge85c0d9e7255 #1 Not tainted [ 244.795771][ T3669] ------------------------------------------------ [ 244.796336][ T3669] trinity-c0/3669 is leaving the kernel with locks still held! [ 244.796995][ T3669] 1 lock held by trinity-c0/3669: [ 244.797435][ T3669] #0: ec876814 (&f->f_lock){+.+.}-{2:2}, at: generic_versioned_llseek (include/linux/spinlock.h:? fs/read_write.c:195 fs/read_write.c:220) [ 244.798313][ T3669] BUG: scheduling while atomic: trinity-c0/3669/0x00000002 [ 244.798960][ T3669] INFO: lockdep is turned off. [ 244.799396][ T3669] Modules linked in: [ 244.799732][ T3669] Preemption disabled at: [ 244.799734][ T3669] generic_versioned_llseek (include/linux/spinlock.h:? fs/read_write.c:195 fs/read_write.c:220) [ 244.800712][ T3669] CPU: 0 UID: 65534 PID: 3669 Comm: trinity-c0 Not tainted 6.11.0-rc4-00021-ge85c0d9e7255 #1 [ 244.801612][ T3669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 244.802509][ T3669] Call Trace: [ 244.802803][ T3669] dump_stack_lvl (kernel/printk/printk.c:359) [ 244.803230][ T3669] ? generic_versioned_llseek (include/linux/spinlock.h:? fs/read_write.c:195 fs/read_write.c:220) [ 244.803716][ T3669] __schedule_bug (lib/dump_stack.c:128 kernel/sched/core.c:5734) [ 244.804116][ T3669] __schedule (arch/x86/include/asm/preempt.h:33 kernel/sched/core.c:5762 kernel/sched/core.c:6411) [ 244.804500][ T3669] ? tick_program_event (kernel/time/tick-oneshot.c:44) [ 244.804952][ T3669] schedule (kernel/sched/core.c:6607 kernel/sched/core.c:6621) [ 244.805309][ T3669] irqentry_exit_to_user_mode (kernel/entry/common.c:? include/linux/entry-common.h:328 kernel/entry/common.c:231) [ 244.805817][ T3669] ? sysvec_hyperv_stimer0 (arch/x86/kernel/apic/apic.c:1043) [ 244.806280][ T3669] irqentry_exit (kernel/entry/common.c:367) [ 244.806690][ T3669] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043) [ 244.807225][ T3669] handle_exception (arch/x86/entry/entry_32.S:1054) [ 244.807678][ T3669] EIP: 0x77fad092 [ 244.808003][ T3669] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 f8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00 All code ======== 0: 00 00 add %al,(%rax) 2: 00 e9 add %ch,%cl 4: 90 nop 5: ff (bad) 6: ff (bad) 7: ff (bad) 8: ff a3 24 00 00 00 jmp *0x24(%rbx) e: 68 30 00 00 00 push $0x30 13: e9 80 ff ff ff jmp 0xffffffffffffff98 18: ff a3 f8 ff ff ff jmp *-0x8(%rbx) 1e: 66 90 xchg %ax,%ax ... 28: cd 80 int $0x80 2a:* c3 ret <-- trapping instruction 2b: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi 32: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi 38: 8b 1c 24 mov (%rsp),%ebx 3b: c3 ret 3c: 8d .byte 0x8d 3d: b4 26 mov $0x26,%ah ... Code starting with the faulting instruction =========================================== 0: c3 ret 1: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi 8: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi e: 8b 1c 24 mov (%rsp),%ebx 11: c3 ret 12: 8d .byte 0x8d 13: b4 26 mov $0x26,%ah ... [ 244.809763][ T3669] EAX: ffffffea EBX: 00000015 ECX: 0000004c EDX: 20000000 [ 244.810399][ T3669] ESI: fffffff8 EDI: 00000001 EBP: 3d77df35 ESP: 7fa71c78 [ 244.811032][ T3669] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000216 [ 244.811752][ T3669] ? sysvec_hyperv_stimer0 (arch/x86/kernel/apic/apic.c:1043) [ 244.834705][ T225] [main] kernel became tainted! (512/0) Last seed was 2147483651 [ 244.834712][ T225] [ 244.837382][ T225] trinity: Detected kernel tainting. Last seed was 2147483651 [ 244.837388][ T225] [ 244.888320][ T225] [main] exit_reason=7, but 2 children still running. [ 244.888332][ T225] [ 246.959451][ T225] [main] Bailing main loop because kernel became tainted.. [ 246.959463][ T225] [ 247.099951][ T225] [main] Ran 908890 syscalls. Successes: 341513 Failures: 567373 [ 247.099963][ T225] The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20240903/202409032134.c262dced-lkp@xxxxxxxxx -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki