On Wed, Oct 16, 2024 at 04:50:56PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:unable_to_handle_page_fault_for_address" on:

Thanks, see below for analysis.

>
> commit: e65dbb5c9051a4da2305787fd558e1d60de2275a ("[PATCH v2 1/3] pidfd: extend pidfd_get_pid() and de-duplicate pid lookup")
> url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/pidfd-extend-pidfd_get_pid-and-de-duplicate-pid-lookup/20241011-191241
> base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
> patch link: https://lore.kernel.org/all/8e7edaf2f648fb01a71def749f17f76c0502dee1.1728643714.git.lorenzo.stoakes@xxxxxxxxxx/
> patch subject: [PATCH v2 1/3] pidfd: extend pidfd_get_pid() and de-duplicate pid lookup
>
> in testcase: trinity
> version: trinity-i386-abe9de86-1_20230429
> with following parameters:
>
> runtime: 600s
>
>
>
> config: x86_64-randconfig-072-20241015
> compiler: gcc-12
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202410161634.abca3854-lkp@xxxxxxxxx
>
>
> [ 416.054386][ T1959] BUG: unable to handle page fault for address: ffffffff8fed9474
> [ 416.055651][ T1959] #PF: supervisor write access in kernel mode
> [ 416.056550][ T1959] #PF: error_code(0x0003) - permissions violation
> [ 416.057502][ T1959] PGD 3e90f5067 P4D 3e90f5067 PUD 3e90f6063 PMD 3e50001a1
> [ 416.058587][ T1959] Oops: Oops: 0003 [#1] PREEMPT SMP KASAN
> [ 416.059414][ T1959] CPU: 1 UID: 65534 PID: 1959 Comm: trinity-c3 Not tainted 6.12.0-rc1-00004-ge65dbb5c9051 #1 d7a38916ac9252f968706afc2c77f70fbdabe689
> [ 416.061328][ T1959] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 416.062850][ T1959] RIP: 0010:fput (arch/x86/include/asm/atomic64_64.h:61 include/linux/atomic/atomic-arch-fallback.h:4404 include/linux/atomic/atomic-long.h:1571 include/linux/atomic/atomic-instrumented.h:4540 fs/file_table.c:482)
> [ 416.063578][ T1959] Code: ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 48 89 e5 41 55 41 54 53 48 89 fb be 08 00 00 00 e8 96 c6 f7 ff <f0> 48 ff 0b 0f 85 dd 00 00 00 65 4c 8b 25 04 ff 0e 70 4c 8d 6b 48
> All code
> ========
> 0: ff (bad)
> 1: ff 66 66 jmp *0x66(%rsi)
> 4: 2e 0f 1f 84 00 00 00 cs nopl 0x0(%rax,%rax,1)
> b: 00 00
> d: 0f 1f 00 nopl (%rax)
> 10: f3 0f 1e fa endbr64
> 14: 55 push %rbp
> 15: 48 89 e5 mov %rsp,%rbp
> 18: 41 55 push %r13
> 1a: 41 54 push %r12
> 1c: 53 push %rbx
> 1d: 48 89 fb mov %rdi,%rbx
> 20: be 08 00 00 00 mov $0x8,%esi
> 25: e8 96 c6 f7 ff call 0xfffffffffff7c6c0
> 2a:* f0 48 ff 0b lock decq (%rbx) <-- trapping instruction

OK so this looks like the fput() invoking atomic_long_dec_and_test() on an
invalid &file->f_count.

It looks like 0xffffffff8fed9474 in RBX is the file...

And that's because f is not set in SYSCALL_DEFINE4(pidfd_send_signal, ...)
when this call fails:

	pidfd_to_pid_proc(pidfd, &f_flags, &f);

And yet on error we still jump to:

err:
	fdput(f);
	return ret;

Which then tries to fdput() (and thus fput()) the f that was never set, ugh.

OK I will fix this + respin, thanks for the report!

[snip]
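For illustration only, here is a minimal userspace sketch of the pattern
described above. It is not the actual fix and the respin may well look
different. All names here (model_lookup(), model_fdput(),
model_send_signal(), put_calls) are hypothetical, and struct file / struct fd
are heavily simplified stand-ins for the kernel types. The idea is simply that
the out-parameter has a defined value on every path, so the shared err: label
can release it unconditionally.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel's types. */
struct file { long f_count; };
struct fd { struct file *file; };

/* Counts how many times a reference was actually dropped. */
static int put_calls;

/* Drop the reference held by f, if any. An empty fd is a no-op. */
static void model_fdput(struct fd f)
{
	if (!f.file)
		return;
	assert(f.file->f_count > 0);
	f.file->f_count--;
	put_calls++;
}

/*
 * Hypothetical lookup helper with an out-parameter. The point of the
 * sketch: *out is given a defined value before any error return, so a
 * caller that cleans up after a failure never sees stack garbage.
 */
static int model_lookup(int fd, struct file *backing, struct fd *out)
{
	out->file = NULL;
	if (fd < 0)
		return -EBADF;
	backing->f_count++;	/* take a reference */
	out->file = backing;
	return 0;
}

/* Caller with a single exit label, as in the code quoted above. */
static int model_send_signal(int fd, struct file *backing)
{
	struct fd f = { NULL };	/* belt and braces: also initialise here */
	int ret;

	ret = model_lookup(fd, backing, &f);
	if (ret)
		goto err;

	/* ... the real work would happen here ... */

err:
	model_fdput(f);		/* safe on both the success and error paths */
	return ret;
}
```

Either of the two measures (initialising f in the caller, or having the helper
clear it before returning an error) is enough on its own in this model; an
alternative would be a separate error label that skips the put entirely.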