Re: [bug report] kernel panic at _find_next_zero_bit in io_uring testing

On 3/25/23 1:25 AM, Guangwu Zhang wrote:
> Hello,
> 
> We found this kernel panic issue with upstream kernel 6.3.0-rc3 and
> it's 100% reproduced, let me know if you need more info/testing,
> thanks
> 
> kernel repo : https://github.com/torvalds/linux.git
> reproducer :  run the testing from  git://git.kernel.dk/liburing
> 
> [ 1089.762678] Running test recv-msgall-stream.t:
> [ 1089.922127] Running test recv-multishot.t:
> [ 1090.231772] Running test reg-hint.t:
> [ 1090.282612] general protection fault, probably for non-canonical
> address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 1090.294014] KASAN: null-ptr-deref in range
> [0x0000000000000000-0x0000000000000007]
> [ 1090.301586] CPU: 3 PID: 36765 Comm: reg-hint.t Kdump: loaded Not
> tainted 6.3.0-rc3.kasan+ #1
> [ 1090.310035] Hardware name: Dell Inc. PowerEdge R640/0X45NX, BIOS
> 2.16.1 08/17/2022
> [ 1090.317612] RIP: 0010:_find_next_zero_bit+0x47/0x110
> [ 1090.322603] Code: 55 48 c7 c5 ff ff ff ff 48 d3 e5 48 c1 e9 06 53
> 48 89 fb 4c 8d 2c cd 00 00 00 00 4e 8d 24 2f 4c 89 e6 48 83 ec 10 48
> c1 ee 03 <80> 3c 16 00 0f 85 9a 00 00 00 49 8b 34 24 48 f7 d6 48 21 ee
> 75 4b
> [ 1090.341358] RSP: 0018:ffff88848659fb68 EFLAGS: 00010246
> [ 1090.346601] RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000000
> [ 1090.353742] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 1090.360882] RBP: ffffffffffffffff R08: ffff888118b22704 R09: ffff888118b22717
> [ 1090.368024] R10: ffffed10231644e2 R11: 0000000000000001 R12: 0000000000000000
> [ 1090.375167] R13: 0000000000000000 R14: ffff8884c5c1a000 R15: 0000000000000000
> [ 1090.382307] FS:  00007fd9683a6740(0000) GS:ffff88887f680000(0000)
> knlGS:0000000000000000
> [ 1090.390403] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1090.396158] CR2: 0000000000404060 CR3: 0000000486274006 CR4: 00000000007706e0
> [ 1090.403297] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1090.410439] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1090.417580] PKRU: 55555554
> [ 1090.420292] Call Trace:
> [ 1090.422748]  <TASK>
> [ 1090.424862]  __io_fixed_fd_install+0x136/0x1d0
> [ 1090.429328]  io_fixed_fd_install+0x4c/0xc0
> [ 1090.433444]  io_socket+0x282/0x3b0
> [ 1090.436866]  io_issue_sqe+0x153/0xeb0
> [ 1090.440549]  io_submit_sqes+0x41d/0xcd0
> [ 1090.444407]  __do_sys_io_uring_enter+0x4e9/0x830
> [ 1090.449044]  ? __pfx___do_sys_io_uring_enter+0x10/0x10
> [ 1090.454197]  ? __pfx___handle_mm_fault+0x10/0x10
> [ 1090.458833]  ? __pfx_mt_find+0x10/0x10
> [ 1090.462601]  do_syscall_64+0x59/0x90
> [ 1090.466196]  ? handle_mm_fault+0x1a0/0x660
> [ 1090.470311]  ? up_read+0x1c/0xb0
> [ 1090.473560]  ? do_user_addr_fault+0x313/0xeb0
> [ 1090.477935]  ? syscall_exit_work+0x103/0x130
> [ 1090.482227]  ? exc_page_fault+0x57/0xc0
> [ 1090.486084]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 1090.491153] RIP: 0033:0x402f3e
> [ 1090.494221] Code: 41 89 ca 8b ba cc 00 00 00 41 b9 08 00 00 00 b8
> aa 01 00 00 41 83 ca 10 f6 82 d0 00 00 00 01 44 0f 44 d1 45 31 c0 31
> d2 0f 05 <c3> 90 89 30 eb 99 0f 1f 40 00 8b 3f 45 31 c0 83 e7 06 41 0f
> 95 c0
> [ 1090.512975] RSP: 002b:00007ffeb6c4ee98 EFLAGS: 00000246 ORIG_RAX:
> 00000000000001aa
> [ 1090.520556] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000402f3e
> [ 1090.527697] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000003
> [ 1090.534830] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000008
> [ 1090.541965] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffeb6c4f0a8
> [ 1090.549106] R13: 0000000000401860 R14: 0000000000406e08 R15: 00007fd9683e9000
> [ 1090.556253]  </TASK>

What Ming said, but also there's never any need to report a bug caused
by a liburing regression test, since we're the ones who write those
anyway... The only exception is if it's triggering something that
didn't trigger before, i.e. we've regressed. If you look at the test
case you reported, I just added it a few days ago, and its intent
is very much to trigger this bug, which got fixed here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=02a4d923e4400a36d340ea12d8058f69ebf3a383
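
For anyone decoding the quoted trace: the "non-canonical address
0xdffffc0000000000" looks like a KASAN shadow-memory lookup for a NULL
pointer rather than a wild pointer. The registers fit that reading (RSI is 0
after the shift, and RDX holds the shadow offset). Below is a minimal sketch
of the arithmetic, assuming the generic x86-64 KASAN layout where one shadow
byte covers 8 bytes and the shadow offset is 0xdffffc0000000000. The helper
name shadow_to_range is made up for illustration and is not a kernel API.

```python
# Assumption: generic KASAN on x86-64, where
#   shadow_addr = (addr >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xdffffc0000000000
KASAN_SHADOW_SCALE_SHIFT = 3  # one shadow byte per 8 bytes of memory


def shadow_to_range(shadow_addr):
    """Hypothetical helper: map a faulting shadow address back to the
    inclusive range of original addresses that the shadow byte covers."""
    start = (shadow_addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT
    end = start + (1 << KASAN_SHADOW_SCALE_SHIFT) - 1
    return start, end


# Faulting address taken from the general protection fault line in the report.
start, end = shadow_to_range(0xdffffc0000000000)
print(f"[{start:#018x}-{end:#018x}]")
```

Under those assumptions this prints the same range KASAN reported,
[0x0000000000000000-0x0000000000000007], which is consistent with
_find_next_zero_bit being handed a NULL bitmap pointer (RDI is 0 in the
register dump).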

-- 
Jens Axboe




