[bug report] KASAN slab-use-after-free at blktests srp/002 with siw driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



As I noted in another thread [1], KASAN slab-use-after-free is observed when
I repeat the blktests test case srp/002 with the siw driver [2]. The kernel
version was v6.10-rc2. The failure is recreated in stable manner when the test
case is repeated around 30 times. It was not observed with the rxe driver.

I think this failure is same as that I reported in Jun/2023 [3]. The Call Trace
reported is quite similar. Also, I confirmed that the trial fix patch that I
created in Jun/2023 avoided the KASAN failure at srp/002.

In Jun/2023, the KASAN failure was observed with the test cases nvme/030 and
nvme/031. But the symptom disappeared in Sep/2023 [4]. I guess the failure has
got observable again with srp/002.

As for the root cause, it was advised that "There is something wrong with the
iwarp cm if it is destroying IDs in handlers" [5]. Actions for fix will be
appreciated. I'm willing to test fix patches.

[1] https://lore.kernel.org/linux-block/n2adhqzr6x5fss6jff7pxhubkkalvxeyesmg7jre4uomfcdudb@dwn3wgkqhmj7/

[2]

Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000006d1c31fe with status 5
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 00000000916ce050 with status 5
Jun 04 09:23:11 testnode2 kernel: ==================================================================
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000001770ef1b with status 5
Jun 04 09:23:11 testnode2 kernel: BUG: KASAN: slab-use-after-free in __mutex_lock+0x1110/0x13c0
Jun 04 09:23:11 testnode2 kernel: Read of size 8 at addr ffff888131a3e418 by task kworker/u16:6/1345
Jun 04 09:23:11 testnode2 kernel: 
Jun 04 09:23:11 testnode2 kernel: CPU: 1 PID: 1345 Comm: kworker/u16:6 Not tainted 6.10.0-rc2+ #288
Jun 04 09:23:11 testnode2 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
Jun 04 09:23:11 testnode2 kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm]
Jun 04 09:23:11 testnode2 kernel: Call Trace:
Jun 04 09:23:11 testnode2 kernel:  <TASK>
Jun 04 09:23:11 testnode2 kernel:  dump_stack_lvl+0x6a/0x90
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 00000000f727e5c2 with status 5
Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
Jun 04 09:23:11 testnode2 kernel:  print_report+0x174/0x505
Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
Jun 04 09:23:11 testnode2 kernel:  ? __virt_addr_valid+0x1b9/0x400
Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
Jun 04 09:23:11 testnode2 kernel:  kasan_report+0xa7/0x180
Jun 04 09:23:11 testnode2 kernel:  ? __mutex_lock+0x1110/0x13c0
Jun 04 09:23:11 testnode2 kernel:  __mutex_lock+0x1110/0x13c0
Jun 04 09:23:11 testnode2 kernel:  ? cma_iw_handler+0xac/0x500 [rdma_cm]
Jun 04 09:23:11 testnode2 kernel:  ? __lock_acquire+0x139d/0x5d60
Jun 04 09:23:11 testnode2 kernel:  ? __pfx___mutex_lock+0x10/0x10
Jun 04 09:23:11 testnode2 kernel:  ? mark_lock+0xf5/0x1580
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_mark_lock+0x10/0x10
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000009bc71497 with status 5
Jun 04 09:23:11 testnode2 kernel:  ? cma_iw_handler+0xac/0x500 [rdma_cm]
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 0000000041c0fa4b with status 5
Jun 04 09:23:11 testnode2 kernel:  cma_iw_handler+0xac/0x500 [rdma_cm]
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_cma_iw_handler+0x10/0x10 [rdma_cm]
Jun 04 09:23:11 testnode2 kernel:  ? mark_held_locks+0x94/0xe0
Jun 04 09:23:11 testnode2 kernel:  ? _raw_spin_unlock_irqrestore+0x4c/0x60
Jun 04 09:23:11 testnode2 kernel:  cm_work_handler+0xb54/0x1c50 [iw_cm]
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_cm_work_handler+0x10/0x10 [iw_cm]
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_lock_release+0x10/0x10
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 00000000f48094cb with status 5
Jun 04 09:23:11 testnode2 kernel:  process_one_work+0x865/0x1410
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000001c3faa8a with status 5
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_lock_acquire+0x10/0x10
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_process_one_work+0x10/0x10
Jun 04 09:23:11 testnode2 kernel:  ? assign_work+0x16c/0x240
Jun 04 09:23:11 testnode2 kernel:  ? lock_is_held_type+0xd5/0x130
Jun 04 09:23:11 testnode2 kernel:  worker_thread+0x5e2/0x1010
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_worker_thread+0x10/0x10
Jun 04 09:23:11 testnode2 kernel:  kthread+0x2d1/0x3a0
Jun 04 09:23:11 testnode2 kernel:  ? _raw_spin_unlock_irq+0x24/0x50
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_kthread+0x10/0x10
Jun 04 09:23:11 testnode2 kernel:  ret_from_fork+0x30/0x70
Jun 04 09:23:11 testnode2 kernel:  ? __pfx_kthread+0x10/0x10
Jun 04 09:23:11 testnode2 kernel:  ret_from_fork_asm+0x1a/0x30
Jun 04 09:23:11 testnode2 kernel:  </TASK>
Jun 04 09:23:11 testnode2 kernel: 
Jun 04 09:23:11 testnode2 kernel: Allocated by task 75327:
Jun 04 09:23:11 testnode2 kernel:  kasan_save_stack+0x2c/0x50
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000001bd9ea09 with status 5
Jun 04 09:23:11 testnode2 kernel:  kasan_save_track+0x10/0x30
Jun 04 09:23:11 testnode2 kernel:  __kasan_kmalloc+0xa6/0xb0
Jun 04 09:23:11 testnode2 kernel:  __rdma_create_id+0x5b/0x5d0 [rdma_cm]
Jun 04 09:23:11 testnode2 kernel:  __rdma_create_kernel_id+0x12/0x40 [rdma_cm]
Jun 04 09:23:11 testnode2 kernel:  srp_new_rdma_cm_id+0x7c/0x200 [ib_srp]
Jun 04 09:23:11 testnode2 kernel:  add_target_store+0x135e/0x29f0 [ib_srp]
Jun 04 09:23:11 testnode2 kernel: ib_srpt receiving failed for ioctx 000000005afc8065 with status 5
Jun 04 09:23:11 testnode2 kernel:  kernfs_fop_write_iter+0x3a4/0x5a0
Jun 04 09:23:11 testnode2 kernel:  vfs_write+0x5e3/0xe70
Jun 04 09:23:11 testnode2 kernel:  ksys_write+0xf7/0x1d0
Jun 04 09:23:11 testnode2 kernel:  do_syscall_64+0x93/0x180
Jun 04 09:23:11 testnode2 kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Jun 04 09:23:11 testnode2 kernel: 
Jun 04 09:23:11 testnode2 kernel: Freed by task 66344:
Jun 04 09:23:11 testnode2 kernel:  kasan_save_stack+0x2c/0x50
Jun 04 09:23:11 testnode2 kernel:  kasan_save_track+0x10/0x30
Jun 04 09:23:11 testnode2 kernel:  kasan_save_free_info+0x37/0x60
Jun 04 09:23:11 testnode2 kernel:  poison_slab_object+0x109/0x180
Jun 04 09:23:11 testnode2 kernel:  __kasan_slab_free+0x2e/0x50
Jun 04 09:23:11 testnode2 kernel:  kfree+0x11a/0x390
Jun 04 09:23:11 testnode2 kernel:  srp_free_ch_ib+0x895/0xc80 [ib_srp]
Jun 04 09:23:11 testnode2 kernel:  srp_remove_work+0x309/0x6c0 [ib_srp]
Jun 04 09:23:11 testnode2 kernel:  process_one_work+0x865/0x1410
Jun 04 09:23:11 testnode2 kernel:  worker_thread+0x5e2/0x1010
Jun 04 09:23:11 testnode2 kernel:  kthread+0x2d1/0x3a0
Jun 04 09:23:11 testnode2 kernel:  ret_from_fork+0x30/0x70
Jun 04 09:23:11 testnode2 kernel:  ret_from_fork_asm+0x1a/0x30
Jun 04 09:23:11 testnode2 kernel: 
Jun 04 09:23:11 testnode2 kernel: The buggy address belongs to the object at ffff888131a3e000
                                   which belongs to the cache kmalloc-2k of size 2048
Jun 04 09:23:11 testnode2 kernel: The buggy address is located 1048 bytes inside of
                                   freed 2048-byte region [ffff888131a3e000, ffff888131a3e800)
Jun 04 09:23:11 testnode2 kernel: 
Jun 04 09:23:11 testnode2 kernel: The buggy address belongs to the physical page:
Jun 04 09:23:11 testnode2 kernel: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888131a38000 pfn:0x131a38
Jun 04 09:23:11 testnode2 kernel: head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
Jun 04 09:23:11 testnode2 kernel: flags: 0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
Jun 04 09:23:11 testnode2 kernel: page_type: 0xffffefff(slab)
Jun 04 09:23:11 testnode2 kernel: raw: 0017ffffc0000240 ffff888100042f00 ffffea0004c89610 ffffea0004a3c010
Jun 04 09:23:11 testnode2 kernel: raw: ffff888131a38000 0000000000080006 00000001ffffefff 0000000000000000
Jun 04 09:23:11 testnode2 kernel: head: 0017ffffc0000240 ffff888100042f00 ffffea0004c89610 ffffea0004a3c010
Jun 04 09:23:11 testnode2 kernel: head: ffff888131a38000 0000000000080006 00000001ffffefff 0000000000000000
Jun 04 09:23:11 testnode2 kernel: head: 0017ffffc0000003 ffffea0004c68e01 ffffffffffffffff 0000000000000000
Jun 04 09:23:11 testnode2 kernel: head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
Jun 04 09:23:11 testnode2 kernel: page dumped because: kasan: bad access detected
Jun 04 09:23:11 testnode2 kernel: 
Jun 04 09:23:11 testnode2 kernel: Memory state around the buggy address:
Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Jun 04 09:23:11 testnode2 kernel: >ffff888131a3e400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Jun 04 09:23:11 testnode2 kernel:                             ^
Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Jun 04 09:23:11 testnode2 kernel:  ffff888131a3e500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Jun 04 09:23:11 testnode2 kernel: ==================================================================
Jun 04 09:23:11 testnode2 kernel: Disabling lock debugging due to kernel taint
Jun 04 09:23:11 testnode2 kernel: device-mapper: multipath: 253:2: Failing path 8:80.
Jun 04 09:23:11 testnode2 kernel: device-mapper: uevent: dm_send_uevents: skipping sending uevent for lost device
...

[3] https://lore.kernel.org/linux-rdma/20230612054237.1855292-1-shinichiro.kawasaki@xxxxxxx/
[4] https://lore.kernel.org/linux-rdma/g2lh3wh6e6yossw2ktqmxx2rf63m36mumqmx4qbtzvxuygsr6h@gpgftgfigllv/
[5] https://lore.kernel.org/linux-rdma/ZIn6ul5jPuxC+uIG@xxxxxxxx/




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux