On 3/1/23 3:15 PM, Mike Christie wrote: > On 2/28/23 9:40 PM, zhongjinghua wrote: >>> 在 2023/2/13 11:43, Zhong Jinghua 写道: >>>> From: Zhong Jinghua <zhongjinghua@xxxxxxxxxx> >>>> >>>> A use-after-free problem like below: >>>> >>>> BUG: KASAN: use-after-free in scsi_target_reap+0x6c/0x70 >>>> >>>> Workqueue: scsi_wq_1 __iscsi_unbind_session [scsi_transport_iscsi] >>>> Call trace: >>>> dump_backtrace+0x0/0x320 >>>> show_stack+0x24/0x30 >>>> dump_stack+0xdc/0x128 >>>> print_address_description+0x68/0x278 >>>> kasan_report+0x1e4/0x308 >>>> __asan_report_load4_noabort+0x30/0x40 >>>> scsi_target_reap+0x6c/0x70 >>>> scsi_remove_target+0x430/0x640 >>>> __iscsi_unbind_session+0x164/0x268 [scsi_transport_iscsi] >>>> process_one_work+0x67c/0x1350 >>>> worker_thread+0x370/0xf90 >>>> kthread+0x2a4/0x320 >>>> ret_from_fork+0x10/0x18 >>>> >>>> The problem is caused by a concurrency scenario: >>>> >>>> T0: delete target >>>> // echo 1 > /sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete >>>> T1: logout >>>> // iscsiadm -m node --logout >>>> >>>> T0 T1 >>>> sdev_store_delete >>>> scsi_remove_device >>>> device_remove_file >>>> __scsi_remove_device >>>> __iscsi_unbind_session >>>> scsi_remove_target >>>> spin_lock_irqsave >>>> list_for_each_entry >>>> scsi_target_reap // starget->reaf 1 -> 0 >>>> kref_get(&starget->reap_ref); >>>> // warn use-after-free. >>>> spin_unlock_irqrestore >>>> scsi_target_reap_ref_release >>>> scsi_target_destroy >>>> ... // delete starget >>>> scsi_target_reap >>>> // UAF >>>> >>>> When T0 reduces the reference count to 0, but has not been released, >>>> T1 can still enter list_for_each_entry, and then kref_get reports UAF. >>>> >>>> Fix it by using kref_get_unless_zero() to check for a reference count of >>>> 0. >>>> >>>> Signed-off-by: Zhong Jinghua <zhongjinghua@xxxxxxxxxx> >>>> --- >>>> drivers/scsi/scsi_sysfs.c | 12 +++++++++++- >>>> 1 file changed, 11 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c >>>> index e7893835b99a..0ad357ff4c59 100644 >>>> --- a/drivers/scsi/scsi_sysfs.c >>>> +++ b/drivers/scsi/scsi_sysfs.c >>>> @@ -1561,7 +1561,17 @@ void scsi_remove_target(struct device *dev) >>>> starget->state == STARGET_CREATED_REMOVE) >>>> continue; >>>> if (starget->dev.parent == dev || &starget->dev == dev) { >>>> - kref_get(&starget->reap_ref); >>>> + >>>> + /* >>>> + * If starget->reap_ref is reduced to 0, it means >>>> + * that other processes are releasing it and >>>> + * there is no need to delete it again >>>> + */ >>>> + if (!kref_get_unless_zero(&starget->reap_ref)) { >>>> + spin_unlock_irqrestore(shost->host_lock, flags); >>>> + goto restart; >>>> + } >>>> + > > Patch looks ok. > > Is there another bug in the existing kref_get_unless_zero(&starget->reap_ref) > call in scsi_alloc_target? > > I think scsi_alloc_target can find the target on the __targets list, and > it's call to kref_get_unless_zero will succeed if we are only above getting > our own ref (we have not done __scsi_remove_target and have not done the > scsi_target_reap call at the end of the function). > > But if scsi_remove_target has set the target state to STARGET_REMOVE, the thread > that did scsi_alloc_target wouldn't be able to put the target into the correct state > (the scsi_target_add call will see the target state and return). So later if the > driver/transport class did scsi_remove_target again to remove the target that > the scsi_alloc_target call re-added, we see the target->state still in STARGET_REMOVE > and it won't get deleted. > > Can we solve both issues at the same time? I looked into this last part of my comment, and I don't think it's possible. I thought we could just change around when we add/delete the target from the __targets list and when the target_alloc/destroy callouts are done, but that is more difficult than it looks.