Re: [PATCH] scsi: alua: fix the race between alua_bus_detach and alua_rtpg

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/18/20 5:49 AM, jitendra.khasdev@xxxxxxxxxx wrote:


On 9/17/20 11:00 PM, Ewan D. Milne wrote:
On Tue, 2020-09-15 at 16:28 +0530, Jitendra Khasdev wrote:
This is patch to fix the race occurs between bus detach and alua_rtpg.

It fluses the all pending workqueue in bus detach handler, so it can avoid
race between alua_bus_detach and alua_rtpg.

Here is call trace where race got detected.

multipathd call stack:
[exception RIP: native_queued_spin_lock_slowpath+100]
--- <NMI exception stack> ---
native_queued_spin_lock_slowpath at ffffffff89307f54
queued_spin_lock_slowpath at ffffffff89307c18
_raw_spin_lock_irq at ffffffff89bd797b
alua_bus_detach at ffffffff8984dcc8
scsi_dh_release_device at ffffffff8984b6f2
scsi_device_dev_release_usercontext at ffffffff89846edf
execute_in_process_context at ffffffff892c3e60
scsi_device_dev_release at ffffffff8984637c
device_release at ffffffff89800fbc
kobject_cleanup at ffffffff89bb1196
kobject_put at ffffffff89bb12ea
put_device at ffffffff89801283
scsi_device_put at ffffffff89838d5b
scsi_disk_put at ffffffffc051f650 [sd_mod]
sd_release at ffffffffc051f8a2 [sd_mod]
__blkdev_put at ffffffff8952c79e
blkdev_put at ffffffff8952c80c
blkdev_close at ffffffff8952c8b5
__fput at ffffffff894e55e6
____fput at ffffffff894e57ee
task_work_run at ffffffff892c94dc
exit_to_usermode_loop at ffffffff89204b12
do_syscall_64 at ffffffff892044da
entry_SYSCALL_64_after_hwframe at ffffffff89c001b8

kworker:
[exception RIP: alua_rtpg+2003]
account_entity_dequeue at ffffffff892e42c1
alua_rtpg_work at ffffffff8984f097
process_one_work at ffffffff892c4c29
worker_thread at ffffffff892c5a4f
kthread at ffffffff892cb135
ret_from_fork at ffffffff89c00354

Signed-off-by: Jitendra Khasdev <jitendra.khasdev@xxxxxxxxxx>
---
  drivers/scsi/device_handler/scsi_dh_alua.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index f32da0c..024a752 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -1144,6 +1144,9 @@ static void alua_bus_detach(struct scsi_device *sdev)
  	struct alua_dh_data *h = sdev->handler_data;
  	struct alua_port_group *pg;
+ sdev_printk(KERN_INFO, sdev, "%s: flushing workqueues\n", ALUA_DH_NAME);
+	flush_workqueue(kaluad_wq);
+
  	spin_lock(&h->pg_lock);
  	pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock));
  	rcu_assign_pointer(h->pg, NULL);

I'm not sure this is the best solution.  The current code
references h->sdev when the dh_list is traversed.  So it needs
to remain valid.  Fixing it by flushing the workqueue to avoid
the list traversal code running leaves open the possibility that
future code alterations may expose this problem again.

-Ewan



I see your point, but as we are in detach handler and this code path
only execute when device is being detached. So, before detaching, flush
work-queue will take care of any current code references h->sdev where
dh_list is being traversed.

Flushing the workqueue is a bit of an overkill, seeing that we know exactly which workqueue element we're waiting for.

IMO, I do not think it would create any problem for future code
alterations. Or may be I am missing something over here, what could
be possible scenario for that?

Problem is more that I'd like to understand where exactly the race condition is. Can you figure out which spinlock is triggering in your stack trace?

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux