Hello Bart, On Tue, 2022-11-15 at 14:49 -0800, Bart Van Assche wrote: > Modify alua_rtpg_queue() such that it only requests the caller to > drop > the sdev reference if necessary. This patch fixes a recently > introduced > regression. > > Cc: Sachin Sant <sachinp@xxxxxxxxxxxxx> > Cc: Hannes Reinecke <hare@xxxxxxx> > Cc: Martin Wilck <mwilck@xxxxxxxx> > Reported-by: Sachin Sant <sachinp@xxxxxxxxxxxxx> > Fixes: 0b25e17e9018 ("scsi: alua: Move a scsi_device_put() call out > of alua_check_vpd()") > Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx> Is this complexity really necessary, just for calling scsi_device_put()? 0b25e17e9018 is supposed to avoid sleeping under &h->pg_lock. Have you considered simply not calling alua_rtpg_queue() with this lock held? alua_rtpg_queue() accesses no data that is protected by &h->pg_lock (which is just h->pg, AFAIU). Would it perhaps be sufficient to just make sure we keep a kref for the pg struct in alua_check_vpd(), like the patch below (on mkp/queue, meant as an alternative to 0b25e17e9018)? Regards Martin >From 04df5933239921e7e7ac00e9ec0558124b050a51 Mon Sep 17 00:00:00 2001 From: Martin Wilck <mwilck@xxxxxxxx> Date: Wed, 16 Nov 2022 11:24:58 +0100 Subject: [PATCH] scsi: alua: alua_check_vpd: drop pg_lock before calling alua_rtpg_queue Since commit f93ed747e2c7 ("scsi: core: Release SCSI devices synchronously"), scsi_device_put() might sleep. Avoid calling it from alua_rtpg_queue() with the pg_lock held. The lock only pretects h->pg, anyway. To avoid the pg being freed under us, because of a race with another thread, take a temporary reference. In alua_rtpg_queue(), verify that the pg still belongs to the sdev being passed before actually queueing the RTPG. Signed-off-by: Martin Wilck <mwilck@xxxxxxxx> --- drivers/scsi/device_handler/scsi_dh_alua.c | 30 +++++++++++++++------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index 610a51538f03..905b49493e01 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -372,12 +372,13 @@ static int alua_check_vpd(struct scsi_device *sdev, struct alua_dh_data *h, if (pg_updated) list_add_rcu(&h->node, &pg->dh_list); spin_unlock_irqrestore(&pg->lock, flags); - - alua_rtpg_queue(rcu_dereference_protected(h->pg, - lockdep_is_held(&h->pg_lock)), - sdev, NULL, true); spin_unlock(&h->pg_lock); + if (kref_get_unless_zero(&pg->kref)) { + alua_rtpg_queue(pg, sdev, NULL, true); + kref_put(&pg->kref, release_port_group); + } + if (old_pg) kref_put(&old_pg->kref, release_port_group); @@ -986,11 +987,22 @@ static bool alua_rtpg_queue(struct alua_port_group *pg, force = true; } if (pg->rtpg_sdev == NULL) { - pg->interval = 0; - pg->flags |= ALUA_PG_RUN_RTPG; - kref_get(&pg->kref); - pg->rtpg_sdev = sdev; - start_queue = 1; + struct alua_dh_data *h = sdev->handler_data; + struct alua_port_group *sdev_pg = NULL; + + if (h) { + rcu_read_lock(); + sdev_pg = rcu_dereference(h->pg); + rcu_read_unlock(); + } + + if (pg == sdev_pg) { + pg->flags |= ALUA_PG_RUN_RTPG; + pg->interval = 0; + kref_get(&pg->kref); + pg->rtpg_sdev = sdev; + start_queue = 1; + } } else if (!(pg->flags & ALUA_PG_RUN_RTPG) && force) { pg->flags |= ALUA_PG_RUN_RTPG; /* Do not queue if the worker is already running */ -- 2.38.0