On Tue, 2017-04-18 at 08:56 -0700, James Bottomley wrote:
> How about this approach. It goes straight to DEL if the device is
> blocked (skipping CANCEL). This means that all the commands issued in
> ->shutdown will error in the mid-layer, thus making the removal proceed
> without being stopped.
Hello James,
The three attached patches pass my tests. Please let me know how you would
like to proceed with patch 1/3: would you prefer to submit it yourself, or
may I list you as the author and add your Signed-off-by?
Thanks,
Bart.
From 9482fdc8b322f15ced6f64d57f45026367604a23 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date: Tue, 18 Apr 2017 10:11:02 -0700
Subject: [PATCH 1/3] Make __scsi_remove_device go straight from BLOCKED to DEL
If a device is blocked, make __scsi_remove_device() cause it to
transition to the DEL state. This means that all the commands
issued in .shutdown() will error in the mid-layer, thus making
the removal proceed without being stopped.
This patch is a slightly modified version of a patch from James
Bottomley.
Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Cc: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
Cc: Israel Rukshin <israelr@xxxxxxxxxxxx>
Cc: Max Gurtovoy <maxg@xxxxxxxxxxxx>
Cc: Hannes Reinecke <hare@xxxxxxx>
Cc: Benjamin Block <bblock@xxxxxxxxxxxxxxxxxx>
---
drivers/scsi/scsi_lib.c | 2 +-
drivers/scsi/scsi_sysfs.c | 12 +++++++++++-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index eecc005099b2..277c8b3ae7b0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2611,7 +2611,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
case SDEV_QUIESCE:
case SDEV_OFFLINE:
case SDEV_TRANSPORT_OFFLINE:
- case SDEV_BLOCK:
break;
default:
goto illegal;
@@ -2625,6 +2624,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
case SDEV_OFFLINE:
case SDEV_TRANSPORT_OFFLINE:
case SDEV_CANCEL:
+ case SDEV_BLOCK:
case SDEV_CREATED_BLOCK:
break;
default:
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 82dfe07b1d47..f95d191ec809 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1282,9 +1282,19 @@ void __scsi_remove_device(struct scsi_device *sdev)
return;
if (sdev->is_visible) {
- if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
+ /*
+ * If blocked, we go straight to DEL so any commands
+ * issued during the driver shutdown (like sync cache)
+ * are errored.
+ */
+ if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0 &&
+ scsi_device_set_state(sdev, SDEV_DEL) != 0)
return;
+ if (sdev->sdev_state == SDEV_DEL)
+ sdev_printk(KERN_DEBUG, sdev,
+ "Changed state from BLOCKED to DEL\n");
+
bsg_unregister_queue(sdev->request_queue);
device_unregister(&sdev->sdev_dev);
transport_remove_device(dev);
--
2.12.2
From c3f85b714fcfb12d43669b7f295a09f4718c2704 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date: Tue, 28 Mar 2017 14:00:17 -0700
Subject: [PATCH 2/3] Introduce scsi_start_queue()
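Move the queue-restart code from scsi_internal_device_unblock() into the
new helper scsi_start_queue(). This patch does not change any functionality.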
Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Cc: Israel Rukshin <israelr@xxxxxxxxxxxx>
Cc: Max Gurtovoy <maxg@xxxxxxxxxxxx>
Cc: Hannes Reinecke <hare@xxxxxxx>
Cc: Benjamin Block <bblock@xxxxxxxxxxxxxxxxxx>
---
drivers/scsi/scsi_lib.c | 25 +++++++++++++++----------
drivers/scsi/scsi_priv.h | 1 +
2 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 277c8b3ae7b0..376cd1da102c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2987,6 +2987,20 @@ scsi_internal_device_block(struct scsi_device *sdev, bool wait)
}
EXPORT_SYMBOL_GPL(scsi_internal_device_block);
+void scsi_start_queue(struct scsi_device *sdev)
+{
+ struct request_queue *q = sdev->request_queue;
+ unsigned long flags;
+
+ if (q->mq_ops) {
+ blk_mq_start_stopped_hw_queues(q, false);
+ } else {
+ spin_lock_irqsave(q->queue_lock, flags);
+ blk_start_queue(q);
+ spin_unlock_irqrestore(q->queue_lock, flags);
+ }
+}
+
/**
* scsi_internal_device_unblock - resume a device after a block request
* @sdev: device to resume
@@ -3007,9 +3021,6 @@ int
scsi_internal_device_unblock(struct scsi_device *sdev,
enum scsi_device_state new_state)
{
- struct request_queue *q = sdev->request_queue;
- unsigned long flags;
-
/*
* Try to transition the scsi device to SDEV_RUNNING or one of the
* offlined states and goose the device queue if successful.
@@ -3027,13 +3038,7 @@ scsi_internal_device_unblock(struct scsi_device *sdev,
sdev->sdev_state != SDEV_OFFLINE)
return -EINVAL;
- if (q->mq_ops) {
- blk_mq_start_stopped_hw_queues(q, false);
- } else {
- spin_lock_irqsave(q->queue_lock, flags);
- blk_start_queue(q);
- spin_unlock_irqrestore(q->queue_lock, flags);
- }
+ scsi_start_queue(sdev);
return 0;
}
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index f11bd102d6d5..c7629e31a75b 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -89,6 +89,7 @@ extern void scsi_run_host_queues(struct Scsi_Host *shost);
extern void scsi_requeue_run_queue(struct work_struct *work);
extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev);
extern struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev);
+extern void scsi_start_queue(struct scsi_device *sdev);
extern int scsi_mq_setup_tags(struct Scsi_Host *shost);
extern void scsi_mq_destroy_tags(struct Scsi_Host *shost);
extern int scsi_init_queue(void);
--
2.12.2
From c383551a721d30d897d45244acd331ff0af94656 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date: Tue, 28 Mar 2017 14:00:25 -0700
Subject: [PATCH 3/3] Avoid that __scsi_remove_device() hangs
scsi_target_unblock() uses starget_for_each_device(), which in turn uses
scsi_device_get(). Because scsi_device_get() fails once unloading of the
LLD kernel module has started, scsi_target_unblock() may skip devices
that were affected by scsi_target_block(). Ensure that
__scsi_remove_device() does not hang for blocked SCSI devices. This patch
prevents unloading the ib_srp kernel module from triggering the
following hang:
Call Trace:
schedule+0x35/0x80
schedule_timeout+0x237/0x2d0
io_schedule_timeout+0xa6/0x110
wait_for_completion_io+0xa3/0x110
blk_execute_rq+0xdf/0x120
scsi_execute+0xce/0x150 [scsi_mod]
scsi_execute_req_flags+0x8f/0xf0 [scsi_mod]
sd_sync_cache+0xa9/0x190 [sd_mod]
sd_shutdown+0x6a/0x100 [sd_mod]
sd_remove+0x64/0xc0 [sd_mod]
__device_release_driver+0x8d/0x120
device_release_driver+0x1e/0x30
bus_remove_device+0xf9/0x170
device_del+0x127/0x240
__scsi_remove_device+0xc1/0xd0 [scsi_mod]
scsi_forget_host+0x57/0x60 [scsi_mod]
scsi_remove_host+0x72/0x110 [scsi_mod]
srp_remove_work+0x8b/0x200 [ib_srp]
Reported-by: Israel Rukshin <israelr@xxxxxxxxxxxx>
Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Cc: Max Gurtovoy <maxg@xxxxxxxxxxxx>
Cc: Hannes Reinecke <hare@xxxxxxx>
Cc: Benjamin Block <bblock@xxxxxxxxxxxxxxxxxx>
---
drivers/scsi/scsi_sysfs.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index f95d191ec809..e9e80241ab5e 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1309,6 +1309,15 @@ void __scsi_remove_device(struct scsi_device *sdev)
* device.
*/
scsi_device_set_state(sdev, SDEV_DEL);
+ /*
+ * Since scsi_target_unblock() is a no-op after unloading of the SCSI
+ * LLD has started, explicitly restart the queue. Do this after the
+ * device state has been changed into SDEV_DEL because
+ * scsi_prep_state_check() returns BLKPREP_KILL for the SDEV_DEL state.
+ * Do this before calling blk_cleanup_queue() so that that function
+ * does not encounter a stopped queue.
+ */
+ scsi_start_queue(sdev);
blk_cleanup_queue(sdev->request_queue);
cancel_work_sync(&sdev->requeue_work);
--
2.12.2
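
For readers who want to check the fallback in patch 1/3 outside the kernel, here is a minimal user-space sketch of the CANCEL-then-DEL logic. It is a toy model and not the kernel code. The names model_state, model_sdev, model_set_state() and model_begin_removal() are hypothetical, and the transition table is reduced to the cases relevant here. That RUNNING may move to CANCEL is an assumption, since that part of scsi_device_set_state() is outside the hunks shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few enum scsi_device_state values. */
enum model_state {
	MODEL_RUNNING,
	MODEL_BLOCK,
	MODEL_CANCEL,
	MODEL_DEL,
};

struct model_sdev {
	enum model_state state;
};

/*
 * Toy transition check modelled on scsi_device_set_state() with patch 1/3
 * applied: BLOCK -> CANCEL is rejected and BLOCK -> DEL is accepted.
 * RUNNING -> CANCEL is an assumption (not visible in the hunks above).
 * Returns 0 on success and -1 for an illegal transition.
 */
static int model_set_state(struct model_sdev *sdev, enum model_state next)
{
	enum model_state cur = sdev->state;
	bool ok = false;

	switch (next) {
	case MODEL_CANCEL:
		ok = (cur == MODEL_RUNNING);
		break;
	case MODEL_DEL:
		ok = (cur == MODEL_CANCEL || cur == MODEL_BLOCK);
		break;
	default:
		break;
	}

	if (!ok)
		return -1;
	sdev->state = next;
	return 0;
}

/*
 * Same shape as the check added to __scsi_remove_device(): try CANCEL
 * first and fall back to DEL only if that transition is refused.
 */
static int model_begin_removal(struct model_sdev *sdev)
{
	assert(sdev != NULL);

	if (model_set_state(sdev, MODEL_CANCEL) != 0 &&
	    model_set_state(sdev, MODEL_DEL) != 0)
		return -1;
	return 0;
}
```

In this model a device that starts in MODEL_BLOCK ends up in MODEL_DEL, a device that starts in MODEL_RUNNING ends up in MODEL_CANCEL, and a device already in MODEL_DEL makes model_begin_removal() fail, which corresponds to the early return in the patch.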