Re: [PATCH 8/9] dm: Fix two race conditions related to stopping and starting queues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/01/2016 03:27 PM, Mike Snitzer wrote:
On Thu, Sep 01 2016 at  6:22pm -0400,
Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:

On 09/01/2016 03:18 PM, Mike Snitzer wrote:
FYI I get the same 'dmsetup suspend --nolockfs --noflush mp' hang,
running mptest's test_02_sdev_delete, when I try your unmodified
patchset, see:

http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=devel.bart

Hello Mike,

Are you aware that the code on that branch is a *modified* version
of my patch series? The following patch is not present on that
branch: "dm path selector: Avoid that device removal triggers an
infinite loop". There are also other (smaller) differences.

No, you're obviously talking about the 'devel' branch and not the
'devel.bart' branch I pointed to.  The 'devel.bart' branch is the
_exact_ patchset you sent.  It has the same problem as the 'devel'
branch.

Hello Mike,

Sorry that I misread your previous e-mail. After I received your latest e-mail I rebased my tree on top of the devel.bart branch mentioned above. My tests still pass. The only two patches in my tree that are relevant and that are not in the devel.bart branch have been attached to this e-mail. Did your test involve the sd driver? If so, do the attached two patches help? If the sd driver was not involved, can you provide more information about the hang you ran into? The output and log messages generated by the following commands after the hang has been reproduced would be very welcome:
* echo w > /proc/sysrq-trigger
* (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list})

Thanks,

Bart.
>From 872971ee76d299e9c002b57b5ae6f12a4bd4ef92 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date: Thu, 30 Jun 2016 11:53:38 +0200
Subject: [PATCH 1/2] block, dm-mpath: Introduce request flag
 REQ_FAIL_IF_NO_PATH

Let dm-mpath fail requests if queue_if_no_path is enabled and
request flag REQ_FAIL_IF_NO_PATH has been set. This facility will
be used in the next patch to fix a deadlock between SCSI device
removal and sd event checking.

Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
---
 drivers/md/dm-mpath.c     | 6 ++++--
 include/linux/blk_types.h | 4 ++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 8df7b12..58ce8bc 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -540,6 +540,7 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
 	struct multipath *m = ti->private;
 	int r = DM_MAPIO_REQUEUE;
 	size_t nr_bytes = clone ? blk_rq_bytes(clone) : blk_rq_bytes(rq);
+	bool fail_if_no_path = (clone ? : rq)->cmd_flags & REQ_FAIL_IF_NO_PATH;
 	struct pgpath *pgpath;
 	struct block_device *bdev;
 	struct dm_mpath_io *mpio;
@@ -550,7 +551,7 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
 		pgpath = choose_pgpath(m, nr_bytes);
 
 	if (!pgpath) {
-		if (!must_push_back_rq(m))
+		if (fail_if_no_path || !must_push_back_rq(m))
 			r = -EIO;	/* Failed */
 		return r;
 	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
@@ -1558,6 +1559,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
 	 * clone bios for it and resubmit it later.
 	 */
 	int r = DM_ENDIO_REQUEUE;
+	bool fail_if_no_path = clone->cmd_flags & REQ_FAIL_IF_NO_PATH;
 
 	if (!error && !clone->errors)
 		return 0;	/* I/O complete */
@@ -1570,7 +1572,7 @@ static int do_end_io(struct multipath *m, struct request *clone,
 
 	if (!atomic_read(&m->nr_valid_paths)) {
 		if (!test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) {
-			if (!must_push_back_rq(m))
+			if (fail_if_no_path || !must_push_back_rq(m))
 				r = -EIO;
 		} else {
 			if (error == -EBADE)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 436f43f..954e687 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -182,6 +182,9 @@ enum rq_flag_bits {
 	__REQ_PM,		/* runtime pm request */
 	__REQ_HASHED,		/* on IO scheduler merge hash */
 	__REQ_MQ_INFLIGHT,	/* track inflight for MQ */
+
+	__REQ_FAIL_IF_NO_PATH,	/* fail if queued on dm-mpath with no paths */
+
 	__REQ_NR_BITS,		/* stops here */
 };
 
@@ -228,6 +231,7 @@ enum rq_flag_bits {
 #define REQ_PM			(1ULL << __REQ_PM)
 #define REQ_HASHED		(1ULL << __REQ_HASHED)
 #define REQ_MQ_INFLIGHT		(1ULL << __REQ_MQ_INFLIGHT)
+#define REQ_FAIL_IF_NO_PATH	(1ULL << __REQ_FAIL_IF_NO_PATH)
 
 enum req_op {
 	REQ_OP_READ,
-- 
2.9.3

>From 60acb292b29fc7eb40298e8d6dcc409315154340 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date: Thu, 30 Jun 2016 11:54:06 +0200
Subject: [PATCH 2/2] sd: Fix a deadlock

If sd_check_events() is called for an sd device on top of a
multipath device without paths then the scsi_test_unit_ready()
call from that function will block until one or more paths
have been added to the multipath device. However, if
scsi_remove_host() is called before a path has been added then
a deadlock is triggered. Fix this deadlock by making the
scsi_test_unit_ready() call from sd_check_events() fail if
there are no paths. SysRq-w reported the following call stacks:

sysrq: SysRq : Show Blocked State
  task                        PC stack   pid father
kworker/6:1     D ffff88046346b958     0    95      2 0x00000000
Workqueue: events_freezable_power_ disk_events_workfn
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815b1487>] schedule_timeout+0x157/0x230
 [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110
 [<ffffffff815adba9>] wait_for_completion_io_timeout+0xd9/0x110
 [<ffffffff812aa9f8>] blk_execute_rq+0xa8/0x130
 [<ffffffff81408411>] scsi_execute+0xf1/0x160
 [<ffffffff8140a484>] scsi_execute_req_flags+0x84/0xf0
 [<ffffffff8140aabd>] scsi_test_unit_ready+0x7d/0x130
 [<ffffffff81416fe6>] sd_check_events+0x116/0x1a0
 [<ffffffff812b477f>] disk_check_events+0x4f/0x130
 [<ffffffff812b4877>] disk_events_workfn+0x17/0x20
 [<ffffffff810737ed>] process_one_work+0x19d/0x490
 [<ffffffff81073b29>] worker_thread+0x49/0x490
 [<ffffffff8107a0ea>] kthread+0xea/0x100
 [<ffffffff815b26ef>] ret_from_fork+0x1f/0x40
kworker/2:16    D ffff88006d9a3af0     0  2575      2 0x00000000
Workqueue: srp_remove srp_remove_work [ib_srp]
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815ad300>] schedule_preempt_disabled+0x10/0x20
 [<ffffffff815af0b5>] mutex_lock_nested+0x145/0x360
 [<ffffffff812b61ce>] disk_block_events+0x2e/0x80
 [<ffffffff812b62e5>] del_gendisk+0x35/0x1d0
 [<ffffffff81416ae6>] sd_remove+0x56/0xc0
 [<ffffffff813e0812>] __device_release_driver+0x82/0x130
 [<ffffffff813e08e0>] device_release_driver+0x20/0x30
 [<ffffffff813dff33>] bus_remove_device+0x113/0x190
 [<ffffffff813dc5bc>] device_del+0x12c/0x230
 [<ffffffff814113a8>] __scsi_remove_device+0xf8/0x130
 [<ffffffff8140f5d4>] scsi_forget_host+0x64/0x70
 [<ffffffff814033c5>] scsi_remove_host+0x75/0x120
 [<ffffffffa05b5e5b>] srp_remove_work+0x8b/0x1f0 [ib_srp]
 [<ffffffff810737ed>] process_one_work+0x19d/0x490
 [<ffffffff81073b29>] worker_thread+0x49/0x490
 [<ffffffff8107a0ea>] kthread+0xea/0x100
 [<ffffffff815b26ef>] ret_from_fork+0x1f/0x40
multipathd      D ffff88046d91fb40     0  2604      1 0x00000000
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815ad300>] schedule_preempt_disabled+0x10/0x20
 [<ffffffff815af0b5>] mutex_lock_nested+0x145/0x360
 [<ffffffff812b61ce>] disk_block_events+0x2e/0x80
 [<ffffffff811d8953>] __blkdev_get+0x53/0x480
 [<ffffffff811d8ebb>] blkdev_get+0x13b/0x3a0
 [<ffffffff811d9256>] blkdev_open+0x56/0x70
 [<ffffffff81199a46>] do_dentry_open.isra.17+0x146/0x2d0
 [<ffffffff8119ad58>] vfs_open+0x48/0x70
 [<ffffffff811ab213>] path_openat+0x163/0xa10
 [<ffffffff811aca29>] do_filp_open+0x79/0xd0
 [<ffffffff8119b0a6>] do_sys_open+0x116/0x1f0
 [<ffffffff8119b199>] SyS_open+0x19/0x20
 [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
systemd-udevd   D ffff880419273968     0 10074    475 0x00000004
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230
 [<ffffffff815ae261>] wait_for_completion+0xd1/0x110
 [<ffffffff81073036>] flush_work+0x1d6/0x2a0
 [<ffffffff8107339b>] __cancel_work_timer+0xfb/0x1b0
 [<ffffffff8107346e>] cancel_delayed_work_sync+0xe/0x10
 [<ffffffff812b621e>] disk_block_events+0x7e/0x80
 [<ffffffff811d8953>] __blkdev_get+0x53/0x480
 [<ffffffff811d8ebb>] blkdev_get+0x13b/0x3a0
 [<ffffffff811d9256>] blkdev_open+0x56/0x70
 [<ffffffff81199a46>] do_dentry_open.isra.17+0x146/0x2d0
 [<ffffffff8119ad58>] vfs_open+0x48/0x70
 [<ffffffff811ab213>] path_openat+0x163/0xa10
 [<ffffffff811aca29>] do_filp_open+0x79/0xd0
 [<ffffffff8119b0a6>] do_sys_open+0x116/0x1f0
 [<ffffffff8119b199>] SyS_open+0x19/0x20
 [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
mount           D ffff8803b42b7be8     0 13567  13442 0x00000000
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230
 [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110
 [<ffffffff815ad786>] bit_wait_io+0x16/0x60
 [<ffffffff815ad408>] __wait_on_bit+0x58/0x90
 [<ffffffff8111e30f>] wait_on_page_bit_killable+0xbf/0xd0
 [<ffffffff8111e450>] generic_file_read_iter+0x130/0x710
 [<ffffffff811d7970>] blkdev_read_iter+0x30/0x40
 [<ffffffff8119c0e9>] __vfs_read+0xb9/0x120
 [<ffffffff8119c4c0>] vfs_read+0x90/0x130
 [<ffffffff8119d884>] SyS_read+0x44/0xa0
 [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac

Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
---
 drivers/scsi/scsi_lib.c    | 17 +++++++++++++----
 drivers/scsi/sd.c          |  6 ++++--
 include/scsi/scsi_device.h |  2 ++
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index dd4766a..6b172f9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2401,13 +2401,14 @@ EXPORT_SYMBOL(scsi_mode_sense);
  *	@sshdr_external: Optional pointer to struct scsi_sense_hdr for
  *		returning sense. Make sure that this is cleared before passing
  *		in.
+ *	@flags: or-ed into cmd_flags.
  *
  *	Returns zero if unsuccessful or an error if TUR failed.  For
  *	removable media, UNIT_ATTENTION sets ->changed flag.
  **/
 int
-scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries,
-		     struct scsi_sense_hdr *sshdr_external)
+scsi_test_unit_ready_flags(struct scsi_device *sdev, int timeout, int retries,
+			   struct scsi_sense_hdr *sshdr_external, u64 flags)
 {
 	char cmd[] = {
 		TEST_UNIT_READY, 0, 0, 0, 0, 0,
@@ -2422,8 +2423,8 @@ scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries,
 
 	/* try to eat the UNIT_ATTENTION if there are enough retries */
 	do {
-		result = scsi_execute_req(sdev, cmd, DMA_NONE, NULL, 0, sshdr,
-					  timeout, retries, NULL);
+		result = scsi_execute_req_flags(sdev, cmd, DMA_NONE, NULL, 0,
+					sshdr, timeout, retries, NULL, flags);
 		if (sdev->removable && scsi_sense_valid(sshdr) &&
 		    sshdr->sense_key == UNIT_ATTENTION)
 			sdev->changed = 1;
@@ -2434,6 +2435,14 @@ scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries,
 		kfree(sshdr);
 	return result;
 }
+EXPORT_SYMBOL(scsi_test_unit_ready_flags);
+
+int scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries,
+			 struct scsi_sense_hdr *sshdr_external)
+{
+	return scsi_test_unit_ready_flags(sdev, timeout, retries,
+					  sshdr_external, 0);
+}
 EXPORT_SYMBOL(scsi_test_unit_ready);
 
 /**
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..85d864b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1439,8 +1439,10 @@ static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
 
 	if (scsi_block_when_processing_errors(sdp)) {
 		sshdr  = kzalloc(sizeof(*sshdr), GFP_KERNEL);
-		retval = scsi_test_unit_ready(sdp, SD_TIMEOUT, SD_MAX_RETRIES,
-					      sshdr);
+		retval = scsi_test_unit_ready_flags(sdp, SD_TIMEOUT,
+				SD_MAX_RETRIES, sshdr, REQ_FAIL_IF_NO_PATH);
+		if (retval)
+			pr_info("%s: TUR %#x\n", __func__, retval);
 	}
 
 	/* failed to execute TUR, assume media not present */
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 8a95631..840030a 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -379,6 +379,8 @@ extern int scsi_mode_select(struct scsi_device *sdev, int pf, int sp,
 			    int timeout, int retries,
 			    struct scsi_mode_data *data,
 			    struct scsi_sense_hdr *);
+extern int scsi_test_unit_ready_flags(struct scsi_device *sdev, int timeout,
+			int retries, struct scsi_sense_hdr *sshdr, u64 flags);
 extern int scsi_test_unit_ready(struct scsi_device *sdev, int timeout,
 				int retries, struct scsi_sense_hdr *sshdr);
 extern int scsi_get_vpd_page(struct scsi_device *, u8 page, unsigned char *buf,
-- 
2.9.3

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel

[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux