Re: dm + blk-mq soft lockup complaint

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/13/15 17:21, Mike Snitzer wrote:
> OK, I assume you specified the mpath device for the test that failed.

Yes, of course ...

> This test works fine on my 100MB scsi_debug device with 4 paths exported
> over virtio-blk to a guest that assembles the mpath device.
> 
> Could be a hang that is unique to scsi-mq.
> 
> Any chance you'd be willing to provide a HOWTO for setting up your
> SRP/iscsi configuration?
> 
> Are you carrying any related changes that are not upstream?  (I can hunt
> down the email in this thread where you describe your kernel tree...)
> 
> I'll try to reproduce but this info could be useful to others that are
> more scsi-mq inclined who might need to chase this too.

The four patches I had used in my tests at the initiator side and that
are not yet in v3.19-rc4 have been attached to this e-mail (I have not
yet had the time to post all of these patches for review).

This is how my I had configured the initiator system:
* If the version of the srptools package supplied by your distro is
lower than 1.0.2, build and install the latest version from the source
code available at git://git.openfabrics.org/~bvanassche/srptools.git/.git.
* Install the latest version of lsscsi
(http://sg.danny.cz/scsi/lsscsi.html). This version has SRP transport
support but is not yet in any distro AFAIK.
* Build and install a kernel >= v3.19-rc4 that includes the dm patches
at the start of this e-mail thread.
* Check whether the IB links are up (should display "State: Active"):
ibstat | grep State:
* Spread completion interrupts statically over CPU cores, e.g. via the
attached script (spread-mlx4-ib-interrupts).
* Check whether the SRP target system is visible from the SRP initiator
system - the command below should print at least one line:
ibsrpdm -c
* Enable blk-mq:
echo Y > /sys/module/scsi_mod/parameters/use_blk_mq
* Configure the SRP kernel module parameters as follows:
echo 'options ib_srp cmd_sg_entries=255 dev_loss_tmo=60 ch_count=6' >
/etc/modprobe.d/ib_srp.conf
* Unload and reload the SRP initiator kernel module to apply these
parameters:
rmmod ib_srp; modprobe ib_srp
* Start srpd and wait until SRP login has finished:
systemctl start srpd
while ! lsscsi -t | grep -q srp:; do sleep 1; done
* Start multipathd and check the table it has built:
systemctl start multipathd
dmsetup table /dev/dm-0
* Set the I/O scheduler to noop, disable add_random and set rq_affinity
to 2 for all SRP and dm block devices.
* Run the I/O load of your preference.

Please let me know if you need any further information.

Bart.
>From 664b7adce6c09b9c939b4983f7f32b7539497ef4 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche@xxxxxxx>
Date: Fri, 2 Jan 2015 14:52:07 +0100
Subject: [PATCH 1/4] e1000: Avoid that e1000_netpoll() triggers a kernel
 warning

console_cont_flush(), which is called by console_unlock(), calls
call_console_drivers() and hence also the netconsole function
write_msg() with local interrupts disabled. This means that it is
not allowed to call disable_irq() from inside a netpoll callback
function. Hence eliminate the disable_irq() / enable_irq() pair
from the e1000 netpoll function. This patch avoids that the e1000
networking driver triggers the following complaint:

BUG: sleeping function called from invalid context at kernel/irq/manage.c:104

Call Trace:
 [<ffffffff814d1ec5>] dump_stack+0x4c/0x65
 [<ffffffff8107bcc5>] ___might_sleep+0x175/0x230
 [<ffffffff8107bdba>] __might_sleep+0x3a/0xa0
 [<ffffffff810a78c8>] synchronize_irq+0x38/0xa0
 [<ffffffff810a7a20>] disable_irq+0x20/0x30
 [<ffffffffa04b4442>] e1000_netpoll+0x102/0x130 [e1000e]
 [<ffffffff813ffff2>] netpoll_poll_dev+0x72/0x350
 [<ffffffff81400489>] netpoll_send_skb_on_dev+0x1b9/0x2b0
 [<ffffffff81400842>] netpoll_send_udp+0x2c2/0x430
 [<ffffffffa058187f>] write_msg+0xcf/0x120 [netconsole]
 [<ffffffff810a4682>] call_console_drivers.constprop.25+0xc2/0x250
 [<ffffffff810a5588>] console_unlock+0x328/0x4c0
 [<ffffffff810a59f0>] vprintk_emit+0x2d0/0x570
 [<ffffffff810a5def>] vprintk_default+0x1f/0x30
 [<ffffffff814cf680>] printk+0x46/0x48

See also "[RFC PATCH net-next 00/11] net: remove disable_irq() from
->ndo_poll_controller" (http://thread.gmane.org/gmane.linux.network/342096).

See also patch "sched/wait: Add might_sleep() checks" (kernel v3.19-rc1;
commit e22b886a8a43).

Reported-by: Sabrina Dubroca <sd@xxxxxxxxxxxxxxx>
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: David S. Miller <davem@xxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
---
 drivers/net/ethernet/intel/e1000/e1000.h      |  5 +++++
 drivers/net/ethernet/intel/e1000/e1000_main.c | 27 ++++++++++++++++++++++-----
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index 6970710..d85d19f 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -323,6 +323,11 @@ struct e1000_adapter {
 	struct delayed_work watchdog_task;
 	struct delayed_work fifo_stall_task;
 	struct delayed_work phy_info_task;
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	/* Used to serialize e1000 interrupts and the e1000 netpoll callback. */
+	spinlock_t netpoll_lock;
+#endif
 };
 
 enum e1000_state_t {
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 83140cb..e5866f1 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -1313,6 +1313,9 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
 	e1000_irq_disable(adapter);
 
 	spin_lock_init(&adapter->stats_lock);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	spin_lock_init(&adapter->netpoll_lock);
+#endif
 
 	set_bit(__E1000_DOWN, &adapter->flags);
 
@@ -3747,10 +3750,8 @@ void e1000_update_stats(struct e1000_adapter *adapter)
  * @irq: interrupt number
  * @data: pointer to a network interface device structure
  **/
-static irqreturn_t e1000_intr(int irq, void *data)
+static irqreturn_t __e1000_intr(int irq, struct e1000_adapter *adapter)
 {
-	struct net_device *netdev = data;
-	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	u32 icr = er32(ICR);
 
@@ -3792,6 +3793,24 @@ static irqreturn_t e1000_intr(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr(int irq, void *data)
+{
+	struct net_device *netdev = data;
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	irqreturn_t ret;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	unsigned long flags;
+
+	spin_lock_irqsave(&adapter->netpoll_lock, flags);
+	ret = __e1000_intr(irq, adapter);
+	spin_unlock_irqrestore(&adapter->netpoll_lock, flags);
+#else
+	ret = __e1000_intr(irq, adapter);
+#endif
+
+	return ret;
+}
+
 /**
  * e1000_clean - NAPI Rx polling callback
  * @adapter: board private structure
@@ -5216,9 +5235,7 @@ static void e1000_netpoll(struct net_device *netdev)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
-	disable_irq(adapter->pdev->irq);
 	e1000_intr(adapter->pdev->irq, netdev);
-	enable_irq(adapter->pdev->irq);
 }
 #endif
 
-- 
2.1.2

>From af2c97b3882a73f9b5a098e6aca322efb341ce6d Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche@xxxxxxx>
Date: Mon, 5 Jan 2015 11:40:23 +0100
Subject: [PATCH 2/4] e1000e: Avoid that e1000_netpoll() triggers a kernel
 warning

---
 drivers/net/ethernet/intel/e1000e/e1000.h  |  5 ++
 drivers/net/ethernet/intel/e1000e/netdev.c | 73 ++++++++++++++++++++++++------
 2 files changed, 65 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/e1000.h b/drivers/net/ethernet/intel/e1000e/e1000.h
index 7785240..e89b80f 100644
--- a/drivers/net/ethernet/intel/e1000e/e1000.h
+++ b/drivers/net/ethernet/intel/e1000e/e1000.h
@@ -344,6 +344,11 @@ struct e1000_adapter {
 	struct ptp_clock_info ptp_clock_info;
 
 	u16 eee_advert;
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	/* Used to serialize e1000 interrupts and the e1000 netpoll callback. */
+	spinlock_t netpoll_lock;
+#endif
 };
 
 struct e1000_info {
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e14fd85..c6d0ffb 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1761,11 +1761,10 @@ static void e1000e_downshift_workaround(struct work_struct *work)
  * @irq: interrupt number
  * @data: pointer to a network interface device structure
  **/
-static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
+static irqreturn_t __e1000_intr_msi(struct e1000_adapter *adapter)
 {
-	struct net_device *netdev = data;
-	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	struct net_device *netdev = adapter->netdev;
 	u32 icr = er32(ICR);
 
 	/* read ICR disables interrupts using IAM */
@@ -1823,16 +1822,32 @@ static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
+{
+	struct e1000_adapter *adapter = netdev_priv(data);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&adapter->netpoll_lock, flags);
+	ret = __e1000_intr_msi(adapter);
+	spin_unlock_irqrestore(&adapter->netpoll_lock, flags);
+
+	return ret;
+#else
+	return __e1000_intr_msi(adapter);
+#endif
+}
+
 /**
  * e1000_intr - Interrupt Handler
  * @irq: interrupt number
  * @data: pointer to a network interface device structure
  **/
-static irqreturn_t e1000_intr(int __always_unused irq, void *data)
+static irqreturn_t __e1000_intr(struct e1000_adapter *adapter)
 {
-	struct net_device *netdev = data;
-	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	struct net_device *netdev = adapter->netdev;
 	u32 rctl, icr = er32(ICR);
 
 	if (!icr || test_bit(__E1000_DOWN, &adapter->state))
@@ -1903,6 +1918,23 @@ static irqreturn_t e1000_intr(int __always_unused irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr(int __always_unused irq, void *data)
+{
+	struct e1000_adapter *adapter = netdev_priv(data);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&adapter->netpoll_lock, flags);
+	ret = __e1000_intr(adapter);
+	spin_unlock_irqrestore(&adapter->netpoll_lock, flags);
+
+	return ret;
+#else
+	return __e1000_intr(adapter);
+#endif
+}
+
 static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
 {
 	struct net_device *netdev = data;
@@ -4180,6 +4212,9 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
 	adapter->rx_ring_count = E1000_DEFAULT_RXD;
 
 	spin_lock_init(&adapter->stats64_lock);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	spin_lock_init(&adapter->netpoll_lock);
+#endif
 
 	e1000e_set_interrupt_capability(adapter);
 
@@ -6437,10 +6472,9 @@ static void e1000_shutdown(struct pci_dev *pdev)
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
 
-static irqreturn_t e1000_intr_msix(int __always_unused irq, void *data)
+static irqreturn_t __e1000_intr_msix(struct e1000_adapter *adapter)
 {
-	struct net_device *netdev = data;
-	struct e1000_adapter *adapter = netdev_priv(netdev);
+	struct net_device *netdev = adapter->netdev;
 
 	if (adapter->msix_entries) {
 		int vector, msix_irq;
@@ -6467,6 +6501,23 @@ static irqreturn_t e1000_intr_msix(int __always_unused irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr_msix(int __always_unused irq, void *data)
+{
+	struct e1000_adapter *adapter = netdev_priv(data);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	int ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&adapter->netpoll_lock, flags);
+	ret = __e1000_intr_msix(adapter);
+	spin_unlock_irqrestore(&adapter->netpoll_lock, flags);
+
+	return ret;
+#else
+	return __e1000_intr_msix(adapter);
+#endif
+}
+
 /**
  * e1000_netpoll
  * @netdev: network interface device structure
@@ -6484,14 +6535,10 @@ static void e1000_netpoll(struct net_device *netdev)
 		e1000_intr_msix(adapter->pdev->irq, netdev);
 		break;
 	case E1000E_INT_MODE_MSI:
-		disable_irq(adapter->pdev->irq);
 		e1000_intr_msi(adapter->pdev->irq, netdev);
-		enable_irq(adapter->pdev->irq);
 		break;
 	default:		/* E1000E_INT_MODE_LEGACY */
-		disable_irq(adapter->pdev->irq);
 		e1000_intr(adapter->pdev->irq, netdev);
-		enable_irq(adapter->pdev->irq);
 		break;
 	}
 }
-- 
2.1.2

>From 54e10ead0a0e98bdf39a631c65c0a585211ffa22 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche@xxxxxxx>
Date: Mon, 5 Jan 2015 10:51:13 +0100
Subject: [PATCH 3/4] Avoid that sd_shutdown() triggers a kernel warning

Since kernel v3.19-rc1 module_refcount() returns 1 instead of 0
when called from inside module_exit(). This breaks the
module_refcount() test in scsi_device_put() and hence causes the
following kernel warning to be reported when unloading the ib_srp
kernel module:

WARNING: CPU: 5 PID: 228 at kernel/module.c:954 module_put+0x207/0x220()

Call Trace:
 [<ffffffff814d1fcf>] dump_stack+0x4c/0x65
 [<ffffffff81053ada>] warn_slowpath_common+0x8a/0xc0
 [<ffffffff81053bca>] warn_slowpath_null+0x1a/0x20
 [<ffffffff810d0507>] module_put+0x207/0x220
 [<ffffffffa000bea8>] scsi_device_put+0x48/0x50 [scsi_mod]
 [<ffffffffa03676d2>] scsi_disk_put+0x32/0x50 [sd_mod]
 [<ffffffffa0368d4c>] sd_shutdown+0x8c/0x150 [sd_mod]
 [<ffffffffa0368e79>] sd_remove+0x69/0xc0 [sd_mod]
 [<ffffffff813457ef>] __device_release_driver+0x7f/0xf0
 [<ffffffff81345885>] device_release_driver+0x25/0x40
 [<ffffffff81345134>] bus_remove_device+0x124/0x1b0
 [<ffffffff8134189e>] device_del+0x13e/0x250
 [<ffffffffa001cdcd>] __scsi_remove_device+0xcd/0xe0 [scsi_mod]
 [<ffffffffa001b39f>] scsi_forget_host+0x6f/0x80 [scsi_mod]
 [<ffffffffa000d5f6>] scsi_remove_host+0x86/0x140 [scsi_mod]
 [<ffffffffa07d5c0b>] srp_remove_work+0x9b/0x210 [ib_srp]
 [<ffffffff8106fd28>] process_one_work+0x1d8/0x780
 [<ffffffff810703eb>] worker_thread+0x11b/0x4a0
 [<ffffffff81075a6f>] kthread+0xef/0x110
 [<ffffffff814dad6c>] ret_from_fork+0x7c/0xb0

See also patch "module: Remove stop_machine from module unloading"
(Masami Hiramatsu; commit e513cc1c07e2; kernel v3.19-rc1).

Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Hannes Reinecke <hare@xxxxxxx>
---
 drivers/scsi/scsi.c        | 63 ++++++++++++++++++++++++++++++++--------------
 drivers/scsi/sd.c          | 44 +++++++++++++++++---------------
 include/scsi/scsi_device.h |  2 ++
 3 files changed, 70 insertions(+), 39 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index e028854..2cae46b 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -973,30 +973,63 @@ int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer,
 EXPORT_SYMBOL(scsi_report_opcode);
 
 /**
- * scsi_device_get  -  get an additional reference to a scsi_device
+ * scsi_dev_get - get an additional reference to a scsi_device
  * @sdev:	device to get a reference to
+ * @get_lld:    whether or not to increase the LLD kernel module refcount
  *
- * Description: Gets a reference to the scsi_device and increments the use count
- * of the underlying LLDD module.  You must hold host_lock of the
- * parent Scsi_Host or already have a reference when calling this.
+ * Description: Gets a reference to the scsi_device and optionally increments
+ * the use count of the associated LLDD module. You must hold host_lock of
+ * the parent Scsi_Host or already have a reference when calling this.
  */
-int scsi_device_get(struct scsi_device *sdev)
+int scsi_dev_get(struct scsi_device *sdev, bool get_lld)
 {
 	if (sdev->sdev_state == SDEV_DEL)
 		return -ENXIO;
 	if (!get_device(&sdev->sdev_gendev))
 		return -ENXIO;
-	/* We can fail this if we're doing SCSI operations
-	 * from module exit (like cache flush) */
-	try_module_get(sdev->host->hostt->module);
+	/* Can fail if invoked during module exit (like cache flush) */
+	if (get_lld && !try_module_get(sdev->host->hostt->module)) {
+		put_device(&sdev->sdev_gendev);
+		return -ENXIO;
+	}
 
 	return 0;
 }
+EXPORT_SYMBOL(scsi_dev_get);
+
+/**
+ * scsi_dev_put - release a reference to a scsi_device
+ * @sdev:	device to release a reference on
+ * @put_lld:    whether or not to decrease the LLD kernel module refcount
+ *
+ * Description: Release a reference to the scsi_device. The device is freed
+ * once the last user vanishes.
+ */
+void scsi_dev_put(struct scsi_device *sdev, bool put_lld)
+{
+	if (put_lld)
+		module_put(sdev->host->hostt->module);
+	put_device(&sdev->sdev_gendev);
+}
+EXPORT_SYMBOL(scsi_dev_put);
+
+/**
+ * scsi_device_get - get an additional reference to a scsi_device
+ * @sdev:	device to get a reference to
+ *
+ * Description: Gets a reference to the scsi_device and increments the use count
+ * of the underlying LLDD module.  You must hold host_lock of the
+ * parent Scsi_Host or already have a reference when calling this.
+ */
+int scsi_device_get(struct scsi_device *sdev)
+{
+	return scsi_dev_get(sdev, true);
+}
 EXPORT_SYMBOL(scsi_device_get);
 
 /**
- * scsi_device_put  -  release a reference to a scsi_device
- * @sdev:	device to release a reference on.
+ * scsi_device_put - release a reference to a scsi_device
+ * @sdev:	device to release a reference on
  *
  * Description: Release a reference to the scsi_device and decrements the use
  * count of the underlying LLDD module.  The device is freed once the last
@@ -1004,15 +1037,7 @@ EXPORT_SYMBOL(scsi_device_get);
  */
 void scsi_device_put(struct scsi_device *sdev)
 {
-#ifdef CONFIG_MODULE_UNLOAD
-	struct module *module = sdev->host->hostt->module;
-
-	/* The module refcount will be zero if scsi_device_get()
-	 * was called from a module removal routine */
-	if (module && module_refcount(module) != 0)
-		module_put(module);
-#endif
-	put_device(&sdev->sdev_gendev);
+	scsi_dev_put(sdev, true);
 }
 EXPORT_SYMBOL(scsi_device_put);
 
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3995169..bd641a8 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -564,13 +564,13 @@ static int sd_major(int major_idx)
 	}
 }
 
-static struct scsi_disk *__scsi_disk_get(struct gendisk *disk)
+static struct scsi_disk *__scsi_disk_get(struct gendisk *disk, bool get_lld)
 {
 	struct scsi_disk *sdkp = NULL;
 
 	if (disk->private_data) {
 		sdkp = scsi_disk(disk);
-		if (scsi_device_get(sdkp->device) == 0)
+		if (scsi_dev_get(sdkp->device, get_lld) == 0)
 			get_device(&sdkp->dev);
 		else
 			sdkp = NULL;
@@ -578,35 +578,36 @@ static struct scsi_disk *__scsi_disk_get(struct gendisk *disk)
 	return sdkp;
 }
 
-static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
+static struct scsi_disk *scsi_disk_get(struct gendisk *disk, bool get_lld)
 {
 	struct scsi_disk *sdkp;
 
 	mutex_lock(&sd_ref_mutex);
-	sdkp = __scsi_disk_get(disk);
+	sdkp = __scsi_disk_get(disk, get_lld);
 	mutex_unlock(&sd_ref_mutex);
 	return sdkp;
 }
 
-static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev)
+static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev,
+						bool get_lld)
 {
 	struct scsi_disk *sdkp;
 
 	mutex_lock(&sd_ref_mutex);
 	sdkp = dev_get_drvdata(dev);
 	if (sdkp)
-		sdkp = __scsi_disk_get(sdkp->disk);
+		sdkp = __scsi_disk_get(sdkp->disk, get_lld);
 	mutex_unlock(&sd_ref_mutex);
 	return sdkp;
 }
 
-static void scsi_disk_put(struct scsi_disk *sdkp)
+static void scsi_disk_put(struct scsi_disk *sdkp, bool put_lld)
 {
 	struct scsi_device *sdev = sdkp->device;
 
 	mutex_lock(&sd_ref_mutex);
 	put_device(&sdkp->dev);
-	scsi_device_put(sdev);
+	scsi_dev_put(sdev, put_lld);
 	mutex_unlock(&sd_ref_mutex);
 }
 
@@ -1184,7 +1185,7 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
  **/
 static int sd_open(struct block_device *bdev, fmode_t mode)
 {
-	struct scsi_disk *sdkp = scsi_disk_get(bdev->bd_disk);
+	struct scsi_disk *sdkp = scsi_disk_get(bdev->bd_disk, true);
 	struct scsi_device *sdev;
 	int retval;
 
@@ -1239,7 +1240,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
 	return 0;
 
 error_out:
-	scsi_disk_put(sdkp);
+	scsi_disk_put(sdkp, true);
 	return retval;	
 }
 
@@ -1273,7 +1274,7 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
 	 * XXX is followed by a "rmmod sd_mod"?
 	 */
 
-	scsi_disk_put(sdkp);
+	scsi_disk_put(sdkp, true);
 }
 
 static int sd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
@@ -1525,11 +1526,11 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
 
 static void sd_rescan(struct device *dev)
 {
-	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, true);
 
 	if (sdkp) {
 		revalidate_disk(sdkp->disk);
-		scsi_disk_put(sdkp);
+		scsi_disk_put(sdkp, true);
 	}
 }
 
@@ -3143,11 +3144,14 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
 /*
  * Send a SYNCHRONIZE CACHE instruction down to the device through
  * the normal SCSI command structure.  Wait for the command to
- * complete.
+ * complete.  Since this function can be called during SCSI LLD kernel
+ * module unload and since try_module_get() fails after kernel module
+ * unload has started this function must not try to increase the SCSI
+ * LLD kernel module refcount.
  */
 static void sd_shutdown(struct device *dev)
 {
-	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, false);
 
 	if (!sdkp)
 		return;         /* this can happen */
@@ -3166,12 +3170,12 @@ static void sd_shutdown(struct device *dev)
 	}
 
 exit:
-	scsi_disk_put(sdkp);
+	scsi_disk_put(sdkp, false);
 }
 
 static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
 {
-	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, true);
 	int ret = 0;
 
 	if (!sdkp)
@@ -3197,7 +3201,7 @@ static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
 	}
 
 done:
-	scsi_disk_put(sdkp);
+	scsi_disk_put(sdkp, true);
 	return ret;
 }
 
@@ -3213,7 +3217,7 @@ static int sd_suspend_runtime(struct device *dev)
 
 static int sd_resume(struct device *dev)
 {
-	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, true);
 	int ret = 0;
 
 	if (!sdkp->device->manage_start_stop)
@@ -3223,7 +3227,7 @@ static int sd_resume(struct device *dev)
 	ret = sd_start_stop_device(sdkp, 1);
 
 done:
-	scsi_disk_put(sdkp);
+	scsi_disk_put(sdkp, true);
 	return ret;
 }
 
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 3a4edd1..a4cb852 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -330,6 +330,8 @@ extern void scsi_remove_device(struct scsi_device *);
 extern int scsi_unregister_device_handler(struct scsi_device_handler *scsi_dh);
 void scsi_attach_vpd(struct scsi_device *sdev);
 
+extern int scsi_dev_get(struct scsi_device *, bool get_lld);
+extern void scsi_dev_put(struct scsi_device *, bool put_lld);
 extern int scsi_device_get(struct scsi_device *);
 extern void scsi_device_put(struct scsi_device *);
 extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
-- 
2.1.2

>From 6f593a0e9fcfd9b6c99fd24ac981450ed6eb0a0f Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche@xxxxxxx>
Date: Thu, 8 Jan 2015 09:42:45 +0100
Subject: [PATCH 4/4] IB/srp: Process REQ_PREEMPT requests correctly

Reported-by: Max Gurtuvoy <maxg@xxxxxxxxxxxx>
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 0747c05..77a7a2f 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2003,8 +2003,13 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	if (in_scsi_eh)
 		mutex_lock(&rport->mutex);
 
+	/*
+	 * The "blocked" state of SCSI devices is ignored by the SCSI core for
+	 * REQ_PREEMPT requests. Hence the explicit check below for the SCSI
+	 * device state.
+	 */
 	scmnd->result = srp_chkready(target->rport);
-	if (unlikely(scmnd->result))
+	if (unlikely(scmnd->result != 0 || scsi_device_blocked(scmnd->device)))
 		goto err;
 
 	WARN_ON_ONCE(scmnd->request->tag < 0);
-- 
2.1.2

#!/bin/awk -f

BEGIN {
  "ls -1d /sys/devices/system/node/node* 2>&1 | wc -l" | getline nodes
  if (nodes > 1) {
    for (i = 0; i < nodes; i++) {
      cpus_per_node = 0
      while (("cd /sys/devices/system/cpu && ls -d cpu*/node" i " | sed 's/^cpu//;s,/.*,,'|sort -n" | getline j) > 0) {
        #print "[" i ", " cpus_per_node "]: " j
        cpu[i, cpus_per_node++] = j
      }
    }
  } else {
      cpus_per_node = 0
      while (("cd /sys/devices/system/cpu && ls -d cpu[0-9]* | sed 's/^cpu//'|sort -n" | getline j) > 0) {
        #print "[0, " cpus_per_node "]: " j
        cpu[0, cpus_per_node++] = j
      }
  }
  for (i = 0; i < nodes; i++)
      nextcpu[i] = 0
  while (("sed -n 's/.*mlx4-ib-\\([0-9]*\\)-[0-9]*@\\(.*\\)$/\\1 \\2/p' /proc/interrupts | uniq" | getline) > 0) {
    port = $1
    bus = substr($0, length($1) + 2)
    #print "port = " port "; bus = " bus
    irqcount = 0
    while (("sed -n 's/^[[:blank:]]*\\([0-9]*\\):[0-9[:blank:]]*[^[:blank:]]*[[:blank:]]*\\(mlx4-ib-" port "-[0-9]*@" bus "\\)$/\\1 \\2/p' </proc/interrupts" | getline) > 0) {
      irq[irqcount] = $1
      irqname[irqcount] = substr($0, length($1) + 2)
      irqcount++
    }
    for (i = 0; i < nodes; i++) {
      ch_start = i * irqcount / nodes
      ch_end = (i + 1) * irqcount / nodes
      for (ch = ch_start; ch < ch_end; ch++) {
        c = cpu[i, nextcpu[i]++ % cpus_per_node]
        if (nodes > 1)
            nodetxt =  " (node " i ")"
        else
            nodetxt = ""
        print "IRQ " irq[ch] " (" irqname[ch] "): CPU " c nodetxt
        cmd="echo " c " >/proc/irq/" irq[ch] "/smp_affinity_list"
	#print cmd
	system(cmd)
      }
    }
  }
  exit 0
}

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux