Re: [PATCH, RESEND] Avoid that SCSI device removal through sysfs triggers a deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2018-07-26 at 07:14 -0700, tj+AEA-kernel.org wrote:
+AD4- Hello,
+AD4- 
+AD4- On Thu, Jul 26, 2018 at 02:09:41PM +-0000, Bart Van Assche wrote:
+AD4- +AD4- On Thu, 2018-07-26 at 06:35 -0700, Tejun Heo wrote:
+AD4- +AD4- +AD4- Making removal asynchronous this way sometimes causes issues because
+AD4- +AD4- +AD4- whether the user sees the device released or not is racy.
+AD4- +AD4- +AD4- kernfs/sysfs have mechanisms to deal with these cases - remove+AF8-self
+AD4- +AD4- +AD4- and kernfs+AF8-break+AF8-active+AF8-protection().  Have you looked at those?
+AD4- +AD4- 
+AD4- +AD4- Hello Tejun,
+AD4- +AD4- 
+AD4- +AD4- The call stack in the patch description shows that sdev+AF8-store+AF8-delete() is
+AD4- +AD4- involved in the deadlock. The implementation of that function is as follows:
+AD4- +AD4- 
+AD4- +AD4- static ssize+AF8-t
+AD4- +AD4- sdev+AF8-store+AF8-delete(struct device +ACo-dev, struct device+AF8-attribute +ACo-attr,
+AD4- +AD4- 		  const char +ACo-buf, size+AF8-t count)
+AD4- +AD4- +AHs-
+AD4- +AD4- 	if (device+AF8-remove+AF8-file+AF8-self(dev, attr))
+AD4- +AD4- 		scsi+AF8-remove+AF8-device(to+AF8-scsi+AF8-device(dev))+ADs-
+AD4- +AD4- 	return count+ADs-
+AD4- +AD4- +AH0AOw-
+AD4- +AD4- 
+AD4- +AD4- device+AF8-remove+AF8-file+AF8-self() calls sysfs+AF8-remove+AF8-file+AF8-self() and that last
+AD4- +AD4- function calls kernfs+AF8-remove+AF8-self(). In other words, kernfs+AF8-remove+AF8-self()
+AD4- +AD4- is already being used. Please let me know if I misunderstood your comment.
+AD4- 
+AD4- So, here, because scsi+AF8-remove+AF8-device() is the one involved in the
+AD4- circular dependency, just breaking the dependency chain on the file
+AD4- itself (self removal) isn't enough.  You can wrap the whole operation
+AD4- with kernfs+AF8-break+AF8-active+AF8-protection() to also move
+AD4- scsi+AF8-remove+AF8-device() invocation outside the kernfs synchronization.
+AD4- This will need to be piped through sysfs but shouldn't be too complex.

Hello Tejun,

I have tried to implement the above but my implementation triggered a kernel
warning. Can you have a look at the attached patches and see what I did wrong?

Thanks,

Bart.

The kernel warning I ran into is as follows:

kernfs+AF8-put: 6:0:0:0/delete: released with incorrect active+AF8-ref 2147483647
WARNING: CPU: 6 PID: 1014 at fs/kernfs/dir.c:527 kernfs+AF8-put+-0x136/0x180
Hardware name: QEMU Standard PC (Q35 +- ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:kernfs+AF8-put+-0x136/0x180
Call Trace:
 evict+-0xc1/0x190
 +AF8AXw-dentry+AF8-kill+-0xc4/0x150
 shrink+AF8-dentry+AF8-list+-0x9e/0x1c0
 shrink+AF8-dcache+AF8-parent+-0x63/0x70
 d+AF8-invalidate+-0x42/0xb0
 lookup+AF8-fast+-0x278/0x2a0
 walk+AF8-component+-0x38/0x450
 link+AF8-path+AF8-walk+-0x2a8/0x4f0
 path+AF8-openat+-0x89/0x13a0
 do+AF8-filp+AF8-open+-0x8c/0xf0
 do+AF8-sys+AF8-open+-0x1a6/0x230
 do+AF8-syscall+AF8-64+-0x4f/0x110
 entry+AF8-SYSCALL+AF8-64+AF8-after+AF8-hwframe+-0x44/0xa9
From f280e1cb31eda63c47db02574dfdfaee8a3f1dbc Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxx>
Date: Thu, 26 Jul 2018 09:23:08 -0700
Subject: [PATCH 1/3] fs/sysfs: Introduce sysfs_remove_file_self_w_cb()

This patch makes it possible to execute more code than only the
__kernfs_remove() call while the active protection is dropped.

Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>
---
 fs/sysfs/file.c       | 17 ++++++++++++++++-
 include/linux/sysfs.h | 16 ++++++++++++++++
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 5c13f29bfcdb..db9386070571 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -429,7 +429,12 @@ EXPORT_SYMBOL_GPL(sysfs_remove_file_ns);
  *
  * See kernfs_remove_self() for details.
  */
-bool sysfs_remove_file_self(struct kobject *kobj, const struct attribute *attr)
+bool sysfs_remove_file_self_w_cb(struct kobject *kobj,
+				 const struct attribute *attr,
+				 void (*cb)(struct kobject *kobj,
+					    const struct attribute *attr,
+					    void *data),
+				 void *data)
 {
 	struct kernfs_node *parent = kobj->sd;
 	struct kernfs_node *kn;
@@ -440,11 +445,21 @@ bool sysfs_remove_file_self(struct kobject *kobj, const struct attribute *attr)
 		return false;
 
 	ret = kernfs_remove_self(kn);
+	if (ret && cb) {
+		kernfs_break_active_protection(kn);
+		cb(kobj, attr, data);
+		kernfs_break_active_protection(kn);
+	}
 
 	kernfs_put(kn);
 	return ret;
 }
 
+bool sysfs_remove_file_self(struct kobject *kobj, const struct attribute *attr)
+{
+	return sysfs_remove_file_self_w_cb(kobj, attr, NULL, NULL);
+}
+
 void sysfs_remove_files(struct kobject *kobj, const struct attribute **ptr)
 {
 	int i;
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index b8bfdc173ec0..3c954d677736 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -239,6 +239,12 @@ int __must_check sysfs_chmod_file(struct kobject *kobj,
 				  const struct attribute *attr, umode_t mode);
 void sysfs_remove_file_ns(struct kobject *kobj, const struct attribute *attr,
 			  const void *ns);
+bool sysfs_remove_file_self_w_cb(struct kobject *kobj,
+				 const struct attribute *attr,
+				 void (*cb)(struct kobject *kobj,
+					    const struct attribute *attr,
+					    void *),
+				 void *data);
 bool sysfs_remove_file_self(struct kobject *kobj, const struct attribute *attr);
 void sysfs_remove_files(struct kobject *kobj, const struct attribute **attr);
 
@@ -356,6 +362,16 @@ static inline void sysfs_remove_file_ns(struct kobject *kobj,
 {
 }
 
+static inline bool sysfs_remove_file_self_w_cb(struct kobject *kobj,
+				const struct attribute *attr,
+				void (*cb)(struct kobject *kobj,
+					   const struct attribute *attr,
+					   void *),
+				void *data)
+{
+	return false;
+}
+
 static inline bool sysfs_remove_file_self(struct kobject *kobj,
 					  const struct attribute *attr)
 {
-- 
2.18.0

From 0a015f917189e633d027051b004f194950141747 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxx>
Date: Thu, 26 Jul 2018 09:09:29 -0700
Subject: [PATCH 2/3] drivers/base: Introduce device_remove_file_self_w_cb()

This patch makes it possible to execute more code than only the
__kernfs_remove() call while the active protection is dropped.

Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>
---
 drivers/base/core.c    | 41 +++++++++++++++++++++++++++++++++++++++++
 include/linux/device.h |  6 ++++++
 2 files changed, 47 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 36622b52e419..5458efc2e087 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1341,6 +1341,47 @@ void device_remove_file(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(device_remove_file);
 
+struct drfs_data {
+	void (*cb)(struct device *, const struct device_attribute *,
+		   void *data);
+	void *data;
+};
+
+static void drfs_cb(struct kobject *kobj, const struct attribute *attr,
+		   void *data)
+{
+	struct drfs_data *drfsd = data;
+	struct device *dev = container_of(kobj, typeof(*dev), kobj);
+	struct device_attribute *dev_attr =
+		container_of(attr, typeof(*dev_attr), attr);
+
+	drfsd->cb(dev, dev_attr, drfsd->data);
+}
+
+/**
+ * device_remove_file_self_w_cb - remove sysfs attribute file from its own method.
+ * @dev: device.
+ * @attr: device attribute descriptor.
+ * @cb: callback function to be called from the context that removes the
+ *      attribute file.
+ * @data: third argument for @cb.
+ *
+ * See kernfs_remove_self() for details.
+ */
+void device_remove_file_self_w_cb(struct device *dev,
+				  const struct device_attribute *attr,
+				  void (*cb)(struct device *,
+					     const struct device_attribute *,
+					     void *data),
+				  void *data)
+{
+	struct drfs_data drfs_data = { cb, data };
+
+	sysfs_remove_file_self_w_cb(&dev->kobj, &attr->attr, drfs_cb,
+				    &drfs_data);
+}
+EXPORT_SYMBOL_GPL(device_remove_file_self_w_cb);
+
 /**
  * device_remove_file_self - remove sysfs attribute file from its own method.
  * @dev: device.
diff --git a/include/linux/device.h b/include/linux/device.h
index 055a69dbcd18..182b1b05e8ac 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -608,6 +608,12 @@ extern int device_create_file(struct device *device,
 			      const struct device_attribute *entry);
 extern void device_remove_file(struct device *dev,
 			       const struct device_attribute *attr);
+extern void device_remove_file_self_w_cb(struct device *dev,
+				const struct device_attribute *attr,
+				void (*cb)(struct device *dev,
+					   const struct device_attribute *attr,
+					   void *data),
+				void *data);
 extern bool device_remove_file_self(struct device *dev,
 				    const struct device_attribute *attr);
 extern int __must_check device_create_bin_file(struct device *dev,
-- 
2.18.0

From ab0e46bbeebefa3a1e1f5194858a6b0e6091809d Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@xxxxxxx>
Date: Thu, 26 Jul 2018 09:30:14 -0700
Subject: [PATCH 3/3] Avoid that SCSI device removal through sysfs triggers a
 deadlock

This patch avoids that self-removal triggers the following deadlock:

======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc2-dbg+ #5 Not tainted
------------------------------------------------------
modprobe/6539 is trying to acquire lock:
000000008323c4cd (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x45/0x90

but task is already holding lock:
00000000a6ec2c69 (&shost->scan_mutex){+.+.}, at: scsi_remove_host+0x21/0x150 [scsi_mod]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&shost->scan_mutex){+.+.}:
       __mutex_lock+0xfe/0xc70
       mutex_lock_nested+0x1b/0x20
       scsi_remove_device+0x26/0x40 [scsi_mod]
       sdev_store_delete+0x27/0x30 [scsi_mod]
       dev_attr_store+0x3e/0x50
       sysfs_kf_write+0x87/0xa0
       kernfs_fop_write+0x190/0x230
       __vfs_write+0xd2/0x3b0
       vfs_write+0x101/0x270
       ksys_write+0xab/0x120
       __x64_sys_write+0x43/0x50
       do_syscall_64+0x77/0x230
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (kn->count#202){++++}:
       lock_acquire+0xd2/0x260
       __kernfs_remove+0x424/0x4a0
       kernfs_remove_by_name_ns+0x45/0x90
       remove_files.isra.1+0x3a/0x90
       sysfs_remove_group+0x5c/0xc0
       sysfs_remove_groups+0x39/0x60
       device_remove_attrs+0x82/0xb0
       device_del+0x251/0x580
       __scsi_remove_device+0x19f/0x1d0 [scsi_mod]
       scsi_forget_host+0x37/0xb0 [scsi_mod]
       scsi_remove_host+0x9b/0x150 [scsi_mod]
       sdebug_driver_remove+0x4b/0x150 [scsi_debug]
       device_release_driver_internal+0x241/0x360
       device_release_driver+0x12/0x20
       bus_remove_device+0x1bc/0x290
       device_del+0x259/0x580
       device_unregister+0x1a/0x70
       sdebug_remove_adapter+0x8b/0xf0 [scsi_debug]
       scsi_debug_exit+0x76/0xe8 [scsi_debug]
       __x64_sys_delete_module+0x1c1/0x280
       do_syscall_64+0x77/0x230
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&shost->scan_mutex);
                               lock(kn->count#202);
                               lock(&shost->scan_mutex);
  lock(kn->count#202);

 *** DEADLOCK ***

2 locks held by modprobe/6539:
 #0: 00000000efaf9298 (&dev->mutex){....}, at: device_release_driver_internal+0x68/0x360
 #1: 00000000a6ec2c69 (&shost->scan_mutex){+.+.}, at: scsi_remove_host+0x21/0x150 [scsi_mod]

stack backtrace:
CPU: 10 PID: 6539 Comm: modprobe Not tainted 4.18.0-rc2-dbg+ #5
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0xa4/0xf5
 print_circular_bug.isra.34+0x213/0x221
 __lock_acquire+0x1a7e/0x1b50
 lock_acquire+0xd2/0x260
 __kernfs_remove+0x424/0x4a0
 kernfs_remove_by_name_ns+0x45/0x90
 remove_files.isra.1+0x3a/0x90
 sysfs_remove_group+0x5c/0xc0
 sysfs_remove_groups+0x39/0x60
 device_remove_attrs+0x82/0xb0
 device_del+0x251/0x580
 __scsi_remove_device+0x19f/0x1d0 [scsi_mod]
 scsi_forget_host+0x37/0xb0 [scsi_mod]
 scsi_remove_host+0x9b/0x150 [scsi_mod]
 sdebug_driver_remove+0x4b/0x150 [scsi_debug]
 device_release_driver_internal+0x241/0x360
 device_release_driver+0x12/0x20
 bus_remove_device+0x1bc/0x290
 device_del+0x259/0x580
 device_unregister+0x1a/0x70
 sdebug_remove_adapter+0x8b/0xf0 [scsi_debug]
 scsi_debug_exit+0x76/0xe8 [scsi_debug]
 __x64_sys_delete_module+0x1c1/0x280
 do_syscall_64+0x77/0x230
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

See also https://www.mail-archive.com/linux-scsi@xxxxxxxxxxxxxxx/msg54525.html.

Fixes: ac0ece9174ac ("scsi: use device_remove_file_self() instead of device_schedule_callback()")
Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Hannes Reinecke <hare@xxxxxxxx>
Cc: Johannes Thumshirn <jthumshirn@xxxxxxx>
---
 drivers/scsi/scsi_sysfs.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index de122354d09a..944635586769 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -718,12 +718,18 @@ store_rescan_field (struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR(rescan, S_IWUSR, NULL, store_rescan_field);
 
+static void scsi_remove_device_cb(struct device *dev,
+				  const struct device_attribute *attr,
+				  void *data)
+{
+	scsi_remove_device(to_scsi_device(dev));
+}
+
 static ssize_t
 sdev_store_delete(struct device *dev, struct device_attribute *attr,
 		  const char *buf, size_t count)
 {
-	if (device_remove_file_self(dev, attr))
-		scsi_remove_device(to_scsi_device(dev));
+	device_remove_file_self_w_cb(dev, attr, scsi_remove_device_cb, NULL);
 	return count;
 };
 static DEVICE_ATTR(delete, S_IWUSR, NULL, sdev_store_delete);
-- 
2.18.0


[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux