Re: [PATCH] driver core: Fix uevent_show() vs driver detach race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12.07.2024 21:42, Dan Williams wrote:
uevent_show() wants to de-reference dev->driver->name. There is no clean
way for a device attribute to de-reference dev->driver unless that
attribute is defined via (struct device_driver).dev_groups. Instead, the
anti-pattern of taking the device_lock() in the attribute handler risks
deadlocks with code paths that remove device attributes while holding
the lock.

This deadlock is typically invisible to lockdep given the device_lock()
is marked lockdep_set_novalidate_class(), but some subsystems allocate a
local lockdep key for @dev->mutex to reveal reports of the form:

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-rc7+ #275 Tainted: G           OE    N
  ------------------------------------------------------
  modprobe/2374 is trying to acquire lock:
  ffff8c2270070de0 (kn->active#6){++++}-{0:0}, at: __kernfs_remove+0xde/0x220

  but task is already holding lock:
  ffff8c22016e88f8 (&cxl_root_key){+.+.}-{3:3}, at: device_release_driver_internal+0x39/0x210

  which lock already depends on the new lock.


  the existing dependency chain (in reverse order) is:

  -> #1 (&cxl_root_key){+.+.}-{3:3}:
         __mutex_lock+0x99/0xc30
         uevent_show+0xac/0x130
         dev_attr_show+0x18/0x40
         sysfs_kf_seq_show+0xac/0xf0
         seq_read_iter+0x110/0x450
         vfs_read+0x25b/0x340
         ksys_read+0x67/0xf0
         do_syscall_64+0x75/0x190
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (kn->active#6){++++}-{0:0}:
         __lock_acquire+0x121a/0x1fa0
         lock_acquire+0xd6/0x2e0
         kernfs_drain+0x1e9/0x200
         __kernfs_remove+0xde/0x220
         kernfs_remove_by_name_ns+0x5e/0xa0
         device_del+0x168/0x410
         device_unregister+0x13/0x60
         devres_release_all+0xb8/0x110
         device_unbind_cleanup+0xe/0x70
         device_release_driver_internal+0x1c7/0x210
         driver_detach+0x47/0x90
         bus_remove_driver+0x6c/0xf0
         cxl_acpi_exit+0xc/0x11 [cxl_acpi]
         __do_sys_delete_module.isra.0+0x181/0x260
         do_syscall_64+0x75/0x190
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

The observation though is that driver objects are typically much longer
lived than device objects. It is reasonable to perform lockless
de-reference of a @driver pointer even if it is racing detach from a
device. Given the infrequency of driver unregistration, use
synchronize_rcu() in module_remove_driver() to close any potential
races.  It is potentially overkill to suffer synchronize_rcu() just to
handle the rare module removal racing uevent_show() event.

Thanks to Tetsuo Handa for the debug analysis of the syzbot report [1].

Fixes: c0a40097f0bc ("drivers: core: synchronize really_probe() and dev_uevent()")
Reported-by: syzbot+4762dd74e32532cda5ff@xxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Closes: http://lore.kernel.org/5aa5558f-90a4-4864-b1b1-5d6784c5607d@xxxxxxxxxxxxxxxxxxx [1]
Link: http://lore.kernel.org/669073b8ea479_5fffa294c1@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch
Cc: stable@xxxxxxxxxxxxxxx
Cc: Ashish Sangwan <a.sangwan@xxxxxxxxxxx>
Cc: Namjae Jeon <namjae.jeon@xxxxxxxxxxx>
Cc: Dirk Behme <dirk.behme@xxxxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>


Many thanks for this fix! Looks good to me:

Acked-by: Dirk Behme <dirk.behme@xxxxxxxxxxxx>

Dirk


---
  drivers/base/core.c   |   13 ++++++++-----
  drivers/base/module.c |    4 ++++
  2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2b4c0624b704..b5399262198a 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -25,6 +25,7 @@
  #include <linux/mutex.h>
  #include <linux/pm_runtime.h>
  #include <linux/netdevice.h>
+#include <linux/rcupdate.h>
  #include <linux/sched/signal.h>
  #include <linux/sched/mm.h>
  #include <linux/string_helpers.h>
@@ -2640,6 +2641,7 @@ static const char *dev_uevent_name(const struct kobject *kobj)
  static int dev_uevent(const struct kobject *kobj, struct kobj_uevent_env *env)
  {
  	const struct device *dev = kobj_to_dev(kobj);
+	struct device_driver *driver;
  	int retval = 0;
/* add device node properties if present */
@@ -2668,8 +2670,12 @@ static int dev_uevent(const struct kobject *kobj, struct kobj_uevent_env *env)
  	if (dev->type && dev->type->name)
  		add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
- if (dev->driver)
-		add_uevent_var(env, "DRIVER=%s", dev->driver->name);
+	/* Synchronize with module_remove_driver() */
+	rcu_read_lock();
+	driver = READ_ONCE(dev->driver);
+	if (driver)
+		add_uevent_var(env, "DRIVER=%s", driver->name);
+	rcu_read_unlock();
/* Add common DT information about the device */
  	of_device_uevent(dev, env);
@@ -2739,11 +2745,8 @@ static ssize_t uevent_show(struct device *dev, struct device_attribute *attr,
  	if (!env)
  		return -ENOMEM;
- /* Synchronize with really_probe() */
-	device_lock(dev);
  	/* let the kset specific function add its keys */
  	retval = kset->uevent_ops->uevent(&dev->kobj, env);
-	device_unlock(dev);
  	if (retval)
  		goto out;
diff --git a/drivers/base/module.c b/drivers/base/module.c
index a1b55da07127..b0b79b9c189d 100644
--- a/drivers/base/module.c
+++ b/drivers/base/module.c
@@ -7,6 +7,7 @@
  #include <linux/errno.h>
  #include <linux/slab.h>
  #include <linux/string.h>
+#include <linux/rcupdate.h>
  #include "base.h"
static char *make_driver_name(struct device_driver *drv)
@@ -97,6 +98,9 @@ void module_remove_driver(struct device_driver *drv)
  	if (!drv)
  		return;
+ /* Synchronize with dev_uevent() */
+	synchronize_rcu();
+
  	sysfs_remove_link(&drv->p->kobj, "module");
if (drv->owner)



--
======================================================================
Dirk Behme                      Robert Bosch Car Multimedia GmbH
                                CM/ESO2
Phone: +49 5121 49-3274         Dirk Behme
Fax:   +49 711 811 5053274      PO Box 77 77 77
mailto:dirk.behme@xxxxxxxxxxxx  D-31132 Hildesheim - Germany

Bosch Group, Car Multimedia (CM)
             Engineering SW Operating Systems 2 (ESO2)

Robert Bosch Car Multimedia GmbH - Ein Unternehmen der Bosch Gruppe
Sitz: Hildesheim
Registergericht: Amtsgericht Hildesheim HRB 201334
Aufsichtsratsvorsitzender: Dr. Dirk Hoheisel
Geschäftsführung: Dr. Steffen Berns;
                  Dr. Sven Ost, Jörg Pollak, Dr. Walter Schirm
======================================================================





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux