Patch "driver core: fix potential deadlock in __driver_attach" has been added to the 5.10-stable tree

This is a note to let you know that I've just added the patch titled

    driver core: fix potential deadlock in __driver_attach

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     driver-core-fix-potential-deadlock-in-__driver_attac.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 06517056f2bd35e1d51d09902ff3b2049e65b6d9
Author: Zhang Wensheng <zhangwensheng5@xxxxxxxxxx>
Date:   Wed Jun 22 15:43:27 2022 +0800

    driver core: fix potential deadlock in __driver_attach
    
    [ Upstream commit 70fe758352cafdee72a7b13bf9db065f9613ced8 ]
    
    __driver_attach() has an A-A deadlock of the same kind as the one
    fixed by commit b232b02bf3c2 ("driver core: fix deadlock in
    __device_attach"), and the stack is similar to the one shown in
    that commit. The details are listed below:
        In __driver_attach(), the lock-holding logic is as follows:
        ...
        __driver_attach
        if (driver_allows_async_probing(drv))
          device_lock(dev)      // take the device lock
            async_schedule_dev(__driver_attach_async_helper, dev); // func
              async_schedule_node
                async_schedule_node_domain(func)
                  entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
                  /* On allocation failure or when the work limit is
                     exceeded, func is executed synchronously, but
                     __driver_attach_async_helper takes the device lock
                     as well, which leads to an A-A deadlock.  */
                  if (!entry || atomic_read(&entry_count) > MAX_WORK) {
                    func();
                  } else {
                    queue_work_node(node, system_unbound_wq, &entry->work)
                  }
          device_unlock(dev)
    
        As shown above, when async probing is allowed but the work
        cannot be queued because of a memory allocation failure or the
        work limit, func is executed synchronously instead. This leads
        to an A-A deadlock, because __driver_attach_async_helper takes
        the device lock that the caller already holds.
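
The fallback described above can be modelled in user space. The following is only a minimal sketch, not the kernel implementation: schedule_or_run(), drain_queue(), bump(), the simulate_oom flag and the small MAX_WORK value are all hypothetical names and values chosen for illustration. It shows that the callback runs in the caller's context, and therefore under any lock the caller holds, whenever the entry cannot be allocated or the limit is exceeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_WORK 4	/* hypothetical small limit, not the kernel's value */

struct async_entry {
	void (*func)(void *data);
	void *data;
};

static int entry_count;				/* entries queued, not yet run */
static struct async_entry *queue[MAX_WORK + 1];	/* toy stand-in for a workqueue */

/* Example callback: counts how many times it has been executed. */
static void bump(void *data)
{
	(*(int *)data)++;
}

/*
 * Queue func for later, or run it right now in the caller's context when
 * the entry cannot be allocated or too much work is pending.
 * Returns true when func was executed synchronously.
 */
static bool schedule_or_run(void (*func)(void *), void *data, bool simulate_oom)
{
	struct async_entry *entry = simulate_oom ? NULL : malloc(sizeof(*entry));

	if (!entry || entry_count > MAX_WORK) {
		free(entry);
		func(data);	/* runs with whatever locks the caller holds */
		return true;
	}

	entry->func = func;
	entry->data = data;
	queue[entry_count++] = entry;
	return false;
}

/* Run and free everything that was queued (models the worker). */
static void drain_queue(void)
{
	for (int i = 0; i < entry_count; i++) {
		queue[i]->func(queue[i]->data);
		free(queue[i]);
	}
	entry_count = 0;
}
```

In this model a caller that holds a lock which func also takes is only safe on the queued path; the synchronous path re-enters the lock in the same context.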
    
    Reproduce:
    The deadlock can be reproduced by forcing the condition
    (if (!entry || atomic_read(&entry_count) > MAX_WORK)) to be true,
    which produces the hung-task report below:
    
    [  370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid:     0 flags:0x00004000
    [  370.788865] Call Trace:
    [  370.789374]  <TASK>
    [  370.789841]  __schedule+0x482/0x1050
    [  370.790613]  schedule+0x92/0x1a0
    [  370.791290]  schedule_preempt_disabled+0x2c/0x50
    [  370.792256]  __mutex_lock.isra.0+0x757/0xec0
    [  370.793158]  __mutex_lock_slowpath+0x1f/0x30
    [  370.794079]  mutex_lock+0x50/0x60
    [  370.794795]  __device_driver_lock+0x2f/0x70
    [  370.795677]  ? driver_probe_device+0xd0/0xd0
    [  370.796576]  __driver_attach_async_helper+0x1d/0xd0
    [  370.797318]  ? driver_probe_device+0xd0/0xd0
    [  370.797957]  async_schedule_node_domain+0xa5/0xc0
    [  370.798652]  async_schedule_node+0x19/0x30
    [  370.799243]  __driver_attach+0x246/0x290
    [  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
    [  370.800548]  bus_for_each_dev+0x9d/0x130
    [  370.801132]  driver_attach+0x22/0x30
    [  370.801666]  bus_add_driver+0x290/0x340
    [  370.802246]  driver_register+0x88/0x140
    [  370.802817]  ? virtio_scsi_init+0x116/0x116
    [  370.803425]  scsi_register_driver+0x1a/0x30
    [  370.804057]  init_sd+0x184/0x226
    [  370.804533]  do_one_initcall+0x71/0x3a0
    [  370.805107]  kernel_init_freeable+0x39a/0x43a
    [  370.805759]  ? rest_init+0x150/0x150
    [  370.806283]  kernel_init+0x26/0x230
    [  370.806799]  ret_from_fork+0x1f/0x30
    
    To fix the deadlock, move the async_schedule_dev() call outside the
    device_lock()/device_unlock() section. As can be seen in
    async_schedule_node_domain(), the work is queued on
    system_unbound_wq, which accepts concurrent operations, so this
    neither changes the code logic nor leads to a deadlock.
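
The change in ordering can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct toy_lock, toy_lock_acquire(), toy_lock_release(), attach_helper(), attach_old() and attach_new() are hypothetical names, and the toy lock reports EDEADLK on re-acquisition instead of hanging the way a real mutex would. The helper is always called synchronously here, which corresponds to the fallback path of async_schedule_dev().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Toy non-recursive lock standing in for the device lock. A second
 * acquisition while it is held returns EDEADLK rather than blocking,
 * so the A-A case can be observed without hanging the program.
 */
struct toy_lock {
	bool held;
};

static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return EDEADLK;
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/* Models __driver_attach_async_helper: it takes the device lock itself. */
static int attach_helper(struct toy_lock *dev_lock)
{
	int err = toy_lock_acquire(dev_lock);

	if (err)
		return err;
	/* probing would happen here */
	toy_lock_release(dev_lock);
	return 0;
}

/* Old ordering: the helper may run synchronously while the lock is held. */
static int attach_old(struct toy_lock *dev_lock)
{
	int ret = toy_lock_acquire(dev_lock);

	if (ret)
		return ret;
	ret = attach_helper(dev_lock);	/* synchronous fallback, lock still held */
	toy_lock_release(dev_lock);
	return ret;
}

/* New ordering: record the decision under the lock, schedule after unlock. */
static int attach_new(struct toy_lock *dev_lock)
{
	bool async = false;
	int ret = toy_lock_acquire(dev_lock);

	if (ret)
		return ret;
	async = true;
	toy_lock_release(dev_lock);
	if (async)
		ret = attach_helper(dev_lock);	/* lock is free, safe even if synchronous */
	return ret;
}
```

With a lock initialised as released, attach_old() reports EDEADLK and attach_new() returns 0, which mirrors the effect of the `async` flag introduced by the patch.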
    
    Fixes: ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver")
    Signed-off-by: Zhang Wensheng <zhangwensheng5@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@xxxxxxxxxx
    Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index f9d9f1ad9215..b5441741274b 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -1056,6 +1056,7 @@ static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie)
 static int __driver_attach(struct device *dev, void *data)
 {
 	struct device_driver *drv = data;
+	bool async = false;
 	int ret;
 
 	/*
@@ -1093,9 +1094,11 @@ static int __driver_attach(struct device *dev, void *data)
 		if (!dev->driver) {
 			get_device(dev);
 			dev->p->async_driver = drv;
-			async_schedule_dev(__driver_attach_async_helper, dev);
+			async = true;
 		}
 		device_unlock(dev);
+		if (async)
+			async_schedule_dev(__driver_attach_async_helper, dev);
 		return 0;
 	}
 


