Re: possible circular locking dependency

On Mon, Sep 21, 2009 at 04:00:50PM +0200, Christof Schmitt wrote:
> The lock dependency checker found this circular lock dependency
> warning on the 2.6.31 kernel plus some s390 patches. But the problem
> occurs in common SCSI code in 5 steps:
> 
> #4 first acquires scan_mutex in scsi_remove_device,
>    then sd_ref_mutex in scsi_disk_get_from_dev
> 
> #3 first acquires rport_delete_work in run_workqueue (inlined in worker_thread),
>    then scan_mutex in scsi_remove_device
> 
> #2 first acquires fc_host->work_q in run_workqueue,
>    then rport_delete_work also in run_workqueue
> 
> #1 first acquires cpu_add_remove_lock in destroy_workqueue,
>    then fc_host->work_q in cleanup_workqueue_thread
> 
> #0 first acquires sd_ref_mutex in scsi_disk_put,
>    then cpu_add_remove_lock in destroy_workqueue
> 
> I think this is only a theoretical warning which will be very hard or
> impossible to trigger in reality. But at least the warning should be
> fixed to keep the lock dependency checker useful.
> 
> Does anybody have an idea how to break this dependency chain?

This still happens with 2.6.32. I think it boils down to:

#4: The work function acquiring the sd_ref_mutex gives, through the
    chain #1 to #4, the transitive dependency
    cpu_add_remove_lock -> sd_ref_mutex

#0: Calling destroy_workqueue from scsi_host_dev_release while
    scsi_disk_put holds the sd_ref_mutex introduces the dependency
    sd_ref_mutex -> cpu_add_remove_lock

But the sd_ref_mutex is required for the scsi_disk references. So far,
I don't see a good way to approach this.
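
To make the cycle explicit, here is a small Python sketch (not kernel
code) that models the five reported dependencies as a directed graph
and walks it from sd_ref_mutex. The lock names come from the lockdep
report; the graph encoding and the helper names find_cycle and
without_edge are made up for illustration only.

```python
# Edge "A -> B" means: B was acquired while A was held.
# Lock names are taken from the lockdep report; the model is illustrative.
DEPENDENCIES = {
    "cpu_add_remove_lock": ["fc_host->work_q"],   # 1: destroy_workqueue
    "fc_host->work_q": ["rport_delete_work"],     # 2: run_workqueue
    "rport_delete_work": ["scan_mutex"],          # 3: scsi_remove_device
    "scan_mutex": ["sd_ref_mutex"],               # 4: scsi_disk_get_from_dev
    "sd_ref_mutex": ["cpu_add_remove_lock"],      # 0: scsi_disk_put
}


def find_cycle(graph, start):
    """Return a path from start back to start, or None if there is none."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None


def without_edge(graph, src, dst):
    """Return a copy of graph with the single edge src -> dst removed."""
    return {
        k: [v for v in vs if not (k == src and v == dst)]
        for k, vs in graph.items()
    }


cycle = find_cycle(DEPENDENCIES, "sd_ref_mutex")
print(" -> ".join(cycle))
```

In this model, dropping any single edge, for example
sd_ref_mutex -> cpu_add_remove_lock via without_edge, makes find_cycle
return None. Whether the real code can avoid one of these edges is a
separate question.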

> 
> The complete output of the lock dependency checker:
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31 #12
> -------------------------------------------------------
> multipathd/2285 is trying to acquire lock:
>  (cpu_add_remove_lock){+.+.+.}, at: [<000000000006a38e>] destroy_workqueue+0x3a/0x274
> 
> but task is already holding lock:
>  (sd_ref_mutex){+.+.+.}, at: [<0000000000284202>] scsi_disk_put+0x36/0x5c
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #4 (sd_ref_mutex){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>        [<0000000000284190>] scsi_disk_get_from_dev+0x30/0x6c
>        [<0000000000284830>] sd_shutdown+0x28/0x160
>        [<0000000000284ca4>] sd_remove+0x68/0xac
>        [<0000000000257450>] __device_release_driver+0x98/0x108
>        [<00000000002575e8>] device_release_driver+0x38/0x48
>        [<000000000025674a>] bus_remove_device+0xd6/0x11c
>        [<000000000025458c>] device_del+0x160/0x218
>        [<0000000000272650>] __scsi_remove_device+0x6c/0xb4
>        [<00000000002726da>] scsi_remove_device+0x42/0x54
>        [<00000000002727c6>] __scsi_remove_target+0xce/0x108
>        [<00000000002728ae>] __remove_child+0x3a/0x4c
>        [<0000000000253b0e>] device_for_each_child+0x72/0xbc
>        [<000000000027284e>] scsi_remove_target+0x4e/0x74
>        [<000000000027929a>] fc_rport_final_delete+0xb2/0x20c
>        [<0000000000069ed0>] worker_thread+0x25c/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #3 (&shost->scan_mutex){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>        [<00000000002726d0>] scsi_remove_device+0x38/0x54
>        [<00000000002727c6>] __scsi_remove_target+0xce/0x108
>        [<00000000002728ae>] __remove_child+0x3a/0x4c
>        [<0000000000253b0e>] device_for_each_child+0x72/0xbc
>        [<000000000027284e>] scsi_remove_target+0x4e/0x74
>        [<000000000027929a>] fc_rport_final_delete+0xb2/0x20c
>        [<0000000000069ed0>] worker_thread+0x25c/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #2 (&rport->rport_delete_work){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<0000000000069eca>] worker_thread+0x256/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #1 ((fc_host->work_q_name)){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000006a2ae>] cleanup_workqueue_thread+0x62/0xac
>        [<000000000006a420>] destroy_workqueue+0xcc/0x274
>        [<0000000000279c4a>] fc_remove_host+0x1de/0x210
>        [<000000000034556e>] zfcp_adapter_scsi_unregister+0x96/0xc4
>        [<0000000000343df0>] zfcp_ccw_remove+0x9c/0x370
>        [<00000000002c2a6a>] ccw_device_remove+0x3e/0x1a8
>        [<0000000000257450>] __device_release_driver+0x98/0x108
>        [<00000000002575e8>] device_release_driver+0x38/0x48
>        [<000000000025674a>] bus_remove_device+0xd6/0x11c
>        [<000000000025458c>] device_del+0x160/0x218
>        [<00000000002c3404>] ccw_device_unregister+0x5c/0x7c
>        [<00000000002c3490>] io_subchannel_remove+0x6c/0x9c
>        [<00000000002be32e>] css_remove+0x3e/0x7c
>        [<0000000000257450>] __device_release_driver+0x98/0x108
>        [<00000000002575e8>] device_release_driver+0x38/0x48
>        [<000000000025674a>] bus_remove_device+0xd6/0x11c
>        [<000000000025458c>] device_del+0x160/0x218
>        [<000000000025466a>] device_unregister+0x26/0x38
>        [<00000000002be4bc>] css_sch_device_unregister+0x44/0x54
>        [<00000000002c435e>] ccw_device_call_sch_unregister+0x4e/0x78
>        [<0000000000069ed0>] worker_thread+0x25c/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #0 (cpu_add_remove_lock){+.+.+.}:
>        [<0000000000086e5a>] __lock_acquire+0x154e/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>        [<000000000006a38e>] destroy_workqueue+0x3a/0x274
>        [<0000000000265bb0>] scsi_host_dev_release+0x88/0x104
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<00000000002771cc>] fc_rport_dev_release+0x2c/0x40
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<000000000006ba9c>] execute_in_process_context+0xa4/0xbc
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<0000000000284216>] scsi_disk_put+0x4a/0x5c
>        [<0000000000285560>] sd_release+0x6c/0x108
>        [<0000000000126364>] __blkdev_put+0x1b8/0x1cc
>        [<00000000000f224e>] __fput+0x12a/0x240
>        [<00000000000ee4c0>] filp_close+0x78/0xa8
>        [<00000000000ee5d0>] SyS_close+0xe0/0x148
>        [<000000000002a042>] sysc_noemu+0x10/0x16
>        [<0000020000041160>] 0x20000041160
> 
> other info that might help us debug this:
> 
> 2 locks held by multipathd/2285:
>  #0:  (&bdev->bd_mutex){+.+.+.}, at: [<00000000001261f2>] __blkdev_put+0x46/0x1cc
>  #1:  (sd_ref_mutex){+.+.+.}, at: [<0000000000284202>] scsi_disk_put+0x36/0x5c
> 
> stack backtrace:
> CPU: 1 Not tainted 2.6.31 #12
> Process multipathd (pid: 2285, task: 000000002d87b900, ksp: 000000002eca7800)
> 0000000000000000 000000002eca7770 0000000000000002 0000000000000000 
>        000000002eca7810 000000002eca7788 000000002eca7788 000000000046db82 
>        0000000000000000 0000000000000001 000000002d87bfd0 0000000000000000 
>        000000000000000d 0000000000000000 000000002eca77d8 000000000000000e 
>        000000000047fc30 0000000000017d80 000000002eca7770 000000002eca77b8 
> Call Trace:
> ([<0000000000017c82>] show_trace+0xee/0x144)
>  [<000000000008532e>] print_circular_bug_tail+0x10a/0x110
>  [<0000000000086e5a>] __lock_acquire+0x154e/0x1940
>  [<00000000000872dc>] lock_acquire+0x90/0xb8
>  [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>  [<000000000006a38e>] destroy_workqueue+0x3a/0x274
>  [<0000000000265bb0>] scsi_host_dev_release+0x88/0x104
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<00000000002771cc>] fc_rport_dev_release+0x2c/0x40
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<000000000006ba9c>] execute_in_process_context+0xa4/0xbc
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<0000000000284216>] scsi_disk_put+0x4a/0x5c
>  [<0000000000285560>] sd_release+0x6c/0x108
>  [<0000000000126364>] __blkdev_put+0x1b8/0x1cc
>  [<00000000000f224e>] __fput+0x12a/0x240
>  [<00000000000ee4c0>] filp_close+0x78/0xa8
>  [<00000000000ee5d0>] SyS_close+0xe0/0x148
>  [<000000000002a042>] sysc_noemu+0x10/0x16
>  [<0000020000041160>] 0x20000041160
> INFO: lockdep is turned off.
> 
> --
> Christof Schmitt
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html