2012/9/5 Jeff Layton <jlayton@xxxxxxxxxx>:
> I got the following lockdep pop when testing a few patches on top of
> Pavel's for-next branch. The relevant mount options were:
>
> vers=1.0,cache=strict,nounix
>
> The test that caused it to pop was connectathon test5 (the read/write
> test).
>
> I haven't yet done any investigation but it looks like a plausible
> deadlock possibility. Also, fwiw -- the test7 regression that I
> reported earlier seems to have gone away for now. No idea why.
>
> Anyway, here's the lockdep spew:
>
> [ 1732.096609] ======================================================
> [ 1732.096609] [ INFO: possible circular locking dependency detected ]
> [ 1732.096609] 3.6.0-0.rc4.git0.1.fc18.x86_64.debug #1 Tainted: G O
> [ 1732.096609] -------------------------------------------------------
> [ 1732.096609] test5/7374 is trying to acquire lock:
> [ 1732.096609]  (sb_writers#13){.+.+.+}, at: [<ffffffff8115edd3>] generic_file_aio_write+0x63/0x100
> [ 1732.096609]
> [ 1732.096609] but task is already holding lock:
> [ 1732.096609]  (&cifsi->lock_sem){+++++.}, at: [<ffffffffa02b61fc>] cifs_strict_writev+0xcc/0x1b0 [cifs]
> [ 1732.096609]
> [ 1732.096609] which lock already depends on the new lock.
> [ 1732.096609]
> [ 1732.096609]
> [ 1732.096609] the existing dependency chain (in reverse order) is:
> [ 1732.096609]
> [ 1732.096609] -> #1 (&cifsi->lock_sem){+++++.}:
> [ 1732.096609]        [<ffffffff810d5d91>] lock_acquire+0xa1/0x1f0
> [ 1732.096609]        [<ffffffff816d8721>] down_write+0x61/0xb0
> [ 1732.096609]        [<ffffffffa02b1c8d>] cifs_new_fileinfo+0xbd/0x2e0 [cifs]
> [ 1732.096609]        [<ffffffffa02b2217>] cifs_open+0x327/0x760 [cifs]
> [ 1732.096609]        [<ffffffff811cb923>] do_dentry_open+0x263/0x310
> [ 1732.096609]        [<ffffffff811cba1b>] finish_open+0x4b/0x70
> [ 1732.096609]        [<ffffffff811deace>] do_last+0x6de/0xe00
> [ 1732.096609]        [<ffffffff811df2ad>] path_openat+0xbd/0x500
> [ 1732.096609]        [<ffffffff811dfcc1>] do_filp_open+0x41/0xa0
> [ 1732.096609]        [<ffffffff811ccdc6>] do_sys_open+0xf6/0x1e0
> [ 1732.096609]        [<ffffffff811cced1>] sys_open+0x21/0x30
> [ 1732.096609]        [<ffffffff816e42a9>] system_call_fastpath+0x16/0x1b
> [ 1732.096609]
> [ 1732.096609] -> #0 (sb_writers#13){.+.+.+}:
> [ 1732.096609]        [<ffffffff810d4f52>] __lock_acquire+0x1372/0x1ad0
> [ 1732.096609]        [<ffffffff810d5d91>] lock_acquire+0xa1/0x1f0
> [ 1732.096609]        [<ffffffff811d0b87>] __sb_start_write+0xd7/0x1d0
> [ 1732.096609]        [<ffffffff8115edd3>] generic_file_aio_write+0x63/0x100
> [ 1732.096609]        [<ffffffffa02b6299>] cifs_strict_writev+0x169/0x1b0 [cifs]
> [ 1732.096609]        [<ffffffff811cd137>] do_sync_write+0xa7/0xe0
> [ 1732.096609]        [<ffffffff811cda2f>] vfs_write+0xaf/0x190
> [ 1732.096609]        [<ffffffff811cdd6d>] sys_write+0x4d/0x90
> [ 1732.096609]        [<ffffffff816e42a9>] system_call_fastpath+0x16/0x1b
> [ 1732.096609]
> [ 1732.096609] other info that might help us debug this:
> [ 1732.096609]
> [ 1732.096609]  Possible unsafe locking scenario:
> [ 1732.096609]
> [ 1732.096609]        CPU0                    CPU1
> [ 1732.096609]        ----                    ----
> [ 1732.096609]   lock(&cifsi->lock_sem);
> [ 1732.096609]                               lock(sb_writers#13);
> [ 1732.096609]                               lock(&cifsi->lock_sem);
> [ 1732.096609]   lock(sb_writers#13);
> [ 1732.096609]
> [ 1732.096609]  *** DEADLOCK ***
> [ 1732.096609]
> [ 1732.096609] 1 lock held by test5/7374:
> [ 1732.096609]  #0:  (&cifsi->lock_sem){+++++.}, at: [<ffffffffa02b61fc>] cifs_strict_writev+0xcc/0x1b0 [cifs]
> [ 1732.096609]
> [ 1732.096609] stack backtrace:
> [ 1732.096609] Pid: 7374, comm: test5 Tainted: G O 3.6.0-0.rc4.git0.1.fc18.x86_64.debug #1
> [ 1732.096609] Call Trace:
> [ 1732.096609]  [<ffffffff816cf0b4>] print_circular_bug+0x1fb/0x20c
> [ 1732.096609]  [<ffffffff810d4f52>] __lock_acquire+0x1372/0x1ad0
> [ 1732.096609]  [<ffffffff8104aee2>] ? kvm_clock_read+0x32/0x40
> [ 1732.096609]  [<ffffffff81021e39>] ? sched_clock+0x9/0x10
> [ 1732.096609]  [<ffffffff810ac295>] ? sched_clock_local+0x25/0xa0
> [ 1732.096609]  [<ffffffff810d5d91>] lock_acquire+0xa1/0x1f0
> [ 1732.096609]  [<ffffffff8115edd3>] ? generic_file_aio_write+0x63/0x100
> [ 1732.096609]  [<ffffffff811d0b87>] __sb_start_write+0xd7/0x1d0
> [ 1732.096609]  [<ffffffff8115edd3>] ? generic_file_aio_write+0x63/0x100
> [ 1732.096609]  [<ffffffff8115edd3>] ? generic_file_aio_write+0x63/0x100
> [ 1732.096609]  [<ffffffff8115edd3>] generic_file_aio_write+0x63/0x100
> [ 1732.096609]  [<ffffffffa02b6299>] cifs_strict_writev+0x169/0x1b0 [cifs]
> [ 1732.096609]  [<ffffffff811cd137>] do_sync_write+0xa7/0xe0
> [ 1732.096609]  [<ffffffff811cda2f>] vfs_write+0xaf/0x190
> [ 1732.096609]  [<ffffffff811cdd6d>] sys_write+0x4d/0x90
> [ 1732.096609]  [<ffffffff816e42a9>] system_call_fastpath+0x16/0x1b
>
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>

I think we can replace the generic_file_aio_write call here with something
like cifs_generic_aio_write - almost the same as generic_file_aio_write,
but with down_read(&cinode->lock_sem) after sb_start_write and
up_read(&cinode->lock_sem) before sb_end_write - so that both code paths
take the locks in the same order:

1) sb_writers
2) lock_sem

A rough sketch of the idea is appended below.

Thoughts?

--
Best regards,
Pavel Shilovsky.
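For concreteness, here is a minimal, untested sketch of what the proposed
cifs_generic_aio_write could look like. It assumes the 3.6-era body of
generic_file_aio_write (block plugging is elided for brevity), and the
byte-range lock conflict check that cifs_strict_writev currently performs
under lock_sem is only indicated by a comment, since its exact form depends
on the for-next tree:

static ssize_t
cifs_generic_aio_write(struct kiocb *iocb, const struct iovec *iov,
		       unsigned long nr_segs, loff_t pos)
{
	struct file *file = iocb->ki_filp;
	struct inode *inode = file->f_mapping->host;
	struct cifsInodeInfo *cinode = CIFS_I(inode);
	ssize_t ret;

	BUG_ON(iocb->ki_pos != pos);

	/* Take sb_writers first, then lock_sem - same order as the open path. */
	sb_start_write(inode->i_sb);
	down_read(&cinode->lock_sem);

	/*
	 * The mandatory byte-range lock conflict check that
	 * cifs_strict_writev() does today would move here, under lock_sem,
	 * failing the write with -EACCES on a conflict.
	 */

	mutex_lock(&inode->i_mutex);
	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
	mutex_unlock(&inode->i_mutex);

	if (ret > 0 || ret == -EIOCBQUEUED) {
		ssize_t err;

		err = generic_write_sync(file, pos, ret);
		if (err < 0 && ret > 0)
			ret = err;
	}

	/* Release in reverse order: lock_sem, then sb_writers. */
	up_read(&cinode->lock_sem);
	sb_end_write(inode->i_sb);
	return ret;
}

cifs_strict_writev would then call this helper instead of wrapping
generic_file_aio_write in its own down_read/up_read pair, so lock_sem is
never held at the point where sb_writers is acquired and the circular
dependency reported above cannot form.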