Hi all,

For the last few kernel releases we've been seeing this pretty regularly:

======================================================
[ INFO: possible circular locking dependency detected ]
4.5.0-rc3-ceph-00790-g2364ac5 #1 Not tainted
-------------------------------------------------------
tp_fstore_op/25647 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#19){+.+.+.}, at: [<ffffffffa0574bad>] xfs_file_buffered_aio_write+0x6d/0x300 [xfs]

but task is already holding lock:
 (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8126347d>] pipe_lock+0x1d/0x20

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&pipe->mutex/1){+.+.+.}:
       [<ffffffff81100842>] lock_acquire+0x162/0x1b0
       [<ffffffff817d77c9>] mutex_lock_nested+0x79/0x380
       [<ffffffff8126347d>] pipe_lock+0x1d/0x20
       [<ffffffff81290d20>] splice_to_pipe+0x30/0x250
       [<ffffffff81292379>] __generic_file_splice_read+0x4f9/0x620
       [<ffffffff812924de>] generic_file_splice_read+0x3e/0x70
       [<ffffffffa05738d4>] xfs_file_splice_read+0x194/0x220 [xfs]
       [<ffffffff812912da>] do_splice_to+0x6a/0x80
       [<ffffffff812934e1>] SyS_splice+0x6c1/0x720
       [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76

-> #1 (&(&ip->i_iolock)->mr_lock){++++++}:
       [<ffffffff81100842>] lock_acquire+0x162/0x1b0
       [<ffffffff810fa37f>] down_write_nested+0x4f/0x80
       [<ffffffffa058106d>] xfs_ilock+0xed/0x1b0 [xfs]
       [<ffffffffa05801c0>] xfs_vn_setattr+0x30/0x70 [xfs]
       [<ffffffff8127a09d>] notify_change+0x1ed/0x320
       [<ffffffff8125843c>] do_truncate+0x7c/0xc0
       [<ffffffff81258843>] do_sys_ftruncate.constprop.12+0x153/0x170
       [<ffffffff8125889e>] SyS_ftruncate+0xe/0x10
       [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76

-> #0 (&sb->s_type->i_mutex_key#19){+.+.+.}:
       [<ffffffff810ff57e>] __lock_acquire+0x13be/0x1e30
       [<ffffffff81100842>] lock_acquire+0x162/0x1b0
       [<ffffffff817d77c9>] mutex_lock_nested+0x79/0x380
       [<ffffffffa0574bad>] xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
       [<ffffffffa0574ed5>] xfs_file_write_iter+0x95/0x140 [xfs]
       [<ffffffff81259ee5>] vfs_iter_write+0x75/0xb0
       [<ffffffff81291a51>] iter_file_splice_write+0x281/0x3a0
       [<ffffffff812932fa>] SyS_splice+0x4da/0x720
       [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76

other info that might help us debug this:

Chain exists of:
  &sb->s_type->i_mutex_key#19 --> &(&ip->i_iolock)->mr_lock --> &pipe->mutex/1

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&pipe->mutex/1);
                               lock(&(&ip->i_iolock)->mr_lock);
                               lock(&pipe->mutex/1);
  lock(&sb->s_type->i_mutex_key#19);

 *** DEADLOCK ***

2 locks held by tp_fstore_op/25647:
 #0:  (sb_writers#15){.+.+.+}, at: [<ffffffff8125db7d>] __sb_start_write+0xad/0xe0
 #1:  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8126347d>] pipe_lock+0x1d/0x20

stack backtrace:
CPU: 4 PID: 25647 Comm: tp_fstore_op Not tainted 4.5.0-rc3-ceph-00790-g2364ac5 #1
Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
 0000000000000086 00000000547c0aa5 ffff880858c0bad8 ffffffff813f8661
 ffffffff827127a0 ffffffff82711870 ffff880858c0bb18 ffffffff811c5d39
 ffff880858c0bb70 ffff8808530288d8 0000000000000001 0000000000000002
Call Trace:
 [<ffffffff813f8661>] dump_stack+0x85/0xc4
 [<ffffffff811c5d39>] print_circular_bug+0x200/0x20e
 [<ffffffff810ff57e>] __lock_acquire+0x13be/0x1e30
 [<ffffffff81100842>] lock_acquire+0x162/0x1b0
 [<ffffffffa0574bad>] ? xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 [<ffffffff817d77c9>] mutex_lock_nested+0x79/0x380
 [<ffffffffa0574bad>] ? xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 [<ffffffff810d338d>] ? ttwu_stat+0x19d/0x200
 [<ffffffffa0574bad>] xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 [<ffffffff810fdc6f>] ? mark_held_locks+0x6f/0x80
 [<ffffffff810fdde3>] ? trace_hardirqs_on_caller+0x163/0x1d0
 [<ffffffffa0574ed5>] xfs_file_write_iter+0x95/0x140 [xfs]
 [<ffffffff81259ee5>] vfs_iter_write+0x75/0xb0
 [<ffffffff81291a51>] iter_file_splice_write+0x281/0x3a0
 [<ffffffff812932fa>] SyS_splice+0x4da/0x720
 [<ffffffff810031d6>] ? do_audit_syscall_entry+0x66/0x70
 [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76

We're not actually doing racing truncate and splice (or any racing
operations, for that matter) to the same files, so we don't actually hit
this, but the lockdep warning is enough to make our test runs fail, and
we'd rather not whitelist it if it's something that should get fixed
anyway.

My apologies for not raising this sooner; I thought I'd already emailed
the list about this a few months ago, but it looks like I'd only pinged
#xfs.

sage

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs