On Fri, Sep 02, 2016 at 01:02:16PM -0400, CAI Qian wrote:
> Splice seems to start to deadlock using the reproducer,
>
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c
>
> This seems to have been introduced recently, after v4.8-rc3 or -rc4, so I suspect this xfs update is the one to blame,
>
> 7d1ce606a37922879cbe40a6122047827105a332

Nope, this goes back to the splice rework around ~3.16, IIRC.

> [ 1749.956818]
> [ 1749.958492] ======================================================
> [ 1749.965386] [ INFO: possible circular locking dependency detected ]
> [ 1749.972381] 4.8.0-rc4+ #34 Not tainted
> [ 1749.976560] -------------------------------------------------------
> [ 1749.983554] splice01/35921 is trying to acquire lock:
> [ 1749.989188]  (&sb->s_type->i_mutex_key#14){+.+.+.}, at: [<ffffffffa083c1f7>] xfs_file_buffered_aio_write+0x127/0x840 [xfs]
> [ 1750.001644]
> [ 1750.001644] but task is already holding lock:
> [ 1750.008151]  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8169e7c1>] pipe_lock+0x51/0x60
> [ 1750.016753]
> [ 1750.016753] which lock already depends on the new lock.
> [ 1750.016753]
> [ 1750.025880]
> [ 1750.025880] the existing dependency chain (in reverse order) is:
> [ 1750.034229]
> -> #2 (&pipe->mutex/1){+.+.+.}:
> [ 1750.039139]        [<ffffffff812af52a>] lock_acquire+0x1fa/0x440
> [ 1750.045857]        [<ffffffff8266448d>] mutex_lock_nested+0xdd/0x850
> [ 1750.052963]        [<ffffffff8169e7c1>] pipe_lock+0x51/0x60
> [ 1750.059190]        [<ffffffff8171ee25>] splice_to_pipe+0x75/0x9e0
> [ 1750.066001]        [<ffffffff81723991>] __generic_file_splice_read+0xa71/0xe90
> [ 1750.074071]        [<ffffffff81723e71>] generic_file_splice_read+0xc1/0x1f0
> [ 1750.081849]        [<ffffffffa0838628>] xfs_file_splice_read+0x368/0x7b0 [xfs]
> [ 1750.089940]        [<ffffffff8171fa7e>] do_splice_to+0xee/0x150
> [ 1750.096555]        [<ffffffff817262f4>] SyS_splice+0x1144/0x1c10
> [ 1750.103269]        [<ffffffff81007b66>] do_syscall_64+0x1a6/0x500
> [ 1750.110084]        [<ffffffff8266ea7f>] return_from_SYSCALL_64+0x0/0x7a

pipe_lock taken below the filesystem IO path; the filesystem holds
locks to protect against racing hole punch, etc...

> [ 1750.188328]
> -> #0 (&sb->s_type->i_mutex_key#14){+.+.+.}:
> [ 1750.194508]        [<ffffffff812adbc3>] __lock_acquire+0x3043/0x3dd0
> [ 1750.201609]        [<ffffffff812af52a>] lock_acquire+0x1fa/0x440
> [ 1750.208321]        [<ffffffff82668cda>] down_write+0x5a/0xe0
> [ 1750.214645]        [<ffffffffa083c1f7>] xfs_file_buffered_aio_write+0x127/0x840 [xfs]
> [ 1750.223421]        [<ffffffffa083cb7d>] xfs_file_write_iter+0x26d/0x6d0 [xfs]
> [ 1750.231423]        [<ffffffff816859be>] vfs_iter_write+0x29e/0x550
> [ 1750.238330]        [<ffffffff81722729>] iter_file_splice_write+0x529/0xb70
> [ 1750.246012]        [<ffffffff817258d4>] SyS_splice+0x724/0x1c10
> [ 1750.252627]        [<ffffffff81007b66>] do_syscall_64+0x1a6/0x500
> [ 1750.259438]        [<ffffffff8266ea7f>] return_from_SYSCALL_64+0x0/0x7a

pipe_lock taken above the filesystem IO path; the filesystem tries to
take locks to protect against racing hole punch, etc, and lockdep goes
boom. Fundamentally a splice infrastructure problem.
If we let splice race with hole punch and other fallocate() based
extent manipulations to avoid this lockdep warning, we allow the
potential for reads or writes to regions of the file that have been
freed. We can live with lockdep complaining about this potential
deadlock, as it is unlikely to ever occur in practice. The other
option is simply not an acceptable solution....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs