On Mon, Mar 21, 2022 at 5:17 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> On Mon, Mar 21, 2022 at 5:03 PM David Howells <dhowells@xxxxxxxxxx> wrote:
> > Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > > The syz reproducer is:
> > >
> > > #{"threaded":true,"procs":1,"slowdown":1,"sandbox":"","close_fds":false}
> > > pipe(&(0x7f0000000240)={<r0=>0xffffffffffffffff, <r1=>0xffffffffffffffff})
> > > pipe2(&(0x7f00000001c0)={0xffffffffffffffff, <r2=>0xffffffffffffffff}, 0x80)
> > > splice(r0, 0x0, r2, 0x0, 0x1ff, 0x0)
> > > vmsplice(r1, &(0x7f00000006c0)=[{&(0x7f0000000080)="b5", 0x1}], 0x1, 0x0)
> > >
> > > That 0x80 is O_NOTIFICATION_PIPE (==O_EXCL).
> > >
> > > It looks like the bug is that when you try to splice between a normal
> > > pipe and a notification pipe, get_pipe_info(..., true) fails, so
> > > splice() falls back to treating the notification pipe like a normal
> > > pipe - so we end up in iter_file_splice_write(), which first locks the
> > > input pipe, then calls vfs_iter_write(), which locks the output pipe.
> > >
> > > I think this probably (?) can't actually lead to deadlocks, since
> > > you'd need another way to nest locking a normal pipe into locking a
> > > watch_queue pipe, but the lockdep annotations don't make that clear.
> >
> > Is this then a bug/feature in iter_file_splice_write() rather than in the
> > watch queue code, per se?
>
> I think at least when you call splice() on two normal pipes from
> userspace, it'll never go through this codepath for real pipes,
> because pipe-to-pipe splicing is special-cased? And sendfile() bails
> out in that case because pipes don't have a .splice_read() handler.
>
> And with notification pipes, we don't take that special path in
> splice(), and so we hit the lockdep warning. But I don't know whether
> that makes it the fault of notification pipes...
>
> Maybe it would be enough to just move the "if (pipe->watch_queue)"
> check in pipe_write() up above the __pipe_lock(pipe)?

[coming back to this thread 1.5 years later...]

I've turned that idea into a fix, let's have syzbot try it out before
I submit the fix patch:

#syz test: https://github.com/thejh/linux.git 56c486e68166