On Tue, Dec 12, 2017 at 2:06 PM, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
> On Fri, Dec 01, 2017 at 03:29:01AM -0800, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> df8ba95c572a187ed2aa7403e97a7a7f58c01f00
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 4.15.0-rc1+ #202 Not tainted
>> ------------------------------------------------------
>> syz-executor4/26476 is trying to acquire lock:
>>  (&p->lock){+.+.}, at: [<0000000040185b66>] seq_read+0xd5/0x13d0
>> fs/seq_file.c:165
>>
>> but task is already holding lock:
>>  (&pipe->mutex/1){+.+.}, at: [<00000000c644bcdc>] pipe_lock_nested
>> fs/pipe.c:67 [inline]
>>  (&pipe->mutex/1){+.+.}, at: [<00000000c644bcdc>]
>> pipe_lock+0x56/0x70 fs/pipe.c:75
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #2 (&pipe->mutex/1){+.+.}:
>>        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
>>        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>        __pipe_lock fs/pipe.c:88 [inline]
>>        fifo_open+0x15c/0xa40 fs/pipe.c:916
>>        do_dentry_open+0x682/0xd70 fs/open.c:752
>>        vfs_open+0x107/0x230 fs/open.c:866
>>        do_last fs/namei.c:3379 [inline]
>>        path_openat+0x1157/0x3530 fs/namei.c:3519
>>        do_filp_open+0x25b/0x3b0 fs/namei.c:3554
>>        do_open_execat+0x1b9/0x5c0 fs/exec.c:849
>>        do_execveat_common.isra.30+0x90c/0x23c0 fs/exec.c:1741
>>        do_execveat fs/exec.c:1859 [inline]
>>        SYSC_execveat fs/exec.c:1940 [inline]
>>        SyS_execveat+0x4f/0x60 fs/exec.c:1932
>>        do_syscall_64+0x26c/0x920 arch/x86/entry/common.c:285
>>        return_from_SYSCALL_64+0x0/0x75
>>
>> -> #1 (&sig->cred_guard_mutex){+.+.}:
>>        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
>>        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>>        mutex_lock_killable_nested+0x16/0x20 kernel/locking/mutex.c:923
>>        do_io_accounting+0x1c2/0xf50 fs/proc/base.c:2682
>>        proc_tid_io_accounting+0x1f/0x30 fs/proc/base.c:2725
>>        proc_single_show+0xf8/0x170 fs/proc/base.c:744
>>        seq_read+0x385/0x13d0 fs/seq_file.c:234
>>        __vfs_read+0xef/0xa00 fs/read_write.c:411
>>        vfs_read+0x124/0x360 fs/read_write.c:447
>>        SYSC_read fs/read_write.c:573 [inline]
>>        SyS_read+0xef/0x220 fs/read_write.c:566
>>        entry_SYSCALL_64_fastpath+0x1f/0x96
>>
>
> So the problem with all these deadlocks involving pipe->mutex and
> sig->cred_guard_mutex is that execve() ranks pipe->mutex below
> sig->cred_guard_mutex when it tries to open a fifo, whereas reading or writing
> some of the /proc files result in ->cred_guard_mutex being taken which may be
> underneath pipe->mutex from splice().
> Here's a program which causes an actual
> deadlock using this bug (in addition to reproducing the lockdep report):
>
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <pthread.h>
> #include <sys/stat.h>
> #include <unistd.h>
>
> static void *exec_thread(void *_arg)
> {
>         for (;;)
>                 execl("fifo", "fifo", NULL);
> }
>
> int main()
> {
>         int readend, writeend;
>         int syscallfd;
>         pthread_t t;
>
>         mknod("fifo", 0777|S_IFIFO, 0);
>         readend = open("fifo", O_RDONLY|O_NONBLOCK);
>         writeend = open("fifo", O_WRONLY);
>         syscallfd = open("/proc/self/syscall", O_RDONLY);
>
>         pthread_create(&t, NULL, exec_thread, NULL);
>
>         for (;;) {
>                 char buffer[16];
>                 loff_t off_in = 0;
>                 splice(syscallfd, &off_in, writeend, NULL, 16, 0);
>                 read(readend, buffer, 16);
>         }
> }
>
> I'm not sure what the fix will be. Maybe the proc handlers should take a
> different lock instead of cred_guard_mutex. Or perhaps execve should check that
> the file is a regular file before it attempts to open it.

This cleaner reproducer still generates the lockdep warning (but I can
ctrl-C out of it without leaving behind a zombie), but I see that
syzbot isn't seeing this any more. Why did it stop? (And can we feed a
reproducer into syzbot?)

Was this creating an uninterruptible deadlock before? (Perhaps
something did change here?)

-Kees

--
Kees Cook
Pixel Security