Re: possible deadlock in ovl_write_iter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[CC: fsdevel]

On Thu, Sep 27, 2018 at 6:48 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Wed, Sep 26, 2018 at 11:44 PM syzbot
> <syzbot+695726bc473f9c36a4b6@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    a38523185b40 erge tag 'libnvdimm-fixes-4.19-rc6' of git://..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=178f63fa400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=dfb440e26f0a6f6f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=695726bc473f9c36a4b6
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+695726bc473f9c36a4b6@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > Process accounting resumed
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.19.0-rc5+ #32 Not tainted
> > ------------------------------------------------------
> > overlayfs: option "workdir=./file1\" is useless in a non-upper mount, ignore
> > syz-executor1/7265 is trying to acquire lock:
> > 00000000fec87ddb (&ovl_i_mutex_key[depth]){+.+.}, at: inode_lock
> > include/linux/fs.h:738 [inline]
> > 00000000fec87ddb (&ovl_i_mutex_key[depth]){+.+.}, at:
> > ovl_write_iter+0x151/0xb00 fs/overlayfs/file.c:231
> >
> > but task is already holding lock:
> > 00000000998db2f0 (&acct->lock#2){+.+.}, at: acct_get kernel/acct.c:161
> > [inline]
> > 00000000998db2f0 (&acct->lock#2){+.+.}, at: slow_acct_process
> > kernel/acct.c:577 [inline]
> > 00000000998db2f0 (&acct->lock#2){+.+.}, at: acct_process+0x48b/0x875
> > kernel/acct.c:605
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #2 (&acct->lock#2){+.+.}:
> > overlayfs: at least 2 lowerdir are needed while upperdir nonexistent
> >         __mutex_lock_common kernel/locking/mutex.c:925 [inline]
> >         __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072
> > kobject: 'kvm' (00000000bb4f2ec2): kobject_uevent_env
> >         mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
> >         acct_pin_kill+0x26/0x100 kernel/acct.c:173
> >         pin_kill+0x29d/0xab0 fs/fs_pin.c:50
> > kobject: 'kvm' (00000000bb4f2ec2): fill_kobj_path: path
> > = '/devices/virtual/misc/kvm'
> >         acct_on+0x64a/0x8c0 kernel/acct.c:254
> >         __do_sys_acct kernel/acct.c:286 [inline]
> >         __se_sys_acct kernel/acct.c:273 [inline]
> >         __x64_sys_acct+0xc2/0x1f0 kernel/acct.c:273
> >         do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> >         entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > -> #1 (sb_writers#3){.+.+}:
> >         percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36
> > [inline]
> >         percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
> >         __sb_start_write+0x214/0x370 fs/super.c:1387
> >         sb_start_write include/linux/fs.h:1566 [inline]
> >         mnt_want_write+0x3f/0xc0 fs/namespace.c:360
> >         ovl_want_write+0x76/0xa0 fs/overlayfs/util.c:24
> >         ovl_setattr+0x10b/0xaf0 fs/overlayfs/inode.c:30
> >         notify_change+0xbde/0x1110 fs/attr.c:334
> >         do_truncate+0x1bd/0x2d0 fs/open.c:63
> >         handle_truncate fs/namei.c:3008 [inline]
> >         do_last fs/namei.c:3424 [inline]
> >         path_openat+0x3762/0x5160 fs/namei.c:3534
> >         do_filp_open+0x255/0x380 fs/namei.c:3564
> > kobject: 'loop4' (0000000088f052bf): kobject_uevent_env
> >         do_sys_open+0x568/0x700 fs/open.c:1063
> >         ksys_open include/linux/syscalls.h:1276 [inline]
> >         __do_sys_creat fs/open.c:1121 [inline]
> >         __se_sys_creat fs/open.c:1119 [inline]
> >         __x64_sys_creat+0x61/0x80 fs/open.c:1119
> >         do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> > kobject: 'loop4' (0000000088f052bf): fill_kobj_path: path
> > = '/devices/virtual/block/loop4'
> >         entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > -> #0
> > kobject: 'kvm' (00000000bb4f2ec2): kobject_uevent_env
> >   (&ovl_i_mutex_key[depth]){+.+.}:
> >         lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900
> >         down_write+0x8a/0x130 kernel/locking/rwsem.c:70
> >         inode_lock include/linux/fs.h:738 [inline]
> >         ovl_write_iter+0x151/0xb00 fs/overlayfs/file.c:231
> > kobject: 'kvm' (00000000bb4f2ec2): fill_kobj_path: path
> > = '/devices/virtual/misc/kvm'
> >         call_write_iter include/linux/fs.h:1808 [inline]
> >         new_sync_write fs/read_write.c:474 [inline]
> >         __vfs_write+0x6b8/0x9f0 fs/read_write.c:487
> >         __kernel_write+0x10c/0x370 fs/read_write.c:506
> >         do_acct_process+0x1144/0x1660 kernel/acct.c:520
> >         slow_acct_process kernel/acct.c:579 [inline]
> >         acct_process+0x6b1/0x875 kernel/acct.c:605
> >         do_exit+0x1aaf/0x2610 kernel/exit.c:857
> >         do_group_exit+0x177/0x440 kernel/exit.c:970
> >         get_signal+0x8b0/0x1980 kernel/signal.c:2513
> >         do_signal+0x9c/0x21e0 arch/x86/kernel/signal.c:816
> >         exit_to_usermode_loop+0x2e5/0x380 arch/x86/entry/common.c:162
> >         prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
> >         syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
> >         do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
> >         entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > other info that might help us debug this:
> >
> > Chain exists of:
> >    &ovl_i_mutex_key[depth] --> sb_writers#3 --> &acct->lock#2
> >
>
> I am not sure this is the root cause, but there seems to be a locking order
> violation in code that replaces accounting file.
> If new file is on the same fs as old file, acct_pin_kill(old) will skip writing
> the file, because sb_writers is still already taken by acct_on().
> If new file is not on same fs as old file, this ordering violation creates
> an unneeded dependency between new_sb_writers and old_sb_writers,
> which may be later reported as possible deadlock.
>
> This could result in an actual deadlock if accounting file is replaced
> from file in overlayfs over "real fs" to file in "real fs".
> acct_on() takes freeze protection on "real fs" and tries to write to
> overlayfs file. overlayfs is not freeze protected so do_acct_process()
> can carry on with __kernel_write() to overlayfs file, which SHOULD
> take "real fs" freeze protection and deadlock.
> Ironically (or not) it doesn't deadlock because of a bug fixed with
> 898cc19d8af2 ovl: fix freeze protection bypass in ovl_write_iter()
> which is not upstream yet, so wasn't in the kernel that syzbot tested.
>
> Following untested patch should fix the alleged deadlock.
>
> Miklos,
>
> If you feel confident about this patch I can re-post it for you to add
> to next. Otherwise, as I didn't see Al around to comment on the patch,
> I will try to reproduce the deadlock.
>
> Thanks,
> Amir.
> ---
>
> diff --git a/kernel/acct.c b/kernel/acct.c
> index addf7732fb56..c09355a7ae46 100644
> --- a/kernel/acct.c
> +++ b/kernel/acct.c
> @@ -251,8 +251,8 @@ static int acct_on(struct filename *pathname)
>         rcu_read_lock();
>         old = xchg(&ns->bacct, &acct->pin);
>         mutex_unlock(&acct->lock);
> -       pin_kill(old);
>         mnt_drop_write(mnt);
> +       pin_kill(old);
>         mntput(mnt);
>         return 0;
>  }



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux