Re: help with a fuse-overlayfs hang please

Thanks for your mail, Dmitry.

https://syzkaller.appspot.com/bug?id=7c27d8aa6c0f824004399b6b14776c9c7d8dc34d
looks similar, though the "mutex_lock_nested" call isn't in the call
trace I reported. Do you think it makes sense to report this one as a
new issue and open a bug?

Regards,
Nikhil.

On Wed, 27 Jul 2022 at 11:05, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Wed, 27 Jul 2022 at 07:20, Nikhil Kshirsagar <nkshirsagar@xxxxxxxxx> wrote:
> >
> > Hello Miklos and Dmitry!
> >
> > I'm trying to debug a fuse-overlayfs hang in the Ubuntu kernel, with versions,
> >
> > fuse_overlayfs: 1.7.1-1 (universe)
> > kernel: 5.15.0-40-generic (server)
> >
> > This happens when fuse-overlayfs
> > (https://github.com/containers/fuse-overlayfs) is stacked on top of
> > squashfuse (https://github.com/vasi/squashfuse) to allow users to
> > quickly start a container from a squashfs file without any privileges.
> >
> > The hang looks like this:
> >
> > Jul 26 17:46:31  kernel: INFO: task fuse-overlayfs:326111 blocked for more than 120 seconds.
> > Jul 26 17:46:31  kernel: Tainted: P OE 5.15.0-40-generic #43-Ubuntu
> > Jul 26 17:46:31  kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 26 17:46:31  kernel: task:fuse-overlayfs state:D stack: 0 pid:326111 ppid:326103 flags:0x00000002
> > Jul 26 17:46:31  kernel: Call Trace:
> > Jul 26 17:46:31  kernel: <TASK>
> > Jul 26 17:46:31  kernel: __schedule+0x23d/0x590
> > Jul 26 17:46:31  kernel: ? update_load_avg+0x82/0x620
> > Jul 26 17:46:31  kernel: schedule+0x4e/0xb0
> > Jul 26 17:46:31  kernel: schedule_preempt_disabled+0xe/0x10
> > Jul 26 17:46:31  kernel: __mutex_lock.constprop.0+0x263/0x490
> > Jul 26 17:46:31  kernel: ? kmem_cache_alloc+0x1ab/0x2e0
> > Jul 26 17:46:31  kernel: __mutex_lock_slowpath+0x13/0x20
> > Jul 26 17:46:31  kernel: mutex_lock+0x34/0x40
> > Jul 26 17:46:31  kernel: fuse_lock_inode+0x2f/0x40
> > Jul 26 17:46:31  kernel: fuse_lookup+0x48/0x1b0
> > Jul 26 17:46:31  kernel: ? d_alloc_parallel+0x235/0x4b0
> > Jul 26 17:46:31  kernel: ? __legitimize_path+0x2d/0x60
> > Jul 26 17:46:31  kernel: __lookup_slow+0x81/0x150
> > Jul 26 17:46:31  kernel: walk_component+0x141/0x1b0
> > Jul 26 17:46:31  kernel: link_path_walk.part.0.constprop.0+0x23b/0x360
> > Jul 26 17:46:31  kernel: ? path_init+0x2bc/0x3e0
> > Jul 26 17:46:31  kernel: path_lookupat+0x3e/0x1b0
> > Jul 26 17:46:31  kernel: filename_lookup+0xcf/0x1d0
> > Jul 26 17:46:31  kernel: ? __check_object_size+0x19/0x20
> > Jul 26 17:46:31  kernel: ? strncpy_from_user+0x44/0x140
> > Jul 26 17:46:31  kernel: ? getname_flags.part.0+0x4c/0x1b0
> > Jul 26 17:46:31  kernel: user_path_at_empty+0x3f/0x60
> > Jul 26 17:46:31  kernel: path_getxattr+0x4a/0xb0
> > Jul 26 17:46:31  kernel: ? __secure_computing+0xa5/0x110
> > Jul 26 17:46:31  kernel: __x64_sys_lgetxattr+0x21/0x30
> > Jul 26 17:46:31  kernel: do_syscall_64+0x59/0xc0
> > Jul 26 17:46:31  kernel: ? do_syscall_64+0x69/0xc0
> > Jul 26 17:46:31  kernel: ? do_syscall_64+0x69/0xc0
> > Jul 26 17:46:31  kernel: ? irqentry_exit+0x19/0x30
> > Jul 26 17:46:31  kernel: ? exc_page_fault+0x89/0x160
> > Jul 26 17:46:31  kernel: ? asm_exc_page_fault+0x8/0x30
> > Jul 26 17:46:31  kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
> > Jul 26 17:46:31  kernel: RIP: 0033:0x7ffff7e6d2ae
> > Jul 26 17:46:31  kernel: RSP: 002b:00007fffffff7528 EFLAGS: 00000202 ORIG_RAX: 00000000000000c0
> > Jul 26 17:46:31  kernel: RAX: ffffffffffffffda RBX: 000055555556d6f0 RCX: 00007ffff7e6d2ae
> > Jul 26 17:46:31  kernel: RDX: 00007fffffff8570 RSI: 0000555555566190 RDI: 00007fffffff7530
> > Jul 26 17:46:31  kernel: RBP: 0000555555566190 R08: 0000000000000010 R09: 0000555555579cf0
> > Jul 26 17:46:31  kernel: R10: 0000000000000010 R11: 0000000000000202 R12: 00007fffffff8570
> > Jul 26 17:46:31  kernel: R13: 0000000000000010 R14: 00007fffffff7530 R15: 0000000000000000
> > Jul 26 17:46:31  kernel: </TASK>
> >
> > It seems to me that &get_fuse_inode(inode)->mutex cannot be acquired
> > here, in fuse_lock_inode():
> >
> > bool fuse_lock_inode(struct inode *inode)
> > {
> >         bool locked = false;
> >
> >         if (!get_fuse_conn(inode)->parallel_dirops) {
> >                 mutex_lock(&get_fuse_inode(inode)->mutex);
> >                 locked = true;
> >         }
> >
> >         return locked;
> > }
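> >
> > For context (this is my reading of fs/fuse/inode.c in a 5.15-era tree,
> > so the exact code may differ): parallel_dirops is only enabled when the
> > userspace server negotiates FUSE_PARALLEL_DIROPS during FUSE_INIT;
> > otherwise directory operations on the same inode serialize on this
> > per-inode mutex, which is dropped again by the matching unlock helper,
> > roughly:
> >
> > void fuse_unlock_inode(struct inode *inode, bool locked)
> > {
> >         /* Drop the per-inode mutex only if fuse_lock_inode() took it. */
> >         if (locked)
> >                 mutex_unlock(&get_fuse_inode(inode)->mutex);
> > }
> >
> > So my reading is that the hung task is waiting behind some other
> > directory operation on the same inode that has not yet received its
> > reply from the FUSE daemon.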
> >
> > Could you please help me understand whether this is a known/reported
> > issue, and whether there is an existing fix/patch for it?
> >
> > Regards,
> > Nikhil.
>
> +linux-fsdevel, syzkaller
>
> Hi Nikhil,
>
> Re known bugs: we have 5 open bugs that mention "fuse" in the title,
> including some task hangs with reproducers:
> https://syzkaller.appspot.com/upstream
> These may be the easiest to check first.
>
> There were also some fixed task hangs in fuse:
> https://syzkaller.appspot.com/upstream/fixed
> But those look old enough that the fixes are probably already in your kernel.


