Re: `ls` blocked with SSHFS mount

On Fri, Jul 17, 2020 at 2:52 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Fri, Jul 17, 2020 at 02:39:03PM +0200, Miklos Szeredi wrote:
> > On Fri, Jul 17, 2020 at 10:07 AM Paul Menzel <pmenzel@xxxxxxxxxxxxx> wrote:
> > > [105591.121285] INFO: task ls:21242 blocked for more than 120 seconds.
> > > [105591.121293]       Not tainted 5.7.0-1-amd64 #1 Debian 5.7.6-1
> > > [105591.121295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [105591.121298] ls              D    0 21242    778 0x00004004
> > > [105591.121304] Call Trace:
> > > [105591.121319]  __schedule+0x2da/0x770
> > > [105591.121326]  schedule+0x4a/0xb0
> > > [105591.121339]  request_wait_answer+0x122/0x210 [fuse]
> > > [105591.121349]  ? finish_wait+0x80/0x80
> > > [105591.121357]  fuse_simple_request+0x198/0x290 [fuse]
> > > [105591.121366]  fuse_do_getattr+0xcf/0x2c0 [fuse]
> > > [105591.121376]  vfs_statx+0x96/0xe0
> > >
> > > The `ls` process cannot be killed. The SSHFS issue *Fuse sshfs blocks
> > > standby (Visual Studio Code?)* from 2018 already reported this for Linux
> > > 4.17, and the SSHFS developers asked to report this to the Linux kernel.
> >
> > This is a very old and fundamental issue.   Theoretical solution for
> > killing the stuck process exists, but it's not trivial and since the
> > above mentioned workarounds work well in all cases it's not high
> > priority right now.
>
> What?  All you need to do is return -EINTR from fuse_do_getattr() if
> there's a fatal signal.  What "fundamental issue"?

TL;DR: the fundamental issue is not with getattr, but with ops that
hold locks.  We could make an exception for ops that do not hold
locks, but that would not solve the problem, and as I said, this is
something we can live with.

The fundamental issue is that a task killed while the userspace
filesystem is still performing its operation will release the vfs
lock and allow another op requiring that lock to be sent to the
userspace filesystem.  This may confuse a userspace filesystem that
relies on the locking, and quite possibly result in fs corruption.
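
A toy model may make the race concrete.  This is only a sketch, not
kernel code: the class ToyFuse and its methods (start_op, kill,
server_finish) are names invented for illustration, and a single
flag stands in for the vfs lock.

```python
class ToyFuse:
    """Toy model: one vfs lock plus the requests the userspace
    filesystem is still processing.  All names are hypothetical."""

    def __init__(self):
        self.vfs_lock_owner = None   # task currently holding the vfs lock
        self.in_flight = []          # (task, op) pairs the server has not answered

    def start_op(self, task, op):
        # A locked op can only be sent while nobody else holds the vfs lock.
        if self.vfs_lock_owner is not None:
            return False             # caller would block here
        self.vfs_lock_owner = task
        self.in_flight.append((task, op))
        return True

    def kill(self, task):
        # The killed task drops the vfs lock, but the request it sent
        # is still being processed by the userspace filesystem.
        if self.vfs_lock_owner == task:
            self.vfs_lock_owner = None

    def server_finish(self, task):
        # The userspace filesystem answers the request.
        self.in_flight = [(t, o) for (t, o) in self.in_flight if t != task]
        if self.vfs_lock_owner == task:
            self.vfs_lock_owner = None


def demo_race():
    fs = ToyFuse()
    fs.start_op("A", "rename")
    fs.kill("A")                          # A dies before the server answers
    started = fs.start_op("B", "unlink")  # B gets the lock immediately
    return started, len(fs.in_flight)
```

In this model demo_race() lets B's op start while A's request is
still in flight, so the userspace filesystem sees two ops that the
vfs lock was supposed to serialize.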

To fix this, we need to add shadow locks somewhere that duplicate
the vfs locks but are released only once userspace has finished
processing the request.  The best place to put the shadow locks is
probably in the kernel.
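
As a rough illustration of the shadow lock idea, here is a toy
model, not a proposed implementation.  The names (ShadowLockedFuse,
start_op, kill, server_finish) are made up for this sketch, and one
flag per lock stands in for the real locking.

```python
class ShadowLockedFuse:
    """Toy model: a vfs lock plus a shadow lock that outlives a
    killed task.  All names are hypothetical."""

    def __init__(self):
        self.vfs_lock_owner = None      # released when the task is killed
        self.shadow_lock_owner = None   # released only when the server answers
        self.in_flight = []             # (task, op) pairs not yet answered
        self.max_in_flight = 0          # high-water mark, for checking

    def start_op(self, task, op):
        # A new op must wait for both the vfs lock and its shadow.
        if self.vfs_lock_owner is not None or self.shadow_lock_owner is not None:
            return False                # caller would block here
        self.vfs_lock_owner = task
        self.shadow_lock_owner = task
        self.in_flight.append((task, op))
        self.max_in_flight = max(self.max_in_flight, len(self.in_flight))
        return True

    def kill(self, task):
        # The killed task drops the vfs lock; the shadow lock is
        # deliberately kept until the server is done.
        if self.vfs_lock_owner == task:
            self.vfs_lock_owner = None

    def server_finish(self, task):
        # Only the server's answer releases the shadow lock.
        self.in_flight = [(t, o) for (t, o) in self.in_flight if t != task]
        if self.shadow_lock_owner == task:
            self.shadow_lock_owner = None
        if self.vfs_lock_owner == task:
            self.vfs_lock_owner = None


def demo_shadow():
    fs = ShadowLockedFuse()
    fs.start_op("A", "rename")
    fs.kill("A")                              # A is gone right away
    blocked = not fs.start_op("B", "unlink")  # B still has to wait
    fs.server_finish("A")                     # server answers A's request
    started = fs.start_op("B", "unlink")      # now B may proceed
    return blocked, started, fs.max_in_flight
```

Here the killed task goes away immediately, but B's op is held back
until the userspace filesystem has answered A's request, so the
server never sees more than one op under the same lock.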

Thanks,
Miklos


