On Mon, Jul 23, 2018 at 2:22 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Mon, Jul 23, 2018 at 2:12 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>> On Mon, Jul 23, 2018 at 10:11 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>>> <syzbot+bb6d800770577a083f8c@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following crash on:
>>>>
>>>> HEAD commit: d72e90f33aa4 Linux 4.18-rc6
>>>> git tree: upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>>> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>>
>>>
>>> Hi fuse maintainers,
>>>
>>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>>> understand this is mostly working-as-intended (parts about deadlocks
>>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>>> this is aborting connections via fusectl, right?
>>
>> Yes. Alternative is with "umount -f".
>>
>>> The doc says "Under
>>> the fuse control filesystem each connection has a directory named by a
>>> unique number". The question is: if I start a process and this process
>>> can mount fuse, how do I kill it? I mean: totally and certainly get
>>> rid of it right away? How do I find these unique numbers for the
>>> mounts it created?
>>
>> It is the device number found in st_dev for the mount. Other than
>> doing stat(2) it is possible to find out the device number by reading
>> /proc/$PID/mountinfo (third field).
>
> Thanks. I will try to figure out fusectl connection numbers and see if
> it's possible to integrate aborting into syzkaller.
>
>>> Taking into account that there is usually no
>>> operator attached to each server, I wonder if kernel could somehow
>>> auto-abort fuse on kill?
>>
>> Depends on what the fuse server is sleeping on. If it's trying to
>> acquire an inode lock (e.g. unlink(2)), which is classical way to
>> deadlock a fuse filesystem, then it will go into an uninterruptible
>> sleep. There's no way in which that process can be killed except to
>> force a release of the offending lock, which can only be done by
>> aborting the request that is being performed while holding that lock.
>
> I understand that it is not killed today, but I am asking if we can
> make it killable. It's all code that we can change, and if a human
> operator can do it, it can be done pure programmatically on kill too,
> right?

Hmm, you mean if a process is in an uninterruptible sleep trying to
acquire a lock on a fuse filesystem and is killed, then the fuse
filesystem should be aborted?

Even if we'd manage to implement that, it's a large backward
incompatibility risk.

I don't argue that it can be done, but I would definitely argue *if*
it should be done.

Thanks,
Miklos
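
For reference, the lookup discussed above (third field of
/proc/$PID/mountinfo, then abort via fusectl) could be sketched roughly
as below. This is only an illustration, not what syzkaller does. It
assumes fusectl is mounted at /sys/fs/fuse/connections and that the
connection directory is named after the minor part of the device
number; the thread itself only says "the device number found in
st_dev". The helper names (fuse_mounts, abort_path, abort_connection)
are made up for this example.

```python
import os

# Assumption: the usual fusectl mount point; adjust if it is mounted elsewhere.
FUSECTL_ROOT = "/sys/fs/fuse/connections"


def fuse_mounts(mountinfo_text):
    """Return (mount_point, major, minor) for each FUSE mount listed in
    the given /proc/<pid>/mountinfo contents."""
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Six fixed fields, then optional fields, then a "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # malformed or empty line
        if sep + 1 >= len(fields):
            continue
        fstype = fields[sep + 1]
        # Match "fuse" and "fuse.<subtype>"; this skips "fusectl" itself.
        if fstype != "fuse" and not fstype.startswith("fuse."):
            continue
        major, minor = fields[2].split(":")  # third field is major:minor
        found.append((fields[4], int(major), int(minor)))
    return found


def abort_path(minor):
    """Path of the hypothetical abort file for a connection number."""
    return os.path.join(FUSECTL_ROOT, str(minor), "abort")


def abort_connection(minor):
    """Abort one connection; needs permission to write the abort file."""
    with open(abort_path(minor), "w") as f:
        f.write("1")  # the written value should not matter
```

Typical use would be to read /proc/<pid>/mountinfo for the process in
question, pass the text to fuse_mounts(), and call abort_connection()
on each returned minor number. Mount points containing spaces appear
octal-escaped in mountinfo and are returned undecoded here.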