Re: Still a pretty bad time on 5.4.6 with fuse_request_end.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Michael Stapelberg <michael+lkml@xxxxxxxxxxxxx>

Hey,

I recently ran into this, too. The symptom for me is that processes using the
affected FUSE file system hang indefinitely, sync(2) system calls hang
indefinitely, and even triggering an abort via echo 1 >
/sys/fs/fuse/connections/*/abort does not get the file system unstuck (there is
always 1 request still pending). Only removing power will get the machine
unstuck.

I’m triggering this when building packages for https://distr1.org/, which uses a
FUSE daemon (written in Go using the jacobsa/fuse package) to provide package
contents.

I bisected the issue to commit
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b319d1f6f92a4ced9897678113d176ee16ae85d

With that commit, I run into a kernel oops within ≈1 minute after starting my
batch build. With the commit before, I can batch build for many minutes without
issues.

…and just in case it matters, building linux from HEAD
(f757165705e92db62f85a1ad287e9251d1f2cd82) with that commit reverted results in
a working kernel, too.

Find below a backtrace full from kgdb, with fs/fuse/dev.c compiled with -O0:

(gdb) bt full
#0  0xffff888139a36600 in ?? ()
No symbol table info available.
#1  0xffffffff8137b368 in fuse_request_end (fc=0xffff888139a36600,
req=0xffff8880b7f333b8) at fs/fuse/dev.c:328
        fiq = 0xffff888139a36648
        async = true
#2  0xffffffff8137f488 in fuse_dev_do_write (fud=0xffff888139a36600,
cs=0xffffc9000dd7fa58, nbytes=4294967294) at fs/fuse/dev.c:1911
        err = 0
        fc = 0xffff888139a36600
        fpq = 0xffff8881390e5148
        req = 0xffff8880b7f333b8
        oh = {len = 16, error = -2, unique = 2692038}
#3  0xffffffff8137f569 in fuse_dev_write (iocb=0xffffc9000093be48,
from=0xffffc9000093be20) at fs/fuse/dev.c:1933
        cs = {write = 0, req = 0xffff8880b7f333b8, iter =
0xffffc9000093be20, pipebufs = 0x0 <fixed_percpu_data>, currbuf = 0x0
<fixed_percpu_data>, pipe = 0x0 <fixed_percpu_data>, nr_segs = 0,
          pg = 0x0 <fixed_percpu_data>, len = 0, offset = 24, move_pages = 0}
        fud = 0xffff8881390e5140
#4  0xffffffff811fe4de in call_write_iter (file=<optimized out>,
iter=<optimized out>, kio=<optimized out>) at
./include/linux/fs.h:1902
No locals.
#5  new_sync_write (filp=0xffff888123800800, buf=<optimized out>,
len=<optimized out>, ppos=0xffffc9000093bee8) at fs/read_write.c:483
        iov = {iov_base = 0xc00082a008, iov_len = 16}
        kiocb = {ki_filp = 0xffff888123800800, ki_pos = 0, ki_complete
= 0x0 <fixed_percpu_data>, private = 0x0 <fixed_percpu_data>, ki_flags
= 0, ki_hint = 0, ki_ioprio = 0, ki_cookie = 0}
        iter = {type = 5, iov_offset = 0, count = 0, {iov =
0xffffc9000093be20, kvec = 0xffffc9000093be20, bvec =
0xffffc9000093be20, pipe = 0xffffc9000093be20}, {nr_segs = 0, {head =
0,
              start_head = 0}}}
        ret = <optimized out>
#6  0xffffffff811fe594 in __vfs_write (file=<optimized out>,
p=<optimized out>, count=<optimized out>, pos=<optimized out>) at
fs/read_write.c:496
No locals.
#7  0xffffffff81200fa4 in vfs_write (pos=<optimized out>, count=16,
buf=<optimized out>, file=<optimized out>) at fs/read_write.c:558
        ret = 16
        ret = <optimized out>
#8  vfs_write (file=0xffff888123800800, buf=0xc00082a008 "\020",
count=16, pos=0xffffc9000093bee8) at fs/read_write.c:542
        ret = 16
#9  0xffffffff81201252 in ksys_write (fd=<optimized out>,
buf=0xc00082a008 "\020", count=16) at fs/read_write.c:611
        pos = 0
        ppos = <optimized out>
        f = <optimized out>
        ret = 824642281480
#10 0xffffffff812012e5 in __do_sys_write (count=<optimized out>,
buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:623
No locals.
#11 __se_sys_write (count=<optimized out>, buf=<optimized out>,
fd=<optimized out>) at fs/read_write.c:620
        ret = <optimized out>
        ret = <optimized out>
#12 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:620
No locals.
#13 0xffffffff810027f8 in do_syscall_64 (nr=<optimized out>,
regs=0xffffc9000093bf58) at arch/x86/entry/common.c:294
        ti = <optimized out>
#14 0xffffffff81e0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:175
No locals.
#15 0x0000000000000000 in ?? ()
No symbol table info available.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux