CephFS: clients hanging on write with ceph-fuse

We've been running into a strange problem with CephFS when using ceph-fuse. All the back-end nodes are on 10.2.10; the fuse clients are on 10.2.7.

After some hours of running, some processes get stuck waiting on fuse, for example:

[root@worker1144 ~]# cat /proc/58193/stack
[<ffffffffa08cd241>] wait_answer_interruptible+0x91/0xe0 [fuse]
[<ffffffffa08cd653>] __fuse_request_send+0x253/0x2c0 [fuse]
[<ffffffffa08cd6d2>] fuse_request_send+0x12/0x20 [fuse]
[<ffffffffa08d69d6>] fuse_send_write+0xd6/0x110 [fuse]
[<ffffffffa08d84d5>] fuse_perform_write+0x2f5/0x5a0 [fuse]
[<ffffffffa08d8a21>] fuse_file_aio_write+0x2a1/0x340 [fuse]
[<ffffffff811fdfbd>] do_sync_write+0x8d/0xd0
[<ffffffff811fe82d>] vfs_write+0xbd/0x1e0
[<ffffffff811ff34f>] SyS_write+0x7f/0xe0
[<ffffffff816975c9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
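
For anyone wanting to check whether other processes on a client are in the same state, here is a rough sketch of how a stack dump like the one above could be classified. The helper names (parse_stack, is_stuck_in_fuse_write) are made up for illustration, and the "innermost frame is a fuse wait_answer* call with fuse_send_write further down" rule is only an assumption based on this one trace, not a general definition of a hung write.

```python
import re

# One frame of /proc/<pid>/stack looks like:
#   [<ffffffffa08cd241>] wait_answer_interruptible+0x91/0xe0 [fuse]
# The trailing "[module]" is absent for symbols built into the kernel.
FRAME_RE = re.compile(
    r"^\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

SAMPLE = """\
[<ffffffffa08cd241>] wait_answer_interruptible+0x91/0xe0 [fuse]
[<ffffffffa08cd653>] __fuse_request_send+0x253/0x2c0 [fuse]
[<ffffffffa08cd6d2>] fuse_request_send+0x12/0x20 [fuse]
[<ffffffffa08d69d6>] fuse_send_write+0xd6/0x110 [fuse]
[<ffffffffa08d84d5>] fuse_perform_write+0x2f5/0x5a0 [fuse]
[<ffffffffa08d8a21>] fuse_file_aio_write+0x2a1/0x340 [fuse]
[<ffffffff811fdfbd>] do_sync_write+0x8d/0xd0
[<ffffffff811fe82d>] vfs_write+0xbd/0x1e0
[<ffffffff811ff34f>] SyS_write+0x7f/0xe0
[<ffffffff816975c9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
"""


def parse_stack(text):
    """Return (symbol, module) tuples, innermost frame first.

    module is None for frames without a trailing [module] tag.
    Lines that are not symbol+offset frames are skipped.
    """
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames


def is_stuck_in_fuse_write(text):
    """Heuristic (an assumption, see above): the innermost frame is a fuse
    wait_answer* call and fuse_send_write appears somewhere in the stack."""
    frames = parse_stack(text)
    if not frames:
        return False
    top_symbol, top_module = frames[0]
    waiting = top_module == "fuse" and top_symbol.startswith("wait_answer")
    writing = any(symbol == "fuse_send_write" for symbol, _ in frames)
    return waiting and writing


if __name__ == "__main__":
    print(is_stuck_in_fuse_write(SAMPLE))
```

To use this on a live client, one would read /proc/<pid>/stack for each PID of interest (as root) and pass the contents to is_stuck_in_fuse_write. A single match only shows that the process was waiting at that instant, so it is worth sampling more than once before calling it hung.
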

The cluster is healthy (all OSDs up, no slow requests, etc.).  More details of my investigation efforts are in the bug report I just submitted:
    http://tracker.ceph.com/issues/22008

It looks like the fuse client requests some caps from the MDS but never registers receiving them, so the thread waiting for those caps on behalf of the writing process never wakes up.  Restarting the MDS fixes the problem (since ceph-fuse then re-negotiates caps).

Any ideas/suggestions?

Andras
