On Wed, Feb 3, 2016 at 2:35 PM, Zhi Zhang <zhang.david2011@xxxxxxxxx> wrote:
> Hi Yan,
>
> ceph-fuse version: ceph-fuse-0.94.1 with some backports
>
> fuse version:
> fuse-libs-2.9.2-5.el7.x86_64
> fuse-2.9.2-5.el7.x86_64
>
> kernel version: 3.10.90-1
>
> Please see the gdb backtrace file attached.

This looks like an MDS issue. Please run "ceph mds tell 0 dumpcache",
then send /cachedump.xxx.mds0 and the ceph-fuse backtrace to us.

Regards
Yan, Zheng

>
>
> Regards,
> Zhi Zhang (David)
> Contact: zhang.david2011@xxxxxxxxx
> zhangz.david@xxxxxxxxxxx
>
>
> On Wed, Feb 3, 2016 at 2:07 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> Which version kernel/ceph-fuse are you using? Besides, please use gdb
>> to get backtrace of all threads in ceph-fuse and send the backtrace to
>> us.
>>
>> Regards
>> Yan, Zheng
>>
>> On Wed, Feb 3, 2016 at 12:54 PM, Zhi Zhang <zhang.david2011@xxxxxxxxx> wrote:
>>> Hi ceph-devel,
>>>
>>> I know this mail might be better to send to fuse-devel, but I haven't
>>> been approved by fuse-devel yet since I subscribed it.
>>>
>>> Occasionally we met ceph-fuse hanging issue when mapping ceph-fuse
>>> directory into docker container using aufs.
>>>
>>> Here is the dmesg:
>>>
>>> [809401.613923] aufs au_opts_verify:1602:docker[36838]: dirperm1
>>> breaks the protection by the permission bits on the lower branch
>>> [825359.968412] aufs au_opts_verify:1602:docker[32013]: dirperm1
>>> breaks the protection by the permission bits on the lower branch
>>> [825359.970719] aufs au_opts_verify:1602:docker[32013]: dirperm1
>>> breaks the protection by the permission bits on the lower branch
>>> [825359.973689] aufs au_opts_verify:1602:docker[44954]: dirperm1
>>> breaks the protection by the permission bits on the lower branch
>>> [836447.630952] INFO: task df:30614 blocked for more than 120 seconds.
>>> [836447.630955] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [836447.630957] df D ffff88203f851e00 0 30614 1 0x00000004
>>> [836447.630960] ffff880a27b6fd08 0000000000000002 ffff880a27b6ffd8
>>> 0000000000011e00
>>> [836447.630964] ffff880a27b6ffd8 0000000000011e00 ffff880fe93f4b60
>>> ffff880009f98000
>>> [836447.630967] ffff881074050320 ffff881fd6333000 ffff880a27b6fd30
>>> ffff881074050400
>>> [836447.630970] Call Trace:
>>> [836447.630977] [<ffffffff81acb689>] schedule+0x29/0x70
>>> [836447.630983] [<ffffffff812be96d>] __fuse_request_send+0xdd/0x290
>>> [836447.630987] [<ffffffff81066150>] ? wake_up_bit+0x30/0x30
>>> [836447.630989] [<ffffffff812beb32>] fuse_request_send+0x12/0x20
>>> [836447.630992] [<ffffffff812c3859>] fuse_do_getattr+0x109/0x2a0
>>> [836447.630995] [<ffffffff812c4cd5>] fuse_update_attributes+0x75/0x80
>>> [836447.630997] [<ffffffff812c4d23>] fuse_getattr+0x43/0x50
>>> [836447.631001] [<ffffffff81168249>] vfs_getattr+0x29/0x40
>>> [836447.631002] [<ffffffff811683b2>] vfs_fstatat+0x62/0xa0
>>> [836447.631004] [<ffffffff8116859f>] SYSC_newstat+0x1f/0x40
>>> [836447.631009] [<ffffffff8100e4f8>] ? syscall_trace_enter+0x18/0x210
>>> [836447.631012] [<ffffffff81ad50bc>] ? tracesys+0x7e/0xe2
>>> [836447.631014] [<ffffffff811689ee>] SyS_newstat+0xe/0x10
>>> [836447.631016] [<ffffffff81ad511b>] tracesys+0xdd/0xe2
>>>
>>>
>>> From above msg, it seems to be hung at __fuse_request_send, which will
>>> queue request and wait for reply from ceph-fuse. When this happens,
>>> 'ls' ceph-fuse directory or 'df' outside docker container will also
>>> get hang.
>>>
>>> I generated ceph-fuse's core dump and found client_lock was not held
>>> by any thread. So I wonder if something wrong with ceph-fuse that
>>> can't get the request from the fuse's queue? Or something else related
>>> to fuse itself?
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Zhi Zhang (David)
>>> Contact: zhang.david2011@xxxxxxxxx
>>> zhangz.david@xxxxxxxxxxx
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
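
For anyone comparing several of these hung-task reports, the call trace in
the dmesg output quoted in this thread can be reduced to its function names
with a short script. The following is only a rough sketch in Python, not
part of ceph or fuse: the names FRAME_RE, parse_frames and
first_frame_containing are made up for illustration, and the regular
expression assumes the frame layout shown in the trace
("[<address>] symbol+0xoff/0xsize", with an optional "? " marking a frame
the kernel considers unreliable). Other kernel versions may print frames
differently.

```python
import re

# Assumed frame layout, taken from the trace quoted in this thread:
#   [836447.630983] [<ffffffff812be96d>] __fuse_request_send+0xdd/0x290
# A "? " before the symbol marks a frame the kernel flags as unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(dmesg_text):
    """Return (symbol, reliable) tuples in the order they appear."""
    frames = []
    for match in FRAME_RE.finditer(dmesg_text):
        reliable = match.group("unreliable") is None
        frames.append((match.group("symbol"), reliable))
    return frames


def first_frame_containing(frames, needle):
    """Return the first reliable symbol containing needle, or None."""
    for symbol, reliable in frames:
        if reliable and needle in symbol:
            return symbol
    return None


# A few frames copied from the report; mail quote prefixes do not matter
# because the pattern is searched anywhere in the text.
SAMPLE = """\
[836447.630977] [<ffffffff81acb689>] schedule+0x29/0x70
[836447.630983] [<ffffffff812be96d>] __fuse_request_send+0xdd/0x290
[836447.630987] [<ffffffff81066150>] ? wake_up_bit+0x30/0x30
[836447.630989] [<ffffffff812beb32>] fuse_request_send+0x12/0x20
[836447.630992] [<ffffffff812c3859>] fuse_do_getattr+0x109/0x2a0
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    for symbol, reliable in frames:
        print(("  " if reliable else "? ") + symbol)
    print("innermost fuse frame:", first_frame_containing(frames, "fuse"))
```

On the sample frames this picks out __fuse_request_send as the innermost
fuse frame, which matches the reading given in the original report. It only
summarizes the kernel side; it says nothing about what ceph-fuse or the MDS
were doing at the time.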