On Mon, 2019-12-02 at 21:12 +0800, norman wrote:
> I found a problem in my cephfs kernel client: a thread hung for days,
> and I checked the stack:
>
> debug-user@CEPH0207:/home/debug-user$ sudo cat /proc/46071/stack
> [<ffffffffc0e8fb60>] ceph_mdsc_do_request+0x180/0x240 [ceph]
> [<ffffffffc0e70e51>] __ceph_do_getattr+0xd1/0x1e0 [ceph]
> [<ffffffffc0e70fcc>] ceph_getattr+0x2c/0x100 [ceph]
> [<ffffffffbc05b943>] vfs_getattr_nosec+0x73/0x90
> [<ffffffffbc05b996>] vfs_getattr+0x36/0x40
> [<ffffffffbc05baae>] vfs_statx+0x8e/0xe0
> [<ffffffffbc05c00d>] SYSC_newstat+0x3d/0x70
> [<ffffffffbc05c7ae>] SyS_newstat+0xe/0x10
> [<ffffffffbc8001a1>] entry_SYSCALL_64_fastpath+0x24/0xab
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> and I found that the session has lost its connection:
>
> debug-user@CEPH0207:/home/debug-user$ sudo cat /sys/kernel/debug/ceph/64803197-c207-4012-b8f3-18825d34196c.client15099020/mds_sessions
> global_id 15099020
> name "text"
> mds.0 reconnecting
>
> I guessed the client had been blacklisted, but it has not. Can someone
> give me some ideas about how to solve the problem, or is it a known bug?
> Thanks.
>
> The environment info:
>
> OS: Ubuntu
>
> kernel: linux-image-4.13.0-36-generic
>
> ceph version: luminous

v4.13-based kernels are pretty old at this point, so I'd not spend a lot
of time troubleshooting if you have the ability to update to something
(much) newer.

I know there have been some bugs fixed within the last year or two that
exhibited symptoms like that, but I don't see the specific commits right
offhand.

Cheers,
--
Jeff Layton <jlayton@xxxxxxxxxx>