On Fri, Oct 9, 2015 at 7:26 PM, Dzianis Kahanovich <mahatma@xxxxxxxxxxxxxx> wrote:
> Yan, Zheng wrote:
>
>>>> It seems you have 16 mounts. Are you using the kernel client or the
>>>> fuse client, and which versions are they?
>>>
>>> 1) On each of the 4 ceph nodes:
>>>    4.1.8 kernel mount to /mnt/ceph0 (used only for the samba/ctdb lockfile);
>>>    fuse mount to /mnt/ceph1 (partly used);
>>>    samba cluster (ctdb) with vfs_ceph;
>>> 2) On 2 additional out-of-cluster (service) nodes:
>>>    4.1.8 (now 4.2.3) kernel mount;
>>>    4.1.0 both mounts;
>>> 3) 2 VMs:
>>>    kernel mounts (most active: web & mail);
>>>    4.2.3;
>>>
>>> fuse mounts - same version as ceph;
>>
>> Please run "ceph daemon mds.x session ls" to find which client holds the
>> largest number of caps. mds.x is the ID of the active MDS.
>
> 1) This command is no longer valid ;) - it is "cephfs-table-tool x show
> session" now.

You need to run the command on the machine that runs the active MDS. It
shows the in-memory state of each session; cephfs-table-tool only shows
the on-disk session state.

> 2) I have 3 active MDSes now. I tried it, it works, and I am keeping it.
> Restart is still problematic.

Multiple active MDSes are not ready for production.

> 3) Yes, most caps are on the master VM (4.2.3 kernel mount; it is part of a
> web+mail+heartbeat cluster of 2 VMs), under the apache root. That is where
> the previously described CLONE_FS -> CLONE_VFORK deadlocks occurred (they
> no longer do). But 4.2.3 was installed just before the tests; before that
> it was 4.1.8 with similar effects (the logs are from 4.2.3 on the VM
> clients).

I suspect your problem is that some mount has too many open files. During
MDS failover, the MDS needs to reopen these files, which takes a long time.

Yan, Zheng

> Will wait tonight for the MDS restart.
>
> --
> WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
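
For reference, a minimal sketch of the check suggested above (find the
client holding the largest number of caps). It assumes it is run on the
host with the active MDS, that the admin socket returns JSON, and that
each session entry carries "id" and "num_caps" fields; the MDS id "a" is
a placeholder:

#!/usr/bin/env python
# Sketch: ask the active MDS for its sessions and sort them by cap count.
# Assumptions: run on the MDS host, "mds.a" is a placeholder id, and the
# session entries contain "id" and "num_caps" fields.
import json
import subprocess

MDS_ID = "a"  # placeholder: replace with the active MDS id

out = subprocess.check_output(["ceph", "daemon", "mds." + MDS_ID,
                               "session", "ls"])
sessions = json.loads(out.decode("utf-8"))

# Print sessions with the most caps first.
for s in sorted(sessions, key=lambda s: s.get("num_caps", 0), reverse=True):
    print("client.%s  caps=%s  %s" % (s.get("id"), s.get("num_caps"),
                                      s.get("inst", "")))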
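
And a client-side counterpart: the kernel client exposes per-mount cap
counters under debugfs, so a machine with a kernel mount can be checked
without going through the MDS. This assumes debugfs is mounted at
/sys/kernel/debug and that the kernel in use provides a "caps" file per
client instance (the path layout is an assumption; adjust if yours
differs):

#!/usr/bin/env python
# Sketch: dump the caps counters the CephFS kernel client exposes via
# debugfs. Assumes debugfs is mounted at /sys/kernel/debug and that each
# mount has a /sys/kernel/debug/ceph/<fsid>.client<id>/caps file.
import glob

for path in glob.glob("/sys/kernel/debug/ceph/*/caps"):
    print(path)
    with open(path) as f:
        print(f.read())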