Hi,

I had a problem with a CephFS freeze on a client: it was impossible to get the mountpoint back. A simple "ls /mnt" command blocked completely (and of course umount/remount was impossible too), so I had to reboot the host. But even a "normal" reboot didn't work, the host wouldn't stop, and I had to do a hard reboot. In short, it behaved like a big "NFS" freeze. ;)

On the client side there was nothing relevant in the logs, and on the cluster side just this line:

~# cat /var/log/ceph/ceph-mds.1.log
[...]
2015-05-14 17:07:17.259866 7f3b5cffc700 0 log_channel(cluster) log [INF] : closing stale session client.1342358 192.168.21.207:0/519924348 after 301.329013
[...]

And indeed, the freeze was probably triggered by a short network interruption.

Here is my configuration:

- OS: Ubuntu 14.04 on the client and on the cluster nodes.
- Kernel: 3.16.0-36-generic on the client and on the cluster nodes (apt-get install linux-image-generic-lts-utopic).
- Ceph version: Hammer on the client and on the cluster nodes (0.94.1-1trusty).

On the client, I use the cephfs kernel module (not ceph-fuse). Here is the fstab line on the client node:

10.0.2.150,10.0.2.151,10.0.2.152:/ /mnt ceph noatime,noacl,name=cephfs,secretfile=/etc/ceph/secret,_netdev 0 0

My only mds-related setting in ceph.conf is:

mds cache size = 1000000

That's all. Here are my questions:

1. Is this kind of freeze normal? Could I avoid these freezes with a more recent kernel on the client?
2. Could I avoid these freezes with ceph-fuse instead of the kernel cephfs module? In that case, though, cephfs performance would be worse. Am I wrong?
3. Is there a parameter in ceph.conf that tells the mds to be more patient before closing a "stale session" of a client?

I'm in a testing period, and hard reboots of my cephfs clients would be quite annoying.

Thanks in advance for your help.
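Regarding question 3, a sketch of the settings that appear to be involved, assuming the relevant knobs are "mds session timeout" and "mds session autoclose" (the "after 301.329013" in the log above matches the default autoclose of 300 seconds); the values below are illustrative, not tested recommendations:

```
[mds]
# Hypothetical tuning sketch, not a verified fix.
# Seconds without contact before the MDS marks a client session stale
# (default 60):
mds session timeout = 120
# Seconds before a stale/unresponsive session is closed
# (default 300, which matches the ~301 s in the MDS log above):
mds session autoclose = 600
```

The current values can be read from a running MDS with something like "ceph daemon mds.1 config get mds_session_autoclose" on the node hosting the daemon.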
--
François Lafont
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com