Re: Cephfs mount not recovering after icmp-not-reachable

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,


CephFS clients are blacklisted if they do not react to heartbeat packets. The MDS will deny the reconnect:


[ 1815.029831] ceph: mds0 closed our session
[ 1815.029833] ceph: mds0 reconnect start
[ 1815.052219] ceph: mds0 reconnect denied
[ 1815.052229] ceph:  dropping dirty Fw state for ffff9d9085da1340 1099512175611
[ 1815.052231] ceph:  dropping dirty+flushing Fw state for ffff9d9085da1340 1099512175611
[ 1815.273008] libceph: mds0 10.99.10.4:6801 socket closed (con state NEGOTIATING)
[ 1816.033241] ceph: mds0 rejected session
[ 1829.018643] ceph: mds0 hung
[ 1880.088504] ceph: mds0 came back
[ 1880.088662] ceph: mds0 caps renewed
[ 1880.094018] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm
[ 1881.100367] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm
[ 2046.768969] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
[ 2061.731126] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm


This will render the mount useless until a complete remount is happening. You can verify this by printing the osd block list after the mount point is not usable anymore using the 'ceph osd blocklist ls' command.


The intention of this behavior is the handling of rogue / faulty clients. If your client currently hold the caps for important directories and the machine has a hardware error (and won't come back soon), the access to the directories would be blocked. Other clients won't be able to access them until the broken machine comes back. Network outage is another example.

You can configure the mds session timeout that triggers blacklisting. But keep in mind that simply using a longer timeout may lead to other problems in case of real errors.


Regards,

Burkhard

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Ceph Dev]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux