Hi,
CephFS clients are blacklisted if they do not respond to heartbeat messages in time. The MDS will then deny the reconnect:
[ 1815.029831] ceph: mds0 closed our session
[ 1815.029833] ceph: mds0 reconnect start
[ 1815.052219] ceph: mds0 reconnect denied
[ 1815.052229] ceph: dropping dirty Fw state for ffff9d9085da1340 1099512175611
[ 1815.052231] ceph: dropping dirty+flushing Fw state for ffff9d9085da1340 1099512175611
[ 1815.273008] libceph: mds0 10.99.10.4:6801 socket closed (con state NEGOTIATING)
[ 1816.033241] ceph: mds0 rejected session
[ 1829.018643] ceph: mds0 hung
[ 1880.088504] ceph: mds0 came back
[ 1880.088662] ceph: mds0 caps renewed
[ 1880.094018] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm
[ 1881.100367] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm
[ 2046.768969] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
[ 2061.731126] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm
This renders the mount useless until a complete remount is done. You can verify this by printing the OSD blocklist with the 'ceph osd blocklist ls' command once the mount point has become unusable.
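For reference, a rough recovery sequence could look like the one below. This is only a sketch: the 'blocklist' spelling applies to recent releases (older ones use 'blacklist'), and the client address, mount point and mount options are placeholders for your own setup.

  # list blocklisted client addresses
  $ ceph osd blocklist ls

  # optionally remove an entry before the blocklist period expires
  $ ceph osd blocklist rm <client_addr:port/nonce>

  # on the affected client a full remount is still required
  $ umount -f /mnt/cephfs
  $ mount -t ceph <mon_host>:6789:/ /mnt/cephfs -o name=admin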
The intention of this behavior is to handle rogue or faulty clients. If your client currently holds the caps for important directories and the machine suffers a hardware error (and won't come back soon), access to those directories would otherwise stay blocked: other clients would not be able to use them until the broken machine came back. A network outage is another example.
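For completeness: the same eviction can also be triggered manually, which is useful when a client machine is known to be dead and you don't want to wait for the timeout. A sketch, with the MDS rank and client id as placeholders:

  # list current client sessions (including the number of caps they hold)
  $ ceph tell mds.0 client ls

  # evict (and thereby blocklist) a specific client
  $ ceph tell mds.0 client evict id=<client_id>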
You can configure the MDS session timeout that triggers the blacklisting. But keep in mind that simply using a longer timeout may lead to other problems (e.g. longer stalls for the remaining clients) in case of real errors.
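As a sketch (the file system name 'cephfs' and the 120 second value are only examples; on older releases the timeout may have to be adjusted in the MDS configuration instead):

  # show the current session timeout of the file system
  $ ceph fs get cephfs | grep session_timeout

  # raise it, e.g. to 120 seconds
  $ ceph fs set cephfs session_timeout 120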
Regards,
Burkhard