Hi,
we have upgraded our cluster to Luminous 12.2.2 and wanted to use a
second MDS for HA purposes. The upgrade itself went well, and setting up
the second MDS from the former standby-replay configuration worked too
(roughly the commands sketched below).
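For reference, this is roughly how the second active MDS was enabled
(the filesystem name "cephfs" stands in for our actual fs name; exact
flags from memory):

# allow more than one active MDS (the allow_multimds flag is still
# present on Luminous)
ceph fs set cephfs allow_multimds true
# raise the number of active ranks; the former standby-replay daemon
# then takes over rank 1
ceph fs set cephfs max_mds 2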
But under load both MDS got stuck and needed to be restarted. It starts
with slow requests:
2017-12-06 20:26:25.756475 7fddc4424700 0 log_channel(cluster) log [WRN] : slow request 122.370227 seconds old, received at 2017-12-06 20:24:23.386136: client_request(client.15057265:2898 getattr pAsLsXsFs #0x100009de0f2 2017-12-06 20:24:23.244096 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
0x100009de0f2 is the inode id of the directory we mount as root on most
clients. Running daemonperf for both MDS shows a rising number of
journal segments, accompanied by the corresponding warnings in the ceph
log (the commands we use to watch this are sketched after the log
excerpts below). We also see other slow requests:
2017-12-06 20:26:25.756488 7fddc4424700 0 log_channel(cluster) log [WRN] : slow request 180.346068 seconds old, received at 2017-12-06 20:23:25.410295: client_request(client.15163105:549847914 getattr pAs #0x100009de0f2/sge-tmp 2017-12-06 20:23:25.406481 caller_uid=1426, caller_gid=1008{}) currently failed to authpin local pins
This is a client accessing a subdirectory of the mount point.
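For completeness, these are roughly the commands we use to watch the
journal segments and the stuck requests (the daemon name is one of our
two MDS; the perf counter section name is from memory):

# live view of the MDS perf counters, including the journal segment count
ceph daemonperf mds.ceph-storage-04
# one-shot dump of the same counters; the segment count is in the
# mds_log section as far as I remember
ceph daemon mds.ceph-storage-04 perf dump
# list the requests that are currently stuck (matches the slow request
# warnings above)
ceph daemon mds.ceph-storage-04 dump_ops_in_flight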
On the client side (various Ubuntu kernels using the kernel-based
CephFS client) this leads to CPU lockups if the problem is not fixed
fast enough. The clients need a hard reboot to recover.
We have mitigated the problem by disabling the second MDS (roughly the
commands sketched after the configuration below). The MDS-related
configuration is:
[mds.ceph-storage-04]
mds_replay_interval = 10
mds_cache_memory_limit = 10737418240
[mds]
mds_beacon_grace = 60
mds_beacon_interval = 4
mds_session_timeout = 120
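For reference, going back to a single active MDS was done roughly like
this (filesystem name again a placeholder; if I remember correctly the
explicit deactivate step is still needed on Luminous):

# drop back to a single active rank
ceph fs set cephfs max_mds 1
# take rank 1 out of the cluster so its daemon returns to standby
ceph mds deactivate cephfs:1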
The data pool is on replicated HDD storage, the metadata pool on
replicated NVMe storage. The MDS are colocated with OSDs (12 HDD OSDs +
2 NVMe OSDs, 128 GB RAM).
The questions are:
- what is the minimum kernel version required on clients for multi-MDS
setups?
- is the problem described above a known problem, e.g. a result of
http://tracker.ceph.com/issues/21975 ?
Regards,
Burkhard Linke