Dear all,
I have a full-flash NVMe Ceph cluster (16.2.6) with only the CephFS service configured at the moment.
55 nodes, 2 partitions per NVMe. I increased the MDS cache memory limit to 128 GB (256 GB of RAM per admin node). It's a hyperconverged K8s cluster and the OSDs run on the K8s worker nodes, so I set "osd memory target" to 16 GB.
I regularly get the warnings below; the CephFS clients get blocked and I have to restart the MDS service to fix it.
X slow requests, 0 included below; oldest blocked for > 33860.402867 secs
mds.icadmin006(mds.1): X slow requests are blocked > 30 secs
X clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
Health check update: X MDSs report slow requests (MDS_SLOW_REQUEST)
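For what it's worth, this is the kind of thing I look at on the MDS host before restarting the daemon (standard admin socket commands; the daemon and unit names may differ on your side):
> $ ceph health detail
> $ ceph daemon mds.icadmin006 dump_ops_in_flight
> $ ceph daemon mds.icadmin006 session ls
> $ sudo systemctl restart ceph-mds@icadmin006   # last resort, as described above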
What can I do to avoid this behaviour?
Some useful information:
Ceph was deployed with ceph-ansible on Ubuntu 20.04 with kernel 5.4.0-90-generic.
> $ ceph -s
  cluster:
    id:     cc402f2e-2444-473e-adab-fe7b38d08546
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum icadmin006,icadmin007,icadmin008 (age 8w)
    mgr: icadmin008(active, since 2w), standbys: icadmin007, icadmin006
    mds: 2/2 daemons up, 1 standby
    osd: 110 osds: 110 up (since 20h), 110 in (since 7d)

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 4225 pgs
    objects: 31.84M objects, 24 TiB
    usage:   71 TiB used, 269 TiB / 340 TiB avail
    pgs:     4225 active+clean
> $ ceph osd pool ls detail | grep cephfs
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 4096 pgp_num 4096 autoscale_mode on last_change 17466 lfor 0/0/213 flags hashpspool stripe_width 0 target_size_ratio 0.2 application cephfs
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode off last_change 17555 lfor 0/17519/17525 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
> $ ceph fs status
cephfs - 76 clients
======
RANK  STATE    MDS         ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active   icadmin008  Reqs: 58 /s  261k   259k   5213   90.0k
 1    active   icadmin006  Reqs: 22 /s  176k   170k   30.2k  77.7k
      POOL         TYPE      USED   AVAIL
cephfs_metadata   metadata   45.8G  82.9T
  cephfs_data       data     71.1T  82.9T
STANDBY MDS
 icadmin007
MDS version: ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)
Thanks for your help,
Best regards,
--
Yoann Moulin
EPFL IC-IT