I also see this from 'ceph health detail':
# ceph health detail
HEALTH_WARN 1 filesystem is degraded; 1 MDSs report oversized cache; 1 MDSs behind on trimming
[WRN] FS_DEGRADED: 1 filesystem is degraded
    fs slugfs is degraded
[WRN] MDS_CACHE_OVERSIZED: 1 MDSs report oversized cache
    mds.slugfs.pr-md-01.xdtppo(mds.0): MDS cache is too large (19GB/8GB); 0 inodes in use by clients, 0 stray files
[WRN] MDS_TRIM: 1 MDSs behind on trimming
    mds.slugfs.pr-md-01.xdtppo(mds.0): Behind on trimming (127084/250) max_segments: 250, num_segments: 127084
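For what it's worth, the trimming warning means num_segments (127084) has run far past max_segments (250). A rough sketch of how to double-check what the daemon is actually running with, assuming the standard Reef option names and the daemon name shown above:

# ceph config get mds mds_log_max_segments                              (cluster-wide setting)
# ceph config show mds.slugfs.pr-md-01.xdtppo mds_log_max_segments      (value the running daemon sees)
# ceph config show mds.slugfs.pr-md-01.xdtppo mds_cache_memory_limit    (cache target the 19GB/8GB warning refers to)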
MDS cache too large? The mds process is using 22GB of RAM right now and starting to push my server into swap, so maybe it really is too large....
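If the server has the RAM to spare, one option might be to raise the MDS cache memory limit so replay isn't fighting the 8GB target. A hedged sketch; the 32GiB value below is only an example and should be sized to what the host can actually afford, and note that MDS RSS typically runs above the cache limit, especially during replay:

# ceph config set mds mds_cache_memory_limit 34359738368    (32GiB, example value only)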
On 4/22/24 11:17 AM, Erich Weiler wrote:
Hi All,
We have a somewhat serious situation with a CephFS filesystem (18.2.1) and 2 active MDSs (plus one standby). I tried to restart one of the active daemons to unstick a bunch of blocked requests, and the standby went into 'replay' for a very long time; then RAM on that MDS server filled up, it stayed there for a while, and eventually it appeared to give up and switch to the standby, but the cycle started again. So I restarted that MDS, and now I'm in a situation where I see this:
# ceph fs status
slugfs - 29 clients
======
RANK   STATE                  MDS              ACTIVITY    DNS    INOS   DIRS   CAPS
 0     replay    slugfs.pr-md-01.xdtppo                   3958k   57.1k  12.2k    0
 1     resolve   slugfs.pr-md-02.sbblqq                       0       3      1     0
        POOL           TYPE      USED   AVAIL
  cephfs_metadata     metadata    997G   2948G
cephfs_md_and_data      data         0   87.6T
    cephfs_data         data      773T    175T
     STANDBY MDS
slugfs.pr-md-03.mclckv
MDS version: ceph version 18.2.1 (7fe91d5d5842e04be3b4f514d6dd990c54b29c76) reef (stable)
It just stays there indefinitely. All my clients are hung. I tried
restarting all MDS daemons and they just went back to this state after
coming back up.
Is there any way I can somehow escape this state of indefinite
replay/resolve?
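One way to judge whether replay is actually progressing, rather than truly stuck, is to compare the journal read position against the write position on the replaying daemon. A rough sketch, assuming the admin socket is reachable on the MDS host (e.g. via 'cephadm shell' on a containerized deployment) and the usual mds_log perf counters:

# ceph daemon mds.slugfs.pr-md-01.xdtppo perf dump mds_log | grep -E '"rdpos"|"wrpos"|"expos"'

If rdpos keeps climbing toward wrpos between checks, replay is making progress and may just need more time (and memory); if it stops moving, that points at a genuinely stuck replay.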
Thanks so much! I'm kinda nervous since none of my clients have
filesystem access at the moment...
cheers,
erich