Hi Dietmar,

On Wed, Jun 19, 2024 at 3:44 AM Dietmar Rieder <dietmar.rieder@xxxxxxxxxxx> wrote:
>
> Hello cephers,
>
> we have a degraded filesystem on our ceph 18.2.2 cluster and I'd need to
> get it up again.
>
> We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby).
>
> It started this night; I got the first HEALTH_WARN emails saying:
>
> HEALTH_WARN
>
> --- New ---
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> === Full health status ===
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> then it went on with:
>
> HEALTH_WARN
>
> --- New ---
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
>
> --- Cleared ---
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> === Full health status ===
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
>
> Then one after another MDS was going into error state:
>
> HEALTH_WARN
>
> --- Updated ---
> [WARN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in error state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
>
> === Full health status ===
> [WARN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in error state
>     daemon
>     mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [WARN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more
>
> In the morning I then tried to restart the MDS in error state, but they
> kept failing. I then reduced the number of active MDS to 1:
>
> ceph fs set cephfs max_mds 1

This will not have any positive effect.

> And set the filesystem down:
>
> ceph fs set cephfs down true
>
> I tried to restart the MDS again but now I'm stuck at the following status:

Setting the file system "down" won't do anything here either. What were
you trying to accomplish? Restarting the MDS may only add to your
problems.

> [root@ceph01-b ~]# ceph -s
>   cluster:
>     id:     aae23c5c-a98b-11ee-b44d-00620b05cac4
>     health: HEALTH_WARN
>             4 failed cephadm daemon(s)
>             1 filesystem is degraded
>             insufficient standby MDS daemons available
>
>   services:
>     mon: 3 daemons, quorum cephmon-01,cephmon-03,cephmon-02 (age 2w)
>     mgr: cephmon-01.dsxcho(active, since 11w), standbys: cephmon-02.nssigg, cephmon-03.rgefle
>     mds: 3/3 daemons up
>     osd: 336 osds: 336 up (since 11w), 336 in (since 3M)
>
>   data:
>     volumes: 0/1 healthy, 1 recovering
>     pools:   4 pools, 6401 pgs
>     objects: 284.69M objects, 623 TiB
>     usage:   889 TiB used, 3.1 PiB / 3.9 PiB avail
>     pgs:     6186 active+clean
>              156  active+clean+scrubbing
>              59   active+clean+scrubbing+deep
>
> [root@ceph01-b ~]# ceph health detail
> HEALTH_WARN 4 failed cephadm daemon(s); 1 filesystem is degraded; insufficient standby MDS daemons available
> [WRN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in unknown state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in
>     error state
> [WRN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more
> [root@ceph01-b ~]#
> [root@ceph01-b ~]# ceph fs status
> cephfs - 40 clients
> ======
> RANK      STATE                 MDS              ACTIVITY   DNS    INOS   DIRS   CAPS
>  0        resolve        default.cephmon-02.nyfook          12.3k  11.8k  3228      0
>  1        replay(laggy)  default.cephmon-02.duujba              0      0     0      0
>  2        resolve        default.cephmon-01.pvnqad          15.8k   3541  1409      0
>          POOL            TYPE     USED  AVAIL
> ssd-rep-metadata-pool  metadata   295G  63.5T
>   sdd-rep-data-pool      data    10.2T  84.6T
>   hdd-ec-data-pool       data     808T  1929T
> MDS version: ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>
> The end of the log file of the replay(laggy) default.cephmon-02.duujba shows:
>
> [...]
>    -11> 2024-06-19T07:12:38.980+0000 7f90fd117700  1 mds.1.journaler.pq(ro) _finish_probe_end write_pos = 8673820672 (header had 8623488918). recovered.
>    -10> 2024-06-19T07:12:38.980+0000 7f90fd117700  4 mds.1.purge_queue operator(): open complete
>     -9> 2024-06-19T07:12:38.980+0000 7f90fd117700  4 mds.1.purge_queue operator(): recovering write_pos
>     -8> 2024-06-19T07:12:39.015+0000 7f9104926700 10 monclient: get_auth_request con 0x55a93ef42c00 auth_method 0
>     -7> 2024-06-19T07:12:39.025+0000 7f9105928700 10 monclient: get_auth_request con 0x55a93ef43400 auth_method 0
>     -6> 2024-06-19T07:12:39.038+0000 7f90fd117700  4 mds.1.purge_queue _recover: write_pos recovered
>     -5> 2024-06-19T07:12:39.038+0000 7f90fd117700  1 mds.1.journaler.pq(ro) set_writeable
>     -4> 2024-06-19T07:12:39.044+0000 7f9105127700 10 monclient: get_auth_request con 0x55a93ef43c00 auth_method 0
>     -3> 2024-06-19T07:12:39.113+0000 7f9104926700 10 monclient: get_auth_request con 0x55a93ed97000 auth_method 0
>     -2> 2024-06-19T07:12:39.123+0000 7f9105928700 10 monclient: get_auth_request con 0x55a93e903c00 auth_method 0
>     -1> 2024-06-19T07:12:39.236+0000 7f90fa912700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/include/interval_set.h:
> In function 'void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]' thread 7f90fa912700 time 2024-06-19T07:12:39.235633+0000
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/include/interval_set.h:
> 568: FAILED ceph_assert(p->first <= start)
>
>  ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f910c722e15]
>  2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f910c722fdb]
>  3: (interval_set<inodeno_t, std::map>::erase(inodeno_t, inodeno_t, std::function<bool (inodeno_t, inodeno_t)>)+0x2e5) [0x55a93c0de9a5]
>  4: (EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4207) [0x55a93c3e76e7]
>  5: (EUpdate::replay(MDSRank*)+0x61) [0x55a93c3e9f81]
>  6: (MDLog::_replay_thread()+0x6c9) [0x55a93c3701d9]
>  7: (MDLog::ReplayThread::entry()+0x11) [0x55a93c01e2d1]
>  8: /lib64/libpthread.so.0(+0x81ca) [0x7f910b4c81ca]
>  9: clone()

Suggest following the recommendations by Xiubo.

--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
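[Editor's note] The backtrace above shows EMetaBlob::replay aborting because interval_set::erase was asked to remove a range the set does not contain (the `ceph_assert(p->first <= start)` failure). The following is a minimal Python sketch of that precondition, not Ceph's actual C++ implementation; the class, names, and example values are all hypothetical and only illustrate why replaying a journal event that frees inodes the table no longer tracks kills the replaying MDS:

```python
# Simplified model of an interval set keyed by start offset.
# NOT Ceph's interval_set.h -- just the invariant its erase() asserts:
# a range may only be erased if it lies inside one existing interval.

class IntervalSet:
    def __init__(self):
        self.m = {}  # start -> length, non-overlapping intervals

    def insert(self, start, length):
        self.m[start] = length

    def erase(self, start, length):
        # Find the interval that should contain [start, start+length).
        s = max((k for k in self.m if k <= start), default=None)
        # Rough counterpart of "FAILED ceph_assert(p->first <= start)".
        if s is None or start + length > s + self.m[s]:
            raise AssertionError("erase range not contained in set")
        l = self.m.pop(s)
        if s < start:                        # keep the left remainder
            self.m[s] = start - s
        if s + l > start + length:           # keep the right remainder
            self.m[start + length] = s + l - (start + length)

inos = IntervalSet()
inos.insert(0x1000, 0x100)
inos.erase(0x1010, 0x10)        # fine: range is inside an interval
print(sorted(inos.m.items()))   # -> [(4096, 16), (4128, 224)]

# Erasing a range no interval contains trips the assertion -- the
# analogue of what aborts rank 1 during 'replay' above.
try:
    inos.erase(0x9000, 0x10)
except AssertionError as e:
    print("replay would abort:", e)
```

In the real daemon the assertion raises an abort rather than a catchable exception, which is why the rank cycles through replay(laggy) instead of becoming active.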