Hi Dietmar,

have you already blocked all cephfs clients? (See the command sketch after the quoted message below.)

Joachim

*Joachim Kraftmayer*
CEO | p: +49 89 2152527-21 | e: joachim.kraftmayer@xxxxxxxxx
a: Loristr. 8 | 80335 Munich | Germany | w: https://clyso.com
Utting a. A. | HR: Augsburg | HRB 25866 | USt. ID: DE275430677

On Wed, Jun 19, 2024 at 09:44, Dietmar Rieder <dietmar.rieder@xxxxxxxxxxx> wrote:

> Hello cephers,
>
> we have a degraded filesystem on our ceph 18.2.2 cluster and I need to
> get it up again.
>
> We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby).
>
> It started last night; I got the first HEALTH_WARN emails saying:
>
> HEALTH_WARN
>
> --- New ---
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> === Full health status ===
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> Then it went on with:
>
> HEALTH_WARN
>
> --- New ---
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
>
> --- Cleared ---
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> === Full health status ===
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
>
> Then, one after another, the MDS daemons went into error state:
>
> HEALTH_WARN
>
> --- Updated ---
> [WARN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in error state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
>
> === Full health status ===
> [WARN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in error state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [WARN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more
>
> In the morning I tried to restart the MDS daemons that were in error state,
> but they kept failing.
> I then reduced the number of active MDS to 1:
>
>   ceph fs set cephfs max_mds 1
>
> and set the filesystem down:
>
>   ceph fs set cephfs down true
>
> I tried to restart the MDS daemons again, but now I'm stuck at the following status:
>
> [root@ceph01-b ~]# ceph -s
>   cluster:
>     id:     aae23c5c-a98b-11ee-b44d-00620b05cac4
>     health: HEALTH_WARN
>             4 failed cephadm daemon(s)
>             1 filesystem is degraded
>             insufficient standby MDS daemons available
>
>   services:
>     mon: 3 daemons, quorum cephmon-01,cephmon-03,cephmon-02 (age 2w)
>     mgr: cephmon-01.dsxcho(active, since 11w), standbys: cephmon-02.nssigg, cephmon-03.rgefle
>     mds: 3/3 daemons up
>     osd: 336 osds: 336 up (since 11w), 336 in (since 3M)
>
>   data:
>     volumes: 0/1 healthy, 1 recovering
>     pools:   4 pools, 6401 pgs
>     objects: 284.69M objects, 623 TiB
>     usage:   889 TiB used, 3.1 PiB / 3.9 PiB avail
>     pgs:     6186 active+clean
>              156  active+clean+scrubbing
>              59   active+clean+scrubbing+deep
>
> [root@ceph01-b ~]# ceph health detail
> HEALTH_WARN 4 failed cephadm daemon(s); 1 filesystem is degraded; insufficient standby MDS daemons available
> [WRN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in unknown state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
> [WRN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more
> [root@ceph01-b ~]#
> [root@ceph01-b ~]# ceph fs status
> cephfs - 40 clients
> ======
> RANK  STATE          MDS                        ACTIVITY  DNS    INOS   DIRS  CAPS
>  0    resolve        default.cephmon-02.nyfook            12.3k  11.8k  3228     0
>  1    replay(laggy)  default.cephmon-02.duujba                0      0     0     0
>  2    resolve        default.cephmon-01.pvnqad            15.8k   3541  1409     0
>          POOL             TYPE      USED   AVAIL
> ssd-rep-metadata-pool    metadata    295G   63.5T
> sdd-rep-data-pool        data       10.2T   84.6T
> hdd-ec-data-pool         data        808T   1929T
> MDS version: ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>
> The end of the log file of the replay(laggy) default.cephmon-02.duujba shows:
>
> [...]
>    -11> 2024-06-19T07:12:38.980+0000 7f90fd117700  1 mds.1.journaler.pq(ro) _finish_probe_end write_pos = 8673820672 (header had 8623488918). recovered.
>    -10> 2024-06-19T07:12:38.980+0000 7f90fd117700  4 mds.1.purge_queue operator(): open complete
>     -9> 2024-06-19T07:12:38.980+0000 7f90fd117700  4 mds.1.purge_queue operator(): recovering write_pos
>     -8> 2024-06-19T07:12:39.015+0000 7f9104926700 10 monclient: get_auth_request con 0x55a93ef42c00 auth_method 0
>     -7> 2024-06-19T07:12:39.025+0000 7f9105928700 10 monclient: get_auth_request con 0x55a93ef43400 auth_method 0
>     -6> 2024-06-19T07:12:39.038+0000 7f90fd117700  4 mds.1.purge_queue _recover: write_pos recovered
>     -5> 2024-06-19T07:12:39.038+0000 7f90fd117700  1 mds.1.journaler.pq(ro) set_writeable
>     -4> 2024-06-19T07:12:39.044+0000 7f9105127700 10 monclient: get_auth_request con 0x55a93ef43c00 auth_method 0
>     -3> 2024-06-19T07:12:39.113+0000 7f9104926700 10 monclient: get_auth_request con 0x55a93ed97000 auth_method 0
>     -2> 2024-06-19T07:12:39.123+0000 7f9105928700 10 monclient: get_auth_request con 0x55a93e903c00 auth_method 0
>     -1> 2024-06-19T07:12:39.236+0000 7f90fa912700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/include/interval_set.h:
> In function 'void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]' thread 7f90fa912700 time 2024-06-19T07:12:39.235633+0000
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/include/interval_set.h: 568: FAILED ceph_assert(p->first <= start)
>
>  ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f910c722e15]
>  2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f910c722fdb]
>  3: (interval_set<inodeno_t, std::map>::erase(inodeno_t, inodeno_t, std::function<bool (inodeno_t, inodeno_t)>)+0x2e5) [0x55a93c0de9a5]
>  4: (EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4207) [0x55a93c3e76e7]
>  5: (EUpdate::replay(MDSRank*)+0x61) [0x55a93c3e9f81]
>  6: (MDLog::_replay_thread()+0x6c9) [0x55a93c3701d9]
>  7: (MDLog::ReplayThread::entry()+0x11) [0x55a93c01e2d1]
>  8: /lib64/libpthread.so.0(+0x81ca) [0x7f910b4c81ca]
>  9: clone()
>
>      0> 2024-06-19T07:12:39.236+0000 7f90fa912700 -1 *** Caught signal (Aborted) **
>  in thread 7f90fa912700 thread_name:md_log_replay
>
>  ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>  1: /lib64/libpthread.so.0(+0x12d20) [0x7f910b4d2d20]
>  2: gsignal()
>  3: abort()
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f910c722e6f]
>  5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f910c722fdb]
>  6: (interval_set<inodeno_t, std::map>::erase(inodeno_t, inodeno_t, std::function<bool (inodeno_t, inodeno_t)>)+0x2e5) [0x55a93c0de9a5]
>  7: (EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4207) [0x55a93c3e76e7]
>  8: (EUpdate::replay(MDSRank*)+0x61) [0x55a93c3e9f81]
>  9: (MDLog::_replay_thread()+0x6c9) [0x55a93c3701d9]
>  10: (MDLog::ReplayThread::entry()+0x11) [0x55a93c01e2d1]
>  11: /lib64/libpthread.so.0(+0x81ca) [0x7f910b4c81ca]
>  12: clone()
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> --- logging levels ---
>    0/ 5 none
>    0/ 1 lockdep
>    0/ 1 context
>    1/ 1 crush
>    1/ 5 mds
>    1/ 5 mds_balancer
>    1/ 5 mds_locker
>    1/ 5 mds_log
>    1/ 5 mds_log_expire
>    1/ 5 mds_migrator
>    0/ 1 buffer
>    0/ 1 timer
>    0/ 1 filer
>    0/ 1 striper
>    0/ 1 objecter
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 5 rbd_mirror
>    0/ 5 rbd_replay
>    0/ 5 rbd_pwl
>    0/ 5 journaler
>    0/ 5 objectcacher
>    0/ 5 immutable_obj_cache
>    0/ 5 client
>    1/ 5 osd
>    0/ 5 optracker
>    0/ 5 objclass
>    1/ 3 filestore
>    1/ 3 journal
>    0/ 0 ms
>    1/ 5 mon
>    0/10 monc
>    1/ 5 paxos
>    0/ 5 tp
>    1/ 5 auth
>    1/ 5 crypto
>    1/ 1 finisher
>    1/ 1 reserver
>    1/ 5 heartbeatmap
>    1/ 5 perfcounter
>    1/ 5 rgw
>    1/ 5 rgw_sync
>    1/ 5 rgw_datacache
>    1/ 5 rgw_access
>    1/ 5 rgw_dbstore
>    1/ 5 rgw_flight
>    1/ 5 javaclient
>    1/ 5 asok
>    1/ 1 throttle
>    0/ 0 refs
>    1/ 5 compressor
>    1/ 5 bluestore
>    1/ 5 bluefs
>    1/ 3 bdev
>    1/ 5 kstore
>    4/ 5 rocksdb
>    4/ 5 leveldb
>    1/ 5 fuse
>    2/ 5 mgr
>    1/ 5 mgrc
>    1/ 5 dpdk
>    1/ 5 eventtrace
>    1/ 5 prioritycache
>    0/ 5 test
>    0/ 5 cephfs_mirror
>    0/ 5 cephsqlite
>    0/ 5 seastore
>    0/ 5 seastore_onode
>    0/ 5 seastore_odata
>    0/ 5 seastore_omap
>    0/ 5 seastore_tm
>    0/ 5 seastore_t
>    0/ 5 seastore_cleaner
>    0/ 5 seastore_epm
>    0/ 5 seastore_lba
>    0/ 5 seastore_fixedkv_tree
>    0/ 5 seastore_cache
>    0/ 5 seastore_journal
>    0/ 5 seastore_device
>    0/ 5 seastore_backref
>    0/ 5 alienstore
>    1/ 5 mclock
>    0/ 5 cyanstore
>    1/ 5 ceph_exporter
>    1/ 5 memstore
>   -2/-2 (syslog threshold)
>   -1/-1 (stderr threshold)
> --- pthread ID / name mapping for recent threads ---
>   7f90fa912700 / md_log_replay
>   7f90fb914700 /
>   7f90fc115700 / MR_Finisher
>   7f90fd117700 / PQ_Finisher
>   7f90fe119700 / ms_dispatch
>   7f910011d700 / ceph-mds
>   7f9102121700 / ms_dispatch
>   7f9103123700 / io_context_pool
>   7f9104125700 / admin_socket
>   7f9104926700 / msgr-worker-2
>   7f9105127700 / msgr-worker-1
>   7f9105928700 / msgr-worker-0
>   7f910d8eab00 / ceph-mds
>   max_recent     10000
>   max_new         1000
>   log_file /var/log/ceph/ceph-mds.default.cephmon-02.duujba.log
> --- end dump of recent events ---
>
> I have no idea how to resolve this and would be grateful for any help.
>
> Dietmar
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
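For reference, here is a minimal sketch of commands that could be used to block cephfs clients while the MDS ranks are recovered, along the lines of the question at the top of the thread. The filesystem name (cephfs), the MDS daemon name and the client id (1962074) are taken from the output quoted above; <client_addr> is a placeholder, and the refuse_client_session flag is only available on reasonably recent releases. Whether any of these steps is appropriate depends on the actual cluster state, so treat this as an illustration rather than a recovery procedure:

  # Stop accepting new CephFS client sessions while recovering
  # (supported on recent releases such as Reef).
  ceph fs set cephfs refuse_client_session true

  # Evict an individual client session by id; this only works
  # while the addressed MDS daemon is up and responsive.
  ceph tell mds.default.cephmon-02.duujba client evict id=1962074

  # Alternatively, blocklist a client by address at the RADOS level.
  ceph osd blocklist add <client_addr>

  # Once the ranks are active and healthy again, allow sessions back in.
  ceph fs set cephfs refuse_client_session false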