At this stage we are not so worried about recovery, since we have already moved to our new Pacific cluster; the problem arose during one of the nightly syncs from the old cluster to the new one. However, we are quite keen to use this as a learning opportunity and see what we can do to bring this filesystem back to life.

On Wed, 2022-06-01 at 20:11 -0400, Ramana Venkatesh Raja wrote:
> Can you temporarily turn up the MDS debug log level (debug_mds) to
> check what's happening to this MDS during replay?
>
> ceph config set mds debug_mds 10

2022-06-02 09:32:36.814 7faca6d16700  5 mds.beacon.store06 Sending beacon up:replay seq 195662
2022-06-02 09:32:36.814 7faca6d16700  1 -- [v2:192.168.34.113:6800/3361270776,v1:192.168.34.113:6801/3361270776] --> [v2:192.168.34.179:3300/0,v1:192.168.34.179:6789/0] -- mdsbeacon(196066899/store06 up:replay seq 195662 v200622) v7 -- 0x5603d846d200 con 0x560185920c00
2022-06-02 09:32:36.814 7facab51f700  1 -- [v2:192.168.34.113:6800/3361270776,v1:192.168.34.113:6801/3361270776] <== mon.0 v2:192.168.34.179:3300/0 230794 ==== mdsbeacon(196066899/store06 up:replay seq 195662 v200622) v7 ==== 132+0+0 (crc 0 0 0) 0x5603d846d200 con 0x560185920c00
2022-06-02 09:32:36.814 7facab51f700  5 mds.beacon.store06 received beacon reply up:replay seq 195662 rtt 0
2022-06-02 09:32:37.090 7faca4d12700  2 mds.0.cache Memory usage:  total 22446592, rss 18448072, heap 332040, baseline 307464, 0 / 6982189 inodes have caps, 0 caps, 0 caps per inode
2022-06-02 09:32:37.090 7faca4d12700 10 mds.0.cache cache not ready for trimming
2022-06-02 09:32:38.091 7faca4d12700  2 mds.0.cache Memory usage:  total 22446592, rss 18448072, heap 332040, baseline 307464, 0 / 6982189 inodes have caps, 0 caps, 0 caps per inode
2022-06-02 09:32:38.091 7faca4d12700 10 mds.0.cache cache not ready for trimming
2022-06-02 09:32:38.320 7faca6515700  1 -- [v2:192.168.34.113:6800/3361270776,v1:192.168.34.113:6801/3361270776] --> [v2:192.168.34.124:6805/1445500,v1:192.168.34.124:6807/1445500] -- mgrreport(unknown.store06 +0-0 packed 1414) v8 -- 0x56018651ae00 con 0x5601869cb400
2022-06-02 09:32:39.092 7faca4d12700  2 mds.0.cache Memory usage:  total 22446592, rss 18448072, heap 332040, baseline 307464, 0 / 6982189 inodes have caps, 0 caps, 0 caps per inode
2022-06-02 09:32:39.092 7faca4d12700 10 mds.0.cache cache not ready for trimming
2022-06-02 09:32:40.094 7faca4d12700  2 mds.0.cache Memory usage:  total 22446592, rss 18448072, heap 332040, baseline 307464, 0 / 6982189 inodes have caps, 0 caps, 0 caps per inode
2022-06-02 09:32:40.094 7faca4d12700 10 mds.0.cache cache not ready for trimming
2022-06-02 09:32:40.813 7faca6d16700  5 mds.beacon.store06 Sending beacon up:replay seq 195663
2022-06-02 09:32:40.813 7faca6d16700  1 -- [v2:192.168.34.113:6800/3361270776,v1:192.168.34.113:6801/3361270776] --> [v2:192.168.34.179:3300/0,v1:192.168.34.179:6789/0] -- mdsbeacon(196066899/store06 up:replay seq 195663 v200622) v7 -- 0x5603d846d500 con 0x560185920c00
2022-06-02 09:32:40.813 7facab51f700  1 -- [v2:192.168.34.113:6800/3361270776,v1:192.168.34.113:6801/3361270776] <== mon.0 v2:192.168.34.179:3300/0 230795 ==== mdsbeacon(196066899/store06 up:replay seq 195663 v200622) v7 ==== 132+0+0 (crc 0 0 0) 0x5603d846d500 con 0x560185920c00
2022-06-02 09:32:40.813 7facab51f700  5 mds.beacon.store06 received beacon reply up:replay seq 195663 rtt 0
2022-06-02 09:32:41.095 7faca4d12700  2 mds.0.cache Memory usage:  total 22446592, rss 18448072, heap 332040, baseline 307464, 0 / 6982189 inodes have caps, 0 caps, 0 caps per inode

> Is the health of the MDS host okay? Is it low on memory?

Plenty:

[root@store06 ~]# free
              total        used        free      shared  buff/cache   available
Mem:      131939604    75007512     2646656        3380    54285436    52944852
Swap:      32930300        1800    32928500

> > The cluster is healthy.
>
> Can you share the output of the `ceph status`, `ceph fs status` and
> `ceph --version`?
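As a small operational aside: since debug_mds 10 is quite verbose, a common pattern is to raise the level only long enough to capture the replay behaviour and then remove the override again. This is a sketch of that workflow; the log path and daemon name (store06) are assumptions based on the hostnames in this thread:

```shell
# Raise MDS debug verbosity cluster-wide (as suggested above), then
# watch the local MDS log while the rank is stuck in up:replay.
ceph config set mds debug_mds 10
tail -f /var/log/ceph/ceph-mds.store06.log

# Once enough log has been captured, remove the override so the
# daemon falls back to its default level and log volume stays manageable.
ceph config rm mds debug_mds
```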
[root@store06 ~]# ceph status
  cluster:
    id:     ebaa4a8f-5f17-4d57-b83b-a10f0226efaa
    health: HEALTH_WARN
            1 filesystem is degraded

  services:
    mon: 3 daemons, quorum store09,store08,store07 (age 10d)
    mgr: store08(active, since 15h), standbys: store09, store07
    mds: one:2/2 {0=store06=up:replay,1=store05=up:resolve} 3 up:standby
    osd: 116 osds: 116 up (since 10d), 116 in (since 4M)

  data:
    pools:   3 pools, 5121 pgs
    objects: 275.90M objects, 202 TiB
    usage:   625 TiB used, 182 TiB / 807 TiB avail
    pgs:     5115 active+clean
             6    active+clean+scrubbing+deep

[root@store06 ~]# ceph fs status
one - 741 clients
===
+------+---------+---------+----------+-------+-------+
| Rank |  State  |   MDS   | Activity |  dns  |  inos |
+------+---------+---------+----------+-------+-------+
|  0   |  replay | store06 |          | 7012k | 6982k |
|  1   | resolve | store05 |          | 82.9k | 78.4k |
+------+---------+---------+----------+-------+-------+
+------------------+----------+-------+-------+
|       Pool       |   type   |  used | avail |
+------------------+----------+-------+-------+
| weddell_metadata | metadata |  111G | 1963G |
|   weddell_data   |   data   |  622T | 44.0T |
+------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|   store09   |
|   store08   |
|   store07   |
+-------------+
MDS version: ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)

[root@store06 ~]# ceph --version
ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
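Given rank 0 is sitting in up:replay, two read-only checks can help show what it is (or is not) making progress on. This is only a sketch of the usual diagnostics, not a recovery procedure; the daemon name (store06) and filesystem/rank (one:0) are taken from the output above:

```shell
# On the MDS host itself: ask the daemon for its current state,
# including its journal replay position, via the admin socket.
ceph daemon mds.store06 status

# From any node with access to the metadata pool: inspect the
# rank-0 journal for damage. "journal inspect" is read-only and
# does not modify the journal.
cephfs-journal-tool --rank=one:0 journal inspect
```

If `journal inspect` reports corrupt regions, the usual advice is to take a journal backup (`cephfs-journal-tool --rank=one:0 journal export <file>`) before attempting any of the destructive disaster-recovery steps.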