Hi Dietmar,

On Wed, Jun 19, 2024 at 3:44 AM Dietmar Rieder <dietmar.rieder@xxxxxxxxxxx> wrote:
>
> Hello cephers,
>
> we have a degraded filesystem on our ceph 18.2.2 cluster and I'd need to
> get it up again.
>
> We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby).
>
> It started this night; I got the first HEALTH_WARN emails saying:
>
> HEALTH_WARN
>
> --- New ---
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> === Full health status ===
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> then it went on with:
>
> HEALTH_WARN
>
> --- New ---
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
>
> --- Cleared ---
> [WARN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>     mds.default.cephmon-02.duujba(mds.1): Client apollo-10:cephfs_user
>     failing to respond to cache pressure client_id: 1962074
>
> === Full health status ===
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
>
> Then one after another MDS was going into error state:
>
> HEALTH_WARN
>
> --- Updated ---
> [WARN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in error state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
>
> === Full health status ===
> [WARN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in error state
>     daemon
>     mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in error state
> [WARN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [WARN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more
>
> In the morning I then tried to restart the MDS in error state, but they
> kept failing. I then reduced the number of active MDS to 1:
>
> ceph fs set cephfs max_mds 1

This will not have any positive effect.

> And set the filesystem down:
>
> ceph fs set cephfs down true
>
> I tried to restart the MDS again but now I'm stuck at the following status:

Setting the file system "down" won't do anything here either. What were
you trying to accomplish? Restarting the MDS may only add to your
problems.

> [root@ceph01-b ~]# ceph -s
>   cluster:
>     id:     aae23c5c-a98b-11ee-b44d-00620b05cac4
>     health: HEALTH_WARN
>             4 failed cephadm daemon(s)
>             1 filesystem is degraded
>             insufficient standby MDS daemons available
>
>   services:
>     mon: 3 daemons, quorum cephmon-01,cephmon-03,cephmon-02 (age 2w)
>     mgr: cephmon-01.dsxcho(active, since 11w), standbys: cephmon-02.nssigg, cephmon-03.rgefle
>     mds: 3/3 daemons up
>     osd: 336 osds: 336 up (since 11w), 336 in (since 3M)
>
>   data:
>     volumes: 0/1 healthy, 1 recovering
>     pools:   4 pools, 6401 pgs
>     objects: 284.69M objects, 623 TiB
>     usage:   889 TiB used, 3.1 PiB / 3.9 PiB avail
>     pgs:     6186 active+clean
>              156  active+clean+scrubbing
>              59   active+clean+scrubbing+deep
>
> [root@ceph01-b ~]# ceph health detail
> HEALTH_WARN 4 failed cephadm daemon(s); 1 filesystem is degraded; insufficient standby MDS daemons available
> [WRN] CEPHADM_FAILED_DAEMON: 4 failed cephadm daemon(s)
>     daemon mds.default.cephmon-01.cepqjp on cephmon-01 is in error state
>     daemon mds.default.cephmon-02.duujba on cephmon-02 is in unknown state
>     daemon mds.default.cephmon-03.chjusj on cephmon-03 is in error state
>     daemon mds.default.cephmon-03.xcujhz on cephmon-03 is in
>     error state
> [WRN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
>     have 0; want 1 more
> [root@ceph01-b ~]#
> [root@ceph01-b ~]# ceph fs status
> cephfs - 40 clients
> ======
> RANK      STATE                 MDS              ACTIVITY   DNS    INOS   DIRS   CAPS
>  0        resolve        default.cephmon-02.nyfook          12.3k  11.8k  3228      0
>  1        replay(laggy)  default.cephmon-02.duujba              0      0     0      0
>  2        resolve        default.cephmon-01.pvnqad          15.8k   3541  1409      0
>          POOL            TYPE     USED  AVAIL
> ssd-rep-metadata-pool  metadata   295G  63.5T
>   sdd-rep-data-pool      data    10.2T  84.6T
>   hdd-ec-data-pool       data     808T  1929T
> MDS version: ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>
> The end of the log file of the replay(laggy) default.cephmon-02.duujba shows:
>
> [...]
>    -11> 2024-06-19T07:12:38.980+0000 7f90fd117700  1 mds.1.journaler.pq(ro) _finish_probe_end write_pos = 8673820672 (header had 8623488918). recovered.
>    -10> 2024-06-19T07:12:38.980+0000 7f90fd117700  4 mds.1.purge_queue operator(): open complete
>     -9> 2024-06-19T07:12:38.980+0000 7f90fd117700  4 mds.1.purge_queue operator(): recovering write_pos
>     -8> 2024-06-19T07:12:39.015+0000 7f9104926700 10 monclient: get_auth_request con 0x55a93ef42c00 auth_method 0
>     -7> 2024-06-19T07:12:39.025+0000 7f9105928700 10 monclient: get_auth_request con 0x55a93ef43400 auth_method 0
>     -6> 2024-06-19T07:12:39.038+0000 7f90fd117700  4 mds.1.purge_queue _recover: write_pos recovered
>     -5> 2024-06-19T07:12:39.038+0000 7f90fd117700  1 mds.1.journaler.pq(ro) set_writeable
>     -4> 2024-06-19T07:12:39.044+0000 7f9105127700 10 monclient: get_auth_request con 0x55a93ef43c00 auth_method 0
>     -3> 2024-06-19T07:12:39.113+0000 7f9104926700 10 monclient: get_auth_request con 0x55a93ed97000 auth_method 0
>     -2> 2024-06-19T07:12:39.123+0000 7f9105928700 10 monclient: get_auth_request con 0x55a93e903c00 auth_method 0
>     -1> 2024-06-19T07:12:39.236+0000 7f90fa912700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/include/interval_set.h:
> In function 'void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]' thread 7f90fa912700 time 2024-06-19T07:12:39.235633+0000
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/include/interval_set.h:
> 568: FAILED ceph_assert(p->first <= start)
>
>  ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f910c722e15]
>  2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f910c722fdb]
>  3: (interval_set<inodeno_t, std::map>::erase(inodeno_t, inodeno_t, std::function<bool (inodeno_t, inodeno_t)>)+0x2e5) [0x55a93c0de9a5]
>  4: (EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4207) [0x55a93c3e76e7]
>  5: (EUpdate::replay(MDSRank*)+0x61) [0x55a93c3e9f81]
>  6: (MDLog::_replay_thread()+0x6c9) [0x55a93c3701d9]
>  7: (MDLog::ReplayThread::entry()+0x11) [0x55a93c01e2d1]
>  8: /lib64/libpthread.so.0(+0x81ca) [0x7f910b4c81ca]
>  9: clone()

Suggest following the recommendations by Xiubo.

--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
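[Editor's note] The backtrace above shows EMetaBlob::replay aborting because interval_set::erase was asked to remove a range the set does not contain (the `ceph_assert(p->first <= start)` failure). The following is a minimal Python sketch of that precondition, not Ceph's actual C++ implementation; the class, names, and example values are all hypothetical and only illustrate why replaying a journal event that frees inodes the table no longer tracks kills the replaying MDS:

```python
# Simplified model of an interval set keyed by start offset.
# NOT Ceph's interval_set.h -- just the invariant its erase() asserts:
# a range may only be erased if it lies inside one existing interval.

class IntervalSet:
    def __init__(self):
        self.m = {}  # start -> length, non-overlapping intervals

    def insert(self, start, length):
        self.m[start] = length

    def erase(self, start, length):
        # Find the interval that should contain [start, start+length).
        s = max((k for k in self.m if k <= start), default=None)
        # Rough counterpart of "FAILED ceph_assert(p->first <= start)".
        if s is None or start + length > s + self.m[s]:
            raise AssertionError("erase range not contained in set")
        l = self.m.pop(s)
        if s < start:                        # keep the left remainder
            self.m[s] = start - s
        if s + l > start + length:           # keep the right remainder
            self.m[start + length] = s + l - (start + length)

inos = IntervalSet()
inos.insert(0x1000, 0x100)
inos.erase(0x1010, 0x10)        # fine: range is inside an interval
print(sorted(inos.m.items()))   # -> [(4096, 16), (4128, 224)]

# Erasing a range no interval contains trips the assertion -- the
# analogue of what aborts rank 1 during 'replay' above.
try:
    inos.erase(0x9000, 0x10)
except AssertionError as e:
    print("replay would abort:", e)
```

In the real daemon the assertion raises an abort rather than a catchable exception, which is why the rank cycles through replay(laggy) instead of becoming active.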