Re: MDS in ReadOnly and 2 MDS behind on trimming

Hi Eugen,

Thanks for the reply, I really appreciate it.

The first command just hangs with no output:
# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog journal inspect

The second command works fine:

# cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal inspect
Overall journal integrity: OK
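
Before touching anything else, I am also planning to dump the journal header and export the mdlog journal as a backup, roughly like this (the output path is just an example; I have not run these yet, so I don't know whether the export hangs like the inspect does):

# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog header get
# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog journal export /root/cephfs-mdlog-backup.bin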

root@rke-sh1-2:~# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-2.isqjza
-- Logs begin at Fri 2024-02-23 04:49:32 UTC, end at Fri 2024-02-23 13:08:22 UTC. --
Feb 23 07:46:46 rke-sh1-2 bash[1058012]: ignoring --setuser ceph since I am not root
Feb 23 07:46:46 rke-sh1-2 bash[1058012]: ignoring --setgroup ceph since I am not root
Feb 23 07:46:46 rke-sh1-2 bash[1058012]: starting mds.cephfs.rke-sh1-2.isqjza at
Feb 23 08:15:06 rke-sh1-2 bash[1058012]: debug 2024-02-23T08:15:06.371+0000 7fbc17dd9700 -1 mds.pinger is_rank_lagging: rank=0 was never sent ping request.
Feb 23 08:15:13 rke-sh1-2 bash[1058012]: debug 2024-02-23T08:15:13.155+0000 7fbc145d2700 -1 log_channel(cluster) log [ERR] : failed to commit dir 0x1 object, errno -22
Feb 23 08:15:13 rke-sh1-2 bash[1058012]: debug 2024-02-23T08:15:13.155+0000 7fbc145d2700 -1 mds.0.12487 unhandled write error (22) Invalid argument, force readonly...
Feb 23 10:20:36 rke-sh1-2 bash[1058012]: debug 2024-02-23T10:20:36.309+0000 7fbc17dd9700 -1 mds.pinger is_rank_lagging: rank=1 was never sent ping request.
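
Since the failed commit above is on the dir 0x1 object, I can also try to check that object directly in the metadata pool, something like this (assuming the root dirfrag object is named 1.00000000 and that our metadata pool is cephfs_metadata):

# rados -p cephfs_metadata stat 1.00000000
# rados -p cephfs_metadata listomapkeys 1.00000000 | wc -l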

root@rke-sh1-3:~# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-3.vdicdn
-- Logs begin at Fri 2024-02-23 06:59:48 UTC, end at Fri 2024-02-23 13:09:18 UTC. --
Feb 23 07:46:46 rke-sh1-3 bash[2901]: ignoring --setuser ceph since I am not root
Feb 23 07:46:46 rke-sh1-3 bash[2901]: ignoring --setgroup ceph since I am not root
Feb 23 07:46:46 rke-sh1-3 bash[2901]: starting mds.cephfs.rke-sh1-3.vdicdn at
Feb 23 10:25:51 rke-sh1-3 bash[2901]: ignoring --setuser ceph since I am not root
Feb 23 10:25:51 rke-sh1-3 bash[2901]: ignoring --setgroup ceph since I am not root
Feb 23 10:25:51 rke-sh1-3 bash[2901]: starting mds.cephfs.rke-sh1-3.vdicdn at

root@rke-sh1-1:~# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-1.ojmpnk
-- Logs begin at Fri 2024-02-23 00:24:42 UTC, end at Fri 2024-02-23 13:09:55 UTC. --
Feb 23 09:29:10 rke-sh1-1 bash[786820]: tcmalloc: large alloc 1073750016 bytes == 0x5598512de000 @  0x7fb426636760 0x7fb426657c64 0x5597c1ccaaba 0x7fb41bc04218 0x7fb41bc0ed5b 0x7fb41bbfeda4 0x7fb41da6>
Feb 23 09:29:19 rke-sh1-1 bash[786820]: tcmalloc: large alloc 2147491840 bytes == 0x559891ae0000 @  0x7fb426636760 0x7fb426657c64 0x5597c1ccaaba 0x7fb41bc04218 0x7fb41bc0ed5b 0x7fb41bbfeda4 0x7fb41db3>
Feb 23 09:29:26 rke-sh1-1 bash[786820]: tcmalloc: large alloc 2147491840 bytes == 0x559951ae4000 @  0x7fb426636760 0x7fb426657c64 0x5597c1ccaaba 0x7fb41bc04218 0x7fb41bc0ed5b 0x7fb41bbfeda4 0x7fb41da6>
Feb 23 09:29:27 rke-sh1-1 bash[786820]: debug 2024-02-23T09:29:27.928+0000 7fb416d63700 -1 asok(0x5597c3904000) AdminSocket: error writing response length (32) Broken pipe
Feb 23 12:35:53 rke-sh1-1 bash[786820]: ignoring --setuser ceph since I am not root
Feb 23 12:35:53 rke-sh1-1 bash[786820]: ignoring --setgroup ceph since I am not root
Feb 23 12:35:53 rke-sh1-1 bash[786820]: starting mds.cephfs.rke-sh1-1.ojmpnk at


The MDS logs are at verbosity 20; do you want me to provide them as an archive?

Is there a way to compact all the logs?
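
If that is easier, I could dump each MDS log with cephadm and compress it, for example (file names are just examples, and I am assuming the usual cephadm unit naming ceph-<fsid>@<daemon>.service for the journalctl variant):

# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-2.isqjza > mds.cephfs.rke-sh1-2.isqjza.log
# gzip mds.cephfs.rke-sh1-2.isqjza.log

or directly from journald:

# journalctl -u ceph-fcb373ce-7aaa-11eb-984f-e7c6e0038e87@mds.cephfs.rke-sh1-2.isqjza.service --since "2024-02-23" | gzip > mds.cephfs.rke-sh1-2.isqjza.log.gz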

Best Regards, 

Edouard FAZENDA
Technical Support

Chemin du Curé-Desclouds 2, CH-1226 THONEX  +41 (0)22 869 04 40
www.csti.ch

-----Original Message-----
From: Eugen Block <eblock@xxxxxx> 
Sent: Friday, 23 February 2024 12:50
To: ceph-users@xxxxxxx
Subject:  Re: MDS in ReadOnly and 2 MDS behind on trimming

Hi,

the MDS log should contain information about why it goes into read-only mode. Just a few weeks ago I helped a user with a broken CephFS (the MDS went into read-only mode because of missing objects in the journal).
Can you check the journal status:

# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog journal inspect

# cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal inspect

and also share the logs.
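
If you just want to spot the reason quickly, grepping the MDS journal for the read-only transition is usually enough, for example (assuming the usual cephadm unit naming, ceph-<fsid>@mds.<name>.service):

# journalctl -u ceph-<fsid>@mds.cephfs.rke-sh1-2.isqjza.service | grep -iE 'readonly|write error'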

Thanks,
Eugen

Zitat von Edouard FAZENDA <e.fazenda@xxxxxxx>:

> Dear Ceph Community,
>
>
>
> I am having an issue with my Ceph cluster: several OSDs crashed, but
> they are now active again and recovery has finished. However, the
> CephFS filesystem can no longer be accessed in RW by clients (K8s
> workloads), as one MDS is in read-only mode and two are behind on
> trimming.
>
>
>
> The CephFS volume itself seems to be OK.
>
>
>
> The trimming process does not seem to make any progress; it may be stuck.
>
>
>
> We are running 3 hosts on Ceph Pacific, version 16.2.1.
>
>
>
> Here is some output describing the situation:
>
>
>
> ceph versions
>
> {
>     "mon": {
>         "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 3
>     },
>     "mgr": {
>         "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 3
>     },
>     "osd": {
>         "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 18
>     },
>     "mds": {
>         "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 3
>     },
>     "rgw": {
>         "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 6
>     },
>     "overall": {
>         "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 33
>     }
> }
>
>
>
> ceph orch ps
>
> NAME                          HOST       STATUS         REFRESHED  AGE  PORTS          VERSION  IMAGE ID      CONTAINER ID
> crash.rke-sh1-1               rke-sh1-1  running (21h)  36s ago    21h  -              16.2.1   c757e4a3636b  e8652edb2b49
> crash.rke-sh1-2               rke-sh1-2  running (21h)  3m ago     20M  -              16.2.1   c757e4a3636b  a1249a605ee0
> crash.rke-sh1-3               rke-sh1-3  running (17h)  36s ago    17h  -              16.2.1   c757e4a3636b  026667bc1776
> mds.cephfs.rke-sh1-1.ojmpnk   rke-sh1-1  running (18h)  36s ago    4M   -              16.2.1   c757e4a3636b  9b4c2b08b759
> mds.cephfs.rke-sh1-2.isqjza   rke-sh1-2  running (18h)  3m ago     23M  -              16.2.1   c757e4a3636b  71681a5f34d3
> mds.cephfs.rke-sh1-3.vdicdn   rke-sh1-3  running (17h)  36s ago    3M   -              16.2.1   c757e4a3636b  e89946ad6b7e
> mgr.rke-sh1-1.qskoyj          rke-sh1-1  running (21h)  36s ago    2y   *:8082 *:9283  16.2.1   c757e4a3636b  7ce7cfbb3e55
> mgr.rke-sh1-2.lxmguj          rke-sh1-2  running (21h)  3m ago     22M  *:8082 *:9283  16.2.1   c757e4a3636b  5a0025adfd46
> mgr.rke-sh1-3.ckunvo          rke-sh1-3  running (17h)  36s ago    6M   *:8082 *:9283  16.2.1   c757e4a3636b  2fcaf18f3218
> mon.rke-sh1-1                 rke-sh1-1  running (20h)  36s ago    20h  -              16.2.1   c757e4a3636b  c0a90103cabc
> mon.rke-sh1-2                 rke-sh1-2  running (21h)  3m ago     3M   -              16.2.1   c757e4a3636b  f4b32ba4466b
> mon.rke-sh1-3                 rke-sh1-3  running (17h)  36s ago    17h  -              16.2.1   c757e4a3636b  d5e44c245998
> osd.0                         rke-sh1-2  running (20h)  3m ago     2y   -              16.2.1   c757e4a3636b  7b0e69942c15
> osd.1                         rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  4451654d9a2d
> osd.10                        rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  3f9d5f95e284
> osd.11                        rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  db1cc6d2e37f
> osd.12                        rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  de416c1ef766
> osd.13                        rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  25a281cc5a9b
> osd.14                        rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  62f25ba61667
> osd.15                        rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  d3514d823c45
> osd.16                        rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  bba857759bfe
> osd.17                        rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  59281d4bb3d0
> osd.2                         rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  418041b5e60d
> osd.3                         rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  04a0e29d5623
> osd.4                         rke-sh1-1  running (20h)  36s ago    2y   -              16.2.1   c757e4a3636b  1cc78a5153d3
> osd.5                         rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  39a4b11e31fb
> osd.6                         rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  2f218ffb566e
> osd.7                         rke-sh1-1  running (20h)  36s ago    2y   -              16.2.1   c757e4a3636b  cf761fbe4d5f
> osd.8                         rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  f9f85480e800
> osd.9                         rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  664c54ff46d2
> rgw.default.rke-sh1-1.dgucwl  rke-sh1-1  running (21h)  36s ago    22M  *:8000         16.2.1   c757e4a3636b  f03212b955a7
> rgw.default.rke-sh1-1.vylchc  rke-sh1-1  running (21h)  36s ago    22M  *:8001         16.2.1   c757e4a3636b  da486ce43fe5
> rgw.default.rke-sh1-2.dfhhfw  rke-sh1-2  running (21h)  3m ago     2y   *:8000         16.2.1   c757e4a3636b  ef4089d0aef2
> rgw.default.rke-sh1-2.efkbum  rke-sh1-2  running (21h)  3m ago     2y   *:8001         16.2.1   c757e4a3636b  9e053d5a2f7b
> rgw.default.rke-sh1-3.krfgey  rke-sh1-3  running (17h)  36s ago    9M   *:8001         16.2.1   c757e4a3636b  45cd3d75edd3
> rgw.default.rke-sh1-3.pwdbmp  rke-sh1-3  running (17h)  36s ago    9M   *:8000         16.2.1   c757e4a3636b  e2710265a7f4
>
>
>
> ceph health detail
>
> HEALTH_WARN 1 MDSs are read only; 2 MDSs behind on trimming
> [WRN] MDS_READ_ONLY: 1 MDSs are read only
>     mds.cephfs.rke-sh1-2.isqjza(mds.0): MDS in read-only mode
> [WRN] MDS_TRIM: 2 MDSs behind on trimming
>     mds.cephfs.rke-sh1-2.isqjza(mds.0): Behind on trimming (2149/128) max_segments: 128, num_segments: 2149
>     mds.cephfs.rke-sh1-1.ojmpnk(mds.0): Behind on trimming (2149/128) max_segments: 128, num_segments: 2149
>
>
>
> root@rke-sh1-1:~# ceph fs status
>
> cephfs - 27 clients
>
> ======
>
> RANK      STATE                 MDS               ACTIVITY     DNS    INOS   DIRS   CAPS
>  0        active      cephfs.rke-sh1-2.isqjza  Reqs:    8 /s  85.2k  53.2k  1742    101
> 0-s   standby-replay  cephfs.rke-sh1-1.ojmpnk  Evts:    0 /s  52.2k  20.2k  1737      0
>       POOL         TYPE     USED  AVAIL
> cephfs_metadata  metadata  1109G  6082G
>   cephfs_data      data    8419G  6082G
>       STANDBY MDS
> cephfs.rke-sh1-3.vdicdn
> MDS version: ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)
>
>
>
> ceph status
>
>   cluster:
>
>     id:     fcb373ce-7aaa-11eb-984f-e7c6e0038e87
>
>     health: HEALTH_WARN
>
>             1 MDSs are read only
>
>             2 MDSs behind on trimming
>
>
>
>   services:
>
>     mon: 3 daemons, quorum rke-sh1-2,rke-sh1-1,rke-sh1-3 (age 17h)
>
>     mgr: rke-sh1-1.qskoyj(active, since 17h), standbys: rke-sh1-2.lxmguj, rke-sh1-3.ckunvo
>
>     mds: 1/1 daemons up, 1 standby, 1 hot standby
>
>     osd: 18 osds: 18 up (since 17h), 18 in (since 20h)
>
>     rgw: 6 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
>     volumes: 1/1 healthy
>
>     pools:   11 pools, 849 pgs
>
>     objects: 10.10M objects, 5.3 TiB
>
>     usage:   11 TiB used, 15 TiB / 26 TiB avail
>
>     pgs:     849 active+clean
>
>
>
>   io:
>
>     client:   35 KiB/s rd, 1.0 MiB/s wr, 302 op/s rd, 165 op/s wr
>
>
>
>
>
> # ceph mds stat
>
> cephfs:1 {0=cephfs.rke-sh1-2.isqjza=up:active} 1 up:standby-replay 1 up:standby
>
>
>
> Do you have an idea of what my next steps could be to bring the
> cluster back to a healthy state?
>
>
>
> Any help will be very much appreciated.
>
>
>
> Thanks a lot for your feedback.
>
>
>
> Best Regards,
>
>
>
> Edouard FAZENDA
>
> Technical Support
>
>
>
>
>
>
>
> Chemin du Curé-Desclouds 2, CH-1226 THONEX  +41 (0)22 869 04 40
>
>
>
>  <https://www.csti.ch/> www.csti.ch


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
