Re: MDS in ReadOnly and 2 MDS behind on trimming


 



2024-02-23T08:15:13.155+0000 7fbc145d2700 -1 log_channel(cluster) log [ERR] : failed to commit dir 0x1 object, errno -22
2024-02-23T08:15:13.155+0000 7fbc145d2700 -1 mds.0.12487 unhandled write error (22) Invalid argument, force readonly...

Was your cephfs metadata pool full? This tracker (https://tracker.ceph.com/issues/52260) sounds very similar but I don't see a solution for it.
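
For reference, pool and OSD utilization (and the configured full ratios) can be checked with something like:

# ceph df detail
# ceph osd df tree
# ceph osd dump | grep ratio

If one of the OSDs backing the metadata pool hit its full ratio during the recovery, that would be worth ruling out.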


Quoting Edouard FAZENDA <e.fazenda@xxxxxxx>:

Hi Eugen,

Thanks for the reply, really appreciated.

The first command just hangs with no output:
# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog journal inspect

The second command returns:

# cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal inspect
Overall journal integrity: OK
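
Should I take a backup of the journals before trying anything else? I was thinking of something along these lines (the target paths are just examples):

# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog journal export /root/mdlog-backup.bin
# cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal export /root/purge_queue-backup.bin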

root@rke-sh1-2:~# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-2.isqjza
-- Logs begin at Fri 2024-02-23 04:49:32 UTC, end at Fri 2024-02-23 13:08:22 UTC. --
Feb 23 07:46:46 rke-sh1-2 bash[1058012]: ignoring --setuser ceph since I am not root
Feb 23 07:46:46 rke-sh1-2 bash[1058012]: ignoring --setgroup ceph since I am not root
Feb 23 07:46:46 rke-sh1-2 bash[1058012]: starting mds.cephfs.rke-sh1-2.isqjza at
Feb 23 08:15:06 rke-sh1-2 bash[1058012]: debug 2024-02-23T08:15:06.371+0000 7fbc17dd9700 -1 mds.pinger is_rank_lagging: rank=0 was never sent ping request.
Feb 23 08:15:13 rke-sh1-2 bash[1058012]: debug 2024-02-23T08:15:13.155+0000 7fbc145d2700 -1 log_channel(cluster) log [ERR] : failed to commit dir 0x1 object, errno -22
Feb 23 08:15:13 rke-sh1-2 bash[1058012]: debug 2024-02-23T08:15:13.155+0000 7fbc145d2700 -1 mds.0.12487 unhandled write error (22) Invalid argument, force readonly...
Feb 23 10:20:36 rke-sh1-2 bash[1058012]: debug 2024-02-23T10:20:36.309+0000 7fbc17dd9700 -1 mds.pinger is_rank_lagging: rank=1 was never sent ping request.

root@rke-sh1-3:~# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-3.vdicdn
-- Logs begin at Fri 2024-02-23 06:59:48 UTC, end at Fri 2024-02-23 13:09:18 UTC. --
Feb 23 07:46:46 rke-sh1-3 bash[2901]: ignoring --setuser ceph since I am not root
Feb 23 07:46:46 rke-sh1-3 bash[2901]: ignoring --setgroup ceph since I am not root
Feb 23 07:46:46 rke-sh1-3 bash[2901]: starting mds.cephfs.rke-sh1-3.vdicdn at
Feb 23 10:25:51 rke-sh1-3 bash[2901]: ignoring --setuser ceph since I am not root
Feb 23 10:25:51 rke-sh1-3 bash[2901]: ignoring --setgroup ceph since I am not root
Feb 23 10:25:51 rke-sh1-3 bash[2901]: starting mds.cephfs.rke-sh1-3.vdicdn at

-- Logs begin at Fri 2024-02-23 00:24:42 UTC, end at Fri 2024-02-23 13:09:55 UTC. --
Feb 23 09:29:10 rke-sh1-1 bash[786820]: tcmalloc: large alloc 1073750016 bytes == 0x5598512de000 @ 0x7fb426636760 0x7fb426657c64 0x5597c1ccaaba 0x7fb41bc04218 0x7fb41bc0ed5b 0x7fb41bbfeda4 0x7fb41da6>
Feb 23 09:29:19 rke-sh1-1 bash[786820]: tcmalloc: large alloc 2147491840 bytes == 0x559891ae0000 @ 0x7fb426636760 0x7fb426657c64 0x5597c1ccaaba 0x7fb41bc04218 0x7fb41bc0ed5b 0x7fb41bbfeda4 0x7fb41db3>
Feb 23 09:29:26 rke-sh1-1 bash[786820]: tcmalloc: large alloc 2147491840 bytes == 0x559951ae4000 @ 0x7fb426636760 0x7fb426657c64 0x5597c1ccaaba 0x7fb41bc04218 0x7fb41bc0ed5b 0x7fb41bbfeda4 0x7fb41da6>
Feb 23 09:29:27 rke-sh1-1 bash[786820]: debug 2024-02-23T09:29:27.928+0000 7fb416d63700 -1 asok(0x5597c3904000) AdminSocket: error writing response length (32) Broken pipe
Feb 23 12:35:53 rke-sh1-1 bash[786820]: ignoring --setuser ceph since I am not root
Feb 23 12:35:53 rke-sh1-1 bash[786820]: ignoring --setgroup ceph since I am not root
Feb 23 12:35:53 rke-sh1-1 bash[786820]: starting mds.cephfs.rke-sh1-1.ojmpnk at


The MDS logs are at debug level 20; do you want me to provide them as an archive?

Is there a way to compact all the logs?
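
If you want them as an archive, I would grab them per daemon roughly like this and then compress the result (the output path is just an example):

# cephadm logs --fsid fcb373ce-7aaa-11eb-984f-e7c6e0038e87 --name mds.cephfs.rke-sh1-2.isqjza > /tmp/mds.cephfs.rke-sh1-2.isqjza.log
# gzip /tmp/mds.cephfs.rke-sh1-2.isqjza.log

(and the same for the other two MDS daemons on their respective hosts).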

Best Regards,

Edouard FAZENDA
Technical Support



Chemin du Curé-Desclouds 2, CH-1226 THONEX  +41 (0)22 869 04 40

www.csti.ch

-----Original Message-----
From: Eugen Block <eblock@xxxxxx>
Sent: Friday, 23 February 2024 12:50
To: ceph-users@xxxxxxx
Subject:  Re: MDS in ReadOnly and 2 MDS behind on trimming

Hi,

the mds log should contain information why it goes into read-only mode. Just a few weeks ago I helped a user with a broken CephFS (MDS went into read-only mode because of missing objects in the journal).
Can you check the journal status:

# cephfs-journal-tool --rank=cephfs:0 --journal=mdlog journal inspect

# cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal inspect

and also share the logs.
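
If the default log level does not show enough, you can temporarily raise the MDS debug level, e.g.:

# ceph config set mds debug_mds 20
# ceph config set mds debug_ms 1

and then collect the output with cephadm logs --name mds.<daemon> on the respective host (remember to set the levels back afterwards).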

Thanks,
Eugen

Quoting Edouard FAZENDA <e.fazenda@xxxxxxx>:

Dear Ceph Community,



I am having an issue with my Ceph cluster: several OSDs were crashing, but they are now active and recovery has finished. However, the CephFS filesystem can no longer be accessed read-write by clients (K8s workloads), as one MDS is in read-only mode and two are behind on trimming.



The CephFS volume itself seems OK.



The trimming process does not seem to be making any progress; maybe it is stuck?
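
(I am judging this mainly from the segment counter in the health output, e.g.:

# ceph health detail | grep num_segments

and num_segments does not seem to go down.)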



We are running 3 hosts on Ceph Pacific, version 16.2.1.



Here some logs on the situation :



ceph versions

{
    "mon": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 18
    },
    "mds": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 3
    },
    "rgw": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 6
    },
    "overall": {
        "ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)": 33
    }
}



ceph orch ps

NAME                          HOST       STATUS         REFRESHED  AGE  PORTS          VERSION  IMAGE ID      CONTAINER ID
crash.rke-sh1-1               rke-sh1-1  running (21h)  36s ago    21h  -              16.2.1   c757e4a3636b  e8652edb2b49
crash.rke-sh1-2               rke-sh1-2  running (21h)  3m ago     20M  -              16.2.1   c757e4a3636b  a1249a605ee0
crash.rke-sh1-3               rke-sh1-3  running (17h)  36s ago    17h  -              16.2.1   c757e4a3636b  026667bc1776
mds.cephfs.rke-sh1-1.ojmpnk   rke-sh1-1  running (18h)  36s ago    4M   -              16.2.1   c757e4a3636b  9b4c2b08b759
mds.cephfs.rke-sh1-2.isqjza   rke-sh1-2  running (18h)  3m ago     23M  -              16.2.1   c757e4a3636b  71681a5f34d3
mds.cephfs.rke-sh1-3.vdicdn   rke-sh1-3  running (17h)  36s ago    3M   -              16.2.1   c757e4a3636b  e89946ad6b7e
mgr.rke-sh1-1.qskoyj          rke-sh1-1  running (21h)  36s ago    2y   *:8082 *:9283  16.2.1   c757e4a3636b  7ce7cfbb3e55
mgr.rke-sh1-2.lxmguj          rke-sh1-2  running (21h)  3m ago     22M  *:8082 *:9283  16.2.1   c757e4a3636b  5a0025adfd46
mgr.rke-sh1-3.ckunvo          rke-sh1-3  running (17h)  36s ago    6M   *:8082 *:9283  16.2.1   c757e4a3636b  2fcaf18f3218
mon.rke-sh1-1                 rke-sh1-1  running (20h)  36s ago    20h  -              16.2.1   c757e4a3636b  c0a90103cabc
mon.rke-sh1-2                 rke-sh1-2  running (21h)  3m ago     3M   -              16.2.1   c757e4a3636b  f4b32ba4466b
mon.rke-sh1-3                 rke-sh1-3  running (17h)  36s ago    17h  -              16.2.1   c757e4a3636b  d5e44c245998
osd.0                         rke-sh1-2  running (20h)  3m ago     2y   -              16.2.1   c757e4a3636b  7b0e69942c15
osd.1                         rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  4451654d9a2d
osd.10                        rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  3f9d5f95e284
osd.11                        rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  db1cc6d2e37f
osd.12                        rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  de416c1ef766
osd.13                        rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  25a281cc5a9b
osd.14                        rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  62f25ba61667
osd.15                        rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  d3514d823c45
osd.16                        rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  bba857759bfe
osd.17                        rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  59281d4bb3d0
osd.2                         rke-sh1-1  running (21h)  36s ago    2y   -              16.2.1   c757e4a3636b  418041b5e60d
osd.3                         rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  04a0e29d5623
osd.4                         rke-sh1-1  running (20h)  36s ago    2y   -              16.2.1   c757e4a3636b  1cc78a5153d3
osd.5                         rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  39a4b11e31fb
osd.6                         rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  2f218ffb566e
osd.7                         rke-sh1-1  running (20h)  36s ago    2y   -              16.2.1   c757e4a3636b  cf761fbe4d5f
osd.8                         rke-sh1-3  running (17h)  36s ago    2y   -              16.2.1   c757e4a3636b  f9f85480e800
osd.9                         rke-sh1-2  running (21h)  3m ago     2y   -              16.2.1   c757e4a3636b  664c54ff46d2
rgw.default.rke-sh1-1.dgucwl  rke-sh1-1  running (21h)  36s ago    22M  *:8000         16.2.1   c757e4a3636b  f03212b955a7
rgw.default.rke-sh1-1.vylchc  rke-sh1-1  running (21h)  36s ago    22M  *:8001         16.2.1   c757e4a3636b  da486ce43fe5
rgw.default.rke-sh1-2.dfhhfw  rke-sh1-2  running (21h)  3m ago     2y   *:8000         16.2.1   c757e4a3636b  ef4089d0aef2
rgw.default.rke-sh1-2.efkbum  rke-sh1-2  running (21h)  3m ago     2y   *:8001         16.2.1   c757e4a3636b  9e053d5a2f7b
rgw.default.rke-sh1-3.krfgey  rke-sh1-3  running (17h)  36s ago    9M   *:8001         16.2.1   c757e4a3636b  45cd3d75edd3
rgw.default.rke-sh1-3.pwdbmp  rke-sh1-3  running (17h)  36s ago    9M   *:8000         16.2.1   c757e4a3636b  e2710265a7f4



ceph health detail

HEALTH_WARN 1 MDSs are read only; 2 MDSs behind on trimming
[WRN] MDS_READ_ONLY: 1 MDSs are read only
    mds.cephfs.rke-sh1-2.isqjza(mds.0): MDS in read-only mode
[WRN] MDS_TRIM: 2 MDSs behind on trimming
    mds.cephfs.rke-sh1-2.isqjza(mds.0): Behind on trimming (2149/128) max_segments: 128, num_segments: 2149
    mds.cephfs.rke-sh1-1.ojmpnk(mds.0): Behind on trimming (2149/128) max_segments: 128, num_segments: 2149



root@rke-sh1-1:~# ceph fs status

cephfs - 27 clients
======
RANK      STATE                 MDS               ACTIVITY     DNS    INOS   DIRS   CAPS
 0        active      cephfs.rke-sh1-2.isqjza  Reqs:    8 /s  85.2k  53.2k  1742    101
0-s   standby-replay  cephfs.rke-sh1-1.ojmpnk  Evts:    0 /s  52.2k  20.2k  1737      0
      POOL         TYPE     USED  AVAIL
cephfs_metadata  metadata  1109G  6082G
  cephfs_data      data    8419G  6082G
      STANDBY MDS
cephfs.rke-sh1-3.vdicdn
MDS version: ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)



ceph status

  cluster:
    id:     fcb373ce-7aaa-11eb-984f-e7c6e0038e87
    health: HEALTH_WARN
            1 MDSs are read only
            2 MDSs behind on trimming

  services:
    mon: 3 daemons, quorum rke-sh1-2,rke-sh1-1,rke-sh1-3 (age 17h)
    mgr: rke-sh1-1.qskoyj(active, since 17h), standbys: rke-sh1-2.lxmguj, rke-sh1-3.ckunvo
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 18 osds: 18 up (since 17h), 18 in (since 20h)
    rgw: 6 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 849 pgs
    objects: 10.10M objects, 5.3 TiB
    usage:   11 TiB used, 15 TiB / 26 TiB avail
    pgs:     849 active+clean

  io:
    client:   35 KiB/s rd, 1.0 MiB/s wr, 302 op/s rd, 165 op/s wr


# ceph mds stat

cephfs:1 {0=cephfs.rke-sh1-2.isqjza=up:active} 1 up:standby-replay 1 up:standby



Do you have any idea what my next steps could be to bring the cluster back to a healthy state?



Any help would be very much appreciated.



Thanks a lot for your feedback.



Best Regards,



Edouard FAZENDA

Technical Support







Chemin du Curé-Desclouds 2, CH-1226 THONEX  +41 (0)22 869 04 40



www.csti.ch



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



