Also, what is the status of your other MDS daemons? Are they active? Or which one was damaged? You could also check whether additional MDS daemons are available in the same cluster. Some rough command sketches for checking and repairing this follow below the quoted thread.

On Sat, 22 May 2021, 00:40 Eugen Block, <eblock@xxxxxx> wrote:

> Hi,
>
> I went through similar trouble just this week [1], but the root cause
> seems different, so it probably won't apply to your case.
> Which version of Ceph are you running? There are a couple of reports
> with similar error messages, e.g. [2]; it may already have been resolved.
>
> Can you share
>
> rados list-inconsistent-obj 2.44
>
> and
>
> ceph tell mds.<MDS> damage ls
>
> The pool size is 3, right?
>
> Regards,
> Eugen
>
> Quoting Sagara Wijetunga <sagarawmw@xxxxxxxxx>:
>
> > Hi all,
> >
> > An accidental power failure happened. As a result, CephFS is offline
> > and cannot be mounted.
> > I have 3 MDS daemons, but the cluster complains "1 mds daemon damaged".
> >
> > It seems a PG of cephfs_metadata is inconsistent. I tried to repair it,
> > but it doesn't get repaired.
> > How do I repair the damaged MDS and bring CephFS back up/online?
> > Details are included below.
> >
> > Many thanks in advance.
> > Sagara
> >
> > # ceph -s
> >   cluster:
> >     id:     abc...
> >     health: HEALTH_ERR
> >             1 filesystem is degraded
> >             1 filesystem is offline
> >             1 mds daemon damaged
> >             4 scrub errors
> >             Possible data damage: 1 pg inconsistent
> >
> >   services:
> >     mon: 3 daemons, quorum a,b,c (age 107s)
> >     mgr: a(active, since 22m), standbys: b, c
> >     mds: cephfs:0/1 3 up:standby, 1 damaged
> >     osd: 3 osds: 3 up (since 96s), 3 in (since 96s)
> >
> >   data:
> >     pools:   3 pools, 192 pgs
> >     objects: 281.05k objects, 327 GiB
> >     usage:   2.4 TiB used, 8.1 TiB / 11 TiB avail
> >     pgs:     191 active+clean
> >              1   active+clean+inconsistent
> >
> > # ceph health detail
> > HEALTH_ERR 1 filesystem is degraded; 1 filesystem is offline; 1 mds
> > daemon damaged; 4 scrub errors; Possible data damage: 1 pg inconsistent
> > FS_DEGRADED 1 filesystem is degraded
> >     fs cephfs is degraded
> > MDS_ALL_DOWN 1 filesystem is offline
> >     fs cephfs is offline because no MDS is active for it.
> > MDS_DAMAGE 1 mds daemon damaged
> >     fs cephfs mds.0 is damaged
> > OSD_SCRUB_ERRORS 4 scrub errors
> > PG_DAMAGED Possible data damage: 1 pg inconsistent
> >     pg 2.44 is active+clean+inconsistent, acting [0,2,1]
> >
> > # ceph osd lspools
> > 2 cephfs_metadata
> > 3 cephfs_data
> > 4 rbd
> >
> > # ceph pg repair 2.44
> >
> > # ceph -w
> > 2021-05-22 01:48:04.775783 osd.0 [ERR] 2.44 shard 0 soid
> > 2:22efaf6a:::200.00006048:head : candidate size 1540096 info size
> > 1555896 mismatch
> > 2021-05-22 01:48:04.775786 osd.0 [ERR] 2.44 shard 1 soid
> > 2:22efaf6a:::200.00006048:head : candidate size 1540096 info size
> > 1555896 mismatch
> > 2021-05-22 01:48:04.775787 osd.0 [ERR] 2.44 shard 2 soid
> > 2:22efaf6a:::200.00006048:head : candidate size 1441792 info size
> > 1555896 mismatch
> > 2021-05-22 01:48:04.775789 osd.0 [ERR] 2.44 soid
> > 2:22efaf6a:::200.00006048:head : failed to pick suitable object info
> > 2021-05-22 01:48:04.775849 osd.0 [ERR] repair 2.44
> > 2:22efaf6a:::200.00006048:head : on disk size (1540096) does not
> > match object info size (1555896) adjusted for ondisk to (1555896)
> > 2021-05-22 01:48:04.787167 osd.0 [ERR] 2.44 repair 4 errors, 0 fixed
> >
> > --- End of detail ---
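To answer the status question at the top: the MDS map line in the output above ("mds: cephfs:0/1 3 up:standby, 1 damaged") shows all three daemons alive as standbys; no daemon process is broken, but rank 0's on-disk metadata is flagged damaged, so no standby will take it over. A minimal sketch for checking this, assuming the file system name cephfs from the output above:

# ceph fs status cephfs
# ceph mds stat
# ceph fs dump | grep -i damaged

Eugen's suggested diagnostics are also easier to read as pretty-printed JSON:

# rados list-inconsistent-obj 2.44 --format=json-pretty
# ceph tell mds.<MDS> damage ls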
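On the scrub errors themselves: all three replicas of 2:22efaf6a:::200.00006048:head disagree with the recorded object-info size (shards 0 and 1 report 1540096 bytes, shard 2 reports 1441792, the object info says 1555896), so repair has no authoritative copy to pick and logs "failed to pick suitable object info". A workaround commonly suggested for this kind of mismatch, offered here as an untested sketch rather than a guaranteed fix, is to read the object out and write it back so the object info is regenerated from a real copy, then re-scrub. Keep the backup file: 200.* objects in the metadata pool belong to the rank-0 MDS journal, and the surviving copies are shorter than the recorded size, so some journal tail is likely already lost.

# rados -p cephfs_metadata get 200.00006048 /root/200.00006048.bak
# rados -p cephfs_metadata put 200.00006048 /root/200.00006048.bak
# ceph pg deep-scrub 2.44
# ceph pg repair 2.44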
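Even once pg 2.44 is active+clean again, rank 0 stays marked damaged in the MDS map until it is explicitly marked repaired. A rough sketch of the documented CephFS journal-recovery steps, assuming the damage is confined to the rank-0 journal (the 200.* objects) and noting that the exact cephfs-journal-tool syntax varies between Ceph releases:

First check whether the journal is readable, and keep a copy of it:

# cephfs-journal-tool --rank=cephfs:0 journal inspect
# cephfs-journal-tool --rank=cephfs:0 journal export /root/mds0-journal.bin

Then flush whatever is recoverable back into the metadata pool and reset
the journal (the reset discards any remaining entries, so run it only
after the export has succeeded):

# cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
# cephfs-journal-tool --rank=cephfs:0 journal reset

Finally mark the rank repaired so one of the standby MDS daemons can
claim it and the file system can come back online:

# ceph mds repaired cephfs:0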