Hi,
I went through similar trouble just this week [1], but the root cause
seems different, so it probably won't apply to your case.
Which version of Ceph are you running? There are a couple of reports
with similar error messages, e.g. [2]; it may already have been resolved.
Can you share the output of
rados list-inconsistent-obj 2.44
and
ceph tell mds.<MDS> damage ls
The pool size is 3, right?
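
In case it helps, here is a rough sketch for collecting that information in one go. It assumes the metadata pool is cephfs_metadata and the inconsistent PG is 2.44, as in the output quoted below. MDS_NAME, diag_cmds and ceph-diag.txt are placeholder names chosen for this example, and `ceph tell ... damage ls` may not return anything useful while no MDS is active.

```shell
#!/bin/sh
# Sketch only: gather the details asked for above into one file.
PG="2.44"                    # inconsistent PG from `ceph health detail`
POOL="cephfs_metadata"       # metadata pool from `ceph osd lspools`
MDS_NAME="${MDS_NAME:-a}"    # placeholder; replace with a real MDS daemon name

# Print the diagnostic commands, one per line (hypothetical helper).
diag_cmds() {
  printf '%s\n' \
    "ceph versions" \
    "ceph osd pool get $POOL size" \
    "rados list-inconsistent-obj $1 --format=json-pretty" \
    "ceph tell mds.$2 damage ls"
}

if command -v ceph >/dev/null 2>&1; then
  # Run each command and record its output under a header line.
  diag_cmds "$PG" "$MDS_NAME" | while IFS= read -r cmd; do
    printf '### %s\n' "$cmd"
    sh -c "$cmd" </dev/null 2>&1
  done > ceph-diag.txt
else
  # No ceph CLI on this host: just show what would be run.
  diag_cmds "$PG" "$MDS_NAME"
fi
```

All four commands are read-only, so running them should not change the state of the cluster.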
Regards,
Eugen
Quoting Sagara Wijetunga <sagarawmw@xxxxxxxxx>:
Hi all
An accidental power failure happened.
As a result, CephFS is offline and cannot be mounted.
I have 3 MDS daemons, but the cluster reports "1 mds daemon damaged".
It seems a PG of cephfs_metadata is inconsistent. I tried to repair it,
but the repair does not succeed.
How do I repair the damaged MDS and bring the CephFS up/online?
Details are included below.
Many thanks in advance.
Sagara
# ceph -s
cluster:
id: abc...
health: HEALTH_ERR
1 filesystem is degraded
1 filesystem is offline
1 mds daemon damaged
4 scrub errors
Possible data damage: 1 pg inconsistent
services:
mon: 3 daemons, quorum a,b,c (age 107s)
mgr: a(active, since 22m), standbys: b, c
mds: cephfs:0/1 3 up:standby, 1 damaged
osd: 3 osds: 3 up (since 96s), 3 in (since 96s)
data:
pools: 3 pools, 192 pgs
objects: 281.05k objects, 327 GiB
usage: 2.4 TiB used, 8.1 TiB / 11 TiB avail
pgs: 191 active+clean
1 active+clean+inconsistent
# ceph health detail
HEALTH_ERR 1 filesystem is degraded; 1 filesystem is offline; 1 mds
daemon damaged; 4 scrub errors; Possible data damage: 1 pg
inconsistent
FS_DEGRADED 1 filesystem is degraded
fs cephfs is degraded
MDS_ALL_DOWN 1 filesystem is offline
fs cephfs is offline because no MDS is active for it.
MDS_DAMAGE 1 mds daemon damaged
fs cephfs mds.0 is damaged
OSD_SCRUB_ERRORS 4 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 2.44 is active+clean+inconsistent, acting [0,2,1]
# ceph osd lspools
2 cephfs_metadata
3 cephfs_data
4 rbd
# ceph pg repair 2.44
# ceph -w
2021-05-22 01:48:04.775783 osd.0 [ERR] 2.44 shard 0 soid
2:22efaf6a:::200.00006048:head : candidate size 1540096 info size
1555896 mismatch
2021-05-22 01:48:04.775786 osd.0 [ERR] 2.44 shard 1 soid
2:22efaf6a:::200.00006048:head : candidate size 1540096 info size
1555896 mismatch
2021-05-22 01:48:04.775787 osd.0 [ERR] 2.44 shard 2 soid
2:22efaf6a:::200.00006048:head : candidate size 1441792 info size
1555896 mismatch
2021-05-22 01:48:04.775789 osd.0 [ERR] 2.44 soid
2:22efaf6a:::200.00006048:head : failed to pick suitable object info
2021-05-22 01:48:04.775849 osd.0 [ERR] repair 2.44
2:22efaf6a:::200.00006048:head : on disk size (1540096) does not
match object info size (1555896) adjusted for ondisk to (1555896)
2021-05-22 01:48:04.787167 osd.0 [ERR] 2.44 repair 4 errors, 0 fixed
--- End of detail ---
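
The scrub errors above can be summarized per shard with a small awk sketch. It assumes each `ceph -w` message is on a single line (the mail client wrapped them here). summarize_shards is a hypothetical helper name, and the sample input is copied from the three "candidate size" messages above.

```shell
#!/bin/sh
# Sketch only: compare the on-disk size of each shard with the size
# recorded in the object info, based on the "candidate size" log lines.
summarize_shards() {
  awk '/ shard [0-9]+ soid .* candidate size / {
    # $7 = shard id, $13 = on-disk size, $16 = size recorded in object info
    printf "shard %s: on-disk %d, recorded %d, diff %d\n", $7, $13, $16, $16 - $13
  }'
}

summary=$(summarize_shards <<'EOF'
2021-05-22 01:48:04.775783 osd.0 [ERR] 2.44 shard 0 soid 2:22efaf6a:::200.00006048:head : candidate size 1540096 info size 1555896 mismatch
2021-05-22 01:48:04.775786 osd.0 [ERR] 2.44 shard 1 soid 2:22efaf6a:::200.00006048:head : candidate size 1540096 info size 1555896 mismatch
2021-05-22 01:48:04.775787 osd.0 [ERR] 2.44 shard 2 soid 2:22efaf6a:::200.00006048:head : candidate size 1441792 info size 1555896 mismatch
EOF
)
printf '%s\n' "$summary"
```

On these three lines every shard comes out smaller than the recorded 1555896 bytes (by 15800, 15800 and 114104 bytes). No replica matches the object info, which would be consistent with the "failed to pick suitable object info" message and the "0 fixed" result.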
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx