Hi,
do you have log output from the read-only MDS, ideally in debug mode?
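If not, something along these lines should raise the MDS verbosity (just a sketch; mds.gml--cephfs-a is taken from your output below, adjust to whichever daemon is the active, read-only one):

ceph config set mds debug_mds 20
ceph config set mds debug_ms 1
# or only for the affected daemon, at runtime:
ceph tell mds.gml--cephfs-a config set debug_mds 20

Then reproduce a failing write and attach the relevant part of the MDS log.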
Quoting kreept.sama@xxxxxxxxx:
Hello everyone, and sorry to bother you. Maybe someone has already faced this problem.
A day ago we restored our OpenShift cluster; however, at the moment the
PVCs cannot be attached to the pods. We looked at the Ceph status and
found that our MDS daemons were in standby mode, and then discovered
that the metadata was corrupted. After some manipulation we were able
to bring our MDS daemons back up, but the cluster still does not accept
writes; the ceph status command shows the following.
sh-4.4$ ceph -s
  cluster:
    id:     9213604e-b0b6-49d5-bcb3-f55ab3d79119
    health: HEALTH_ERR
            1 MDSs report damaged metadata
            1 MDSs are read only
            6 daemons have recently crashed

  services:
    mon: 5 daemons, quorum bd,bj,bm,bn,bo (age 26h)
    mgr: a(active, since 25h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 9 osds: 9 up (since 41h), 9 in (since 42h)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   10 pools, 225 pgs
    objects: 1.60M objects, 234 GiB
    usage:   606 GiB used, 594 GiB / 1.2 TiB avail
    pgs:     225 active+clean

  io:
    client: 852 B/s rd, 1 op/s rd, 0 op/s wr
Now we are trying to follow these instructions:
https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects
What else we have tried:
cephfs-journal-tool --rank=1:0 event recover_dentries summary
cephfs-journal-tool --rank=1:0 journal reset
cephfs-table-tool all reset session
ceph tell mds.gml--cephfs-a scrub start / recursive repair force
ceph tell mds.gml--cephfs-b scrub start / recursive repair force
ceph mds repaired 0
ceph tell mds.gml--cephfs-a damage ls
[
{
"damage_type": "dir_frag",
"id": 26851730,
"ino": 1100162409473,
"frag": "*",
"path":
"/volumes/csi/csi-vol-5ad18c03-3205-11ed-9ba7-0a580a810206/e5664004-51e0-4bff-85c8-029944b431d8/store/096/096a1497-78ab-4802-a5a7-d09e011fd3a5/202301_1027796_1027796_0"
},
………
{
"damage_type": "dir_frag",
"id": 118336643,
"ino": 1100162424469,
"frag": "*",
"path":
"/volumes/csi/csi-vol-5ad18c03-3205-11ed-9ba7-0a580a810206/e5664004-51e0-4bff-85c8-029944b431d8/store/096/096a1497-78ab-4802-a5a7-d09e011fd3a5/202301_1027832_1027832_0"
},
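As far as we understand, once the repair scrub really fixes these dir_frag entries we should be able to check progress and clear the stale entries from the damage table, roughly like this (we have not tried this yet; the id is the first one from the listing above):

ceph tell mds.gml--cephfs-a scrub status
ceph tell mds.gml--cephfs-a damage rm 26851730

Is that correct?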
Now we are trying:
# Session table
cephfs-table-tool 0 reset session
# SnapServer
cephfs-table-tool 0 reset snap
# InoTable
cephfs-table-tool 0 reset inode
# Journal
cephfs-journal-tool --rank=0 journal reset
# Root inodes ("/" and MDS directory)
cephfs-data-scan init
cephfs-data-scan scan_extents <data pool>
cephfs-data-scan scan_inodes <data pool>
cephfs-data-scan scan_links
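As far as we understand from the same page, the MDS daemons must not be running while the data-scan tools are used, and scan_extents/scan_inodes can be parallelised across several workers, roughly like this (just a sketch; we assume our filesystem is named gml--cephfs and we have not run this yet):

# take the filesystem offline before using the recovery tools
ceph fs fail gml--cephfs
# scan_extents/scan_inodes: run one instance per worker, here 4 workers (0..3),
# and let all scan_extents workers finish before starting scan_inodes
cephfs-data-scan scan_extents --worker_n 0 --worker_m 4 <data pool>
cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 <data pool>
cephfs-data-scan scan_links
# afterwards, allow an MDS to take the rank again
ceph fs set gml--cephfs joinable true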
Is this the right way, and can it be our salvation?
Thank you!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx