Re: MDS_DAMAGE dir_frag

Sascha Lucas <ceph-users@xxxxxxxxx> · Mon, 12 Dec 2022 21:09:46 +0100 (CET)

Hi Dhairya,

On Mon, 12 Dec 2022, Dhairya Parmar wrote:

You might want to look at [1] for this, also I found a relevant thread [2]
that could be helpful.

Thanks a lot. I had already found [1] and [2], too, but I did not consider 
them because I did not feel I was having a "disaster". Nothing seems broken 
or crashed: all servers/services have been up for weeks. No disk failures, 
no modifications to the cluster, etc.

Also, the warning box in [1] tells me (as a newbie) not to run any of 
this. Or in other words: not to forcefully start a disaster ;-).

A follow-up to [2] also mentioned random metadata corruption: "We 
have 4 clusters (all running same version) and have experienced meta-data 
corruption on the majority of them at some time or the other"

[3] tells me that metadata damage can happen either from data loss (which 
I'm convinced I don't have) or from software bugs. The latter would be 
worth fixing. Is there a way to find the root cause?

And is going through [1] really the only option? It sounds like being 
offline for days...

At least I now know what dirfrags [4] are.

Thanks, Sascha.

[1] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#disaster-recovery-experts
[2] https://www.spinics.net/lists/ceph-users/msg53202.html
[3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair
[4] https://docs.ceph.com/en/quincy/cephfs/dirfrags/