On Mon, Dec 12, 2022 at 12:10 PM Sascha Lucas <ceph-users@xxxxxxxxx> wrote:
> Hi Dhairya,
>
> On Mon, 12 Dec 2022, Dhairya Parmar wrote:
>
> > You might want to look at [1] for this; I also found a relevant thread [2]
> > that could be helpful.
>
> Thanks a lot. I had already found [1,2] too, but did not consider them,
> because I felt I was not having a "disaster": nothing seems broken or
> crashed, all servers/services have been up for weeks, and there have been
> no disk failures, no modifications to the cluster, etc.
>
> Also, the warning box in [1] tells me (as a newbie) not to run any of
> this. Or in other words: not to forcefully start a disaster ;-).
>
> A follow-up of [2] also mentioned random metadata corruption: "We have 4
> clusters (all running same version) and have experienced meta-data
> corruption on the majority of them at some time or the other"

Jewel (and upgrading from that version) was much less stable than Luminous
(when we declared the filesystem "awesome" and Ceph upstream considered it
production-ready), and things have generally gotten better with every
release since then.

> [3] tells me that metadata damage can happen either from data loss (which
> I'm convinced I don't have) or from software bugs. The latter would be
> worth fixing. Is there a way to find the root cause?

Yes, we'd very much like to understand this. What versions of the server
and kernel client are you using? What is your platform stack? It looks
like you are using CephFS through the volumes interface. The simplest
possibility I can think of here is that you are running with a bad kernel
that misused async ops, maybe? But I don't remember other spontaneous
corruptions of this type any time recently.

Have you run a normal forward scrub (which is non-disruptive) to check
whether there are other issues? (Example commands are sketched at the end
of this mail.)
-Greg

> And is going through [1] really the only option? It sounds like being
> offline for days...
>
> At least I know now what dirfrags [4] are.
>
> Thanks, Sascha.
>
> [1] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#disaster-recovery-experts
> [2] https://www.spinics.net/lists/ceph-users/msg53202.html
> [3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair
> [4] https://docs.ceph.com/en/quincy/cephfs/dirfrags/
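
For reference, a forward scrub like the one Greg mentions can be started and
monitored from the ceph CLI roughly as follows. This is only a sketch: the
filesystem name "cephfs" and rank 0 are placeholders for your own setup, and
the exact command form can differ between Ceph releases (older releases expose
the same operations through the MDS admin socket instead of "ceph tell").

    # Start a recursive forward scrub at the root of the filesystem
    # (checking only; no "repair" flag, so nothing is modified).
    ceph tell mds.cephfs:0 scrub start / recursive

    # Check on the progress of the running scrub.
    ceph tell mds.cephfs:0 scrub status

    # List any metadata damage the MDS has recorded so far.
    ceph tell mds.cephfs:0 damage ls

If the scrub turns up problems, the "damage ls" output is usually the most
useful thing to include when following up on the list.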