Re: MDS_DAMAGE dir_frag

Hi Greg,

On Mon, 12 Dec 2022, Gregory Farnum wrote:

> On Mon, Dec 12, 2022 at 12:10 PM Sascha Lucas <ceph-users@xxxxxxxxx> wrote:

> > A follow-up of [2] also mentioned having random meta-data corruption: "We
> > have 4 clusters (all running same version) and have experienced meta-data
> > corruption on the majority of them at some time or the other"


> Jewel (and upgrading from that version) was much less stable than Luminous
> (when we declared the filesystem “awesome” and said the Ceph upstream
> considered it production-ready), and things have generally gotten better
> with every release since then.

I see. So the corruption cited there relates to older releases...

> > [3] tells me that metadata damage can happen either from data loss (which
> > I'm convinced I don't have) or from software bugs. The latter would be
> > worth fixing. Is there a way to find the root cause?


> Yes, we’d very much like to understand this. What versions of the server
> and kernel client are you using? What platform stack — I see it looks like
> you are using CephFS through the volumes interface? The simplest
> possibility I can think of here is that you are running with a bad kernel
> and it used async ops poorly, maybe? But I don't remember other spontaneous
> corruptions of this type anytime recent.

Ceph "servers" like MONs, OSDs, MDSs etc. are all 17.2.5/cephadm/podman. The filesystem kernel clients are co-located on the same hosts running the "servers". For some other reason OS is still RHEL 8.5 (yes with community ceph). Kernel is 4.18.0-348.el8.x86_64 from release media. Just one filesystem kernel client is at 4.18.0-348.23.1.el8_5.x86_64 from EOL of 8.5.

Are there known issues with these kernel versions?

> Have you run a normal forward scrub (which is non-disruptive) to check if
> there are other issues?

So far I haven't dared, but will do so tomorrow.
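
In case it is useful to others reading along, the plan is something along these lines (assuming the filesystem is named "cephfs"; the name is just a placeholder):

  ceph tell mds.cephfs:0 scrub start / recursive
  ceph tell mds.cephfs:0 scrub status
  ceph tell mds.cephfs:0 damage ls

That starts a recursive forward scrub from the root without the "repair" flag, then checks its progress and lists any damage entries the MDS has recorded.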

Thanks, Sascha.

[2] https://www.spinics.net/lists/ceph-users/msg53202.html
[3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair