Hi Sascha,

On Tue, Dec 13, 2022 at 6:43 PM Sascha Lucas <ceph-users@xxxxxxxxx> wrote:
>
> Hi,
>
> On Mon, 12 Dec 2022, Sascha Lucas wrote:
>
> > On Mon, 12 Dec 2022, Gregory Farnum wrote:
>
> >> Yes, we’d very much like to understand this. What versions of the server
> >> and kernel client are you using? What platform stack — I see it looks like
> >> you are using CephFS through the volumes interface? The simplest
> >> possibility I can think of here is that you are running with a bad kernel
> >> and it used async ops poorly, maybe? But I don’t remember other spontaneous
> >> corruptions of this type anytime recent.
> >
> > Ceph "servers" like MONs, OSDs, MDSs etc. are all 17.2.5/cephadm/podman. The
> > filesystem kernel clients are co-located on the same hosts running the
> > "servers". For some other reason OS is still RHEL 8.5 (yes with community
> > ceph). Kernel is 4.18.0-348.el8.x86_64 from release media. Just one
> > filesystem kernel client is at 4.18.0-348.23.1.el8_5.x86_64 from EOL of 8.5.
> >
> > Are there known issues with these kernel versions?
> >
> >> Have you run a normal forward scrub (which is non-disruptive) to check if
> >> there are other issues?
> >
> > So far I haven't dared, but will do so tomorrow.
>
> Just an update: "scrub / recursive,repair" does not uncover additional
> errors. But also does not fix the single dirfrag error.

File system scrub does not clear entries from the damage list. The damage
type you are running into ("dir_frag") implies that the object for directory
"V_7770505" is lost from the metadata pool, which makes the files under that
directory unavailable.

The good news is that you can regenerate the lost object by scanning the
data pool. This is documented here:

https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects

(You won't need to run the cephfs-table-tool or cephfs-journal-tool commands,
though. Also, this could take time if you have lots of objects in the data
pool.)

Since you mention that you do not see the directory "CV_MAGNETIC" and no
other scrub errors are reported, it's possible that the application using
CephFS removed it because it was no longer needed (the data pool might still
have some leftover objects, though).

> Thanks, Sascha.
>
> [2] https://www.spinics.net/lists/ceph-users/msg53202.html
> [3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair

--
Cheers,
Venky
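
P.S. In case it helps, here is a rough sketch of the data-pool scan sequence
from the page linked above, plus the scrub/damage checks you could run
afterwards. The angle-bracket names (<data pool>, <fs_name>, <damage_id>)
are placeholders for your own cluster, and the full preconditions (e.g.
taking the file system offline before scanning) are in the documentation,
so please treat this as an outline rather than a recipe:

  # Regenerate missing directory objects by scanning the data pool
  # (this walks every object, so it can take a while on large pools):
  cephfs-data-scan scan_extents <data pool>
  cephfs-data-scan scan_inodes <data pool>
  cephfs-data-scan scan_links

  # Afterwards, re-run the forward scrub you already used and check the
  # damage list; entries that are actually repaired still have to be
  # cleared from the list explicitly:
  ceph tell mds.<fs_name>:0 scrub start / recursive,repair
  ceph tell mds.<fs_name>:0 damage ls
  ceph tell mds.<fs_name>:0 damage rm <damage_id>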