(Thanks Yan for confirming the fix - we'll implement it now)

@Marc Yep - 3x replica on the metadata pools.

We have 4 clusters (all running the same version) and have experienced metadata corruption on the majority of them at some time or other - normally a scan fixes it. I suspect it's down to the use case - think LAMP stacks with various Drupal/WordPress caching plugins - running within OpenShift containers and utilising CephFS as the storage backend. These clusters have all been life-cycled up from Jewel, if that matters.

Example:

# ceph osd dump | grep metadata
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 528539 flags hashpspool stripe_width 0 application cephfs

The only other thing of note is that on this particular cluster the metadata pool is quite large for the number of files - see below re 281GiB - the other clusters' metadata pools are a lot smaller for a similar dataset.

# ceph df
GLOBAL:
    SIZE        AVAIL       RAW USED     %RAW USED
    66.9TiB     29.1TiB     37.8TiB      56.44
POOLS:
    NAME                ID     USED        %USED     MAX AVAIL     OBJECTS
    rbd                 0      8.86TiB     64.57     4.86TiB       2637530
    cephfs_data         1      2.59TiB     34.72     4.86TiB       25341863
    cephfs_metadata     2      281GiB      5.34      4.86TiB       6755178

Cheers,
James

On 04/06/2019, 08:59, "Marc Roos" <M.Roos@xxxxxxxxxxxxxxxxx> wrote:

How did this get damaged? You had 3x replication on the pool?

-----Original Message-----
From: Yan, Zheng [mailto:ukernel@xxxxxxxxx]
Sent: dinsdag 4 juni 2019 1:14
To: James Wilkins
Cc: ceph-users
Subject: Re: CEPH MDS Damaged Metadata - recovery steps

On Mon, Jun 3, 2019 at 3:06 PM James Wilkins
<james.wilkins@xxxxxxxxxxxxx> wrote:
>
> Hi all,
>
> After a bit of advice to ensure we're approaching this the right way.
>
> (version: 12.2.12, multi-mds, dirfrag is enabled)
>
> We have corrupt metadata as identified by ceph:
>
>     health: HEALTH_ERR
>             2 MDSs report damaged metadata
>
> Asking the MDS via 'damage ls':
>
> {
>     "damage_type": "dir_frag",
>     "id": 2265410500,
>     "ino": 2199349051809,
>     "frag": "*",
>     "path": "/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html/shop/app/cache/prod/smarty/cache/iqitreviews/simple/21832/1"
> }
>
> We've done the steps outlined here ->
> http://docs.ceph.com/docs/luminous/cephfs/disaster-recovery/ namely:
>
> cephfs-journal-tool --fs:all journal reset (both ranks)
> cephfs-data-scan scan extents / inodes / links has completed
>
> However, when attempting to access the named folder we get:
>
> 2019-05-31 03:16:04.792274 7f56f6fb5700 -1 log_channel(cluster) log
> [ERR] : dir 0x200136b41a1 object missing on disk; some files may be lost
> (/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html/shop/app/cache/prod/smarty/cache/iqitreviews/simple/21832/1)
>
> We get this error, followed shortly by an MDS failover.
>
> Two questions really.
>
> What's not immediately clear from the documentation is: should we/do we
> also need to run the below?
>
> # Session table
> cephfs-table-tool 0 reset session
> # SnapServer
> cephfs-table-tool 0 reset snap
> # InoTable
> cephfs-table-tool 0 reset inode
> # Root inodes ("/" and MDS directory)
> cephfs-data-scan init
>

No, don't do this.

> And secondly - our current train of thought is we need to grab the inode
> number of the parent folder and delete this from the metadata pool via
> 'rados rmomapkey' - is this correct?
>

Yes, find the inode number of directory 21832. Check whether the omap key
'1_head' exists in object <inode of directory in hex>.00000000. If it
exists, remove it.
> Any input appreciated
>
> Cheers,
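
For anyone hitting the same dir_frag damage, a minimal sketch of the check-and-remove step Yan describes is below. It assumes a CephFS client mount at /mnt/cephfs (a hypothetical mountpoint), the cephfs_metadata pool name from the ceph df output above, and that the damaged entry is the child directory named "1" under .../iqitreviews/simple/21832; the backup step, the backup filename and the variable names are illustrative additions, not part of Yan's original instructions.

# Parent directory that holds the damaged "1" entry (path from the 'damage ls'
# output; /mnt/cephfs is an assumed client mountpoint - adjust to your cluster).
PARENT=/mnt/cephfs/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html/shop/app/cache/prod/smarty/cache/iqitreviews/simple/21832

# Find the parent directory's inode number and convert it to hex.
INO_HEX=$(printf '%x' "$(stat -c %i "$PARENT")")

# The directory's first dirfrag object in the metadata pool is <inode-hex>.00000000;
# check whether the dentry key for the child "1" is present. (If the parent
# directory is itself fragmented, the key may live in another <inode-hex>.<frag> object.)
rados -p cephfs_metadata listomapkeys "${INO_HEX}.00000000" | grep -x '1_head'

# Keep a copy of the omap value before touching it (backup filename is arbitrary).
rados -p cephfs_metadata getomapval "${INO_HEX}.00000000" 1_head /root/21832_1_head.bak

# If the key exists, remove it so the MDS no longer references the missing dirfrag object.
rados -p cephfs_metadata rmomapkey "${INO_HEX}.00000000" 1_head

As the MDS log above already warns, anything that lived under the removed entry is lost and would need restoring from backup.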