Re: cephfs-data-scan safety on active filesystem

Hi Gregg, John, 

Thanks for the warning. It was definitely conveyed that they are dangerous. I thought the online part was implied to be a bad idea, but just wanted to verify.

John,

We were mostly operating from what the mds logs reported. After bringing the mds back online and active, we mounted the volume with the kernel driver on one host and started a recursive ls from the root of the filesystem to see what was broken. Two main paths of the tree seemed to be affected initially, both reporting errors like the following in the mds log (I’ve swapped out the paths):
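For anyone following along, the mount-and-walk step looked roughly like this (a sketch only; the monitor address, mountpoint, and secret file are placeholders, and DRY_RUN just prints the commands rather than touching a cluster):

```shell
#!/bin/sh
# Sketch: mount the filesystem with the kernel client on one host, then
# walk the whole tree so the mds logs any damaged directories it touches.
# MON, MNT, and the secretfile path are hypothetical placeholders.
MON=192.0.2.1:6789
MNT=/mnt/cephfs
DRY_RUN=1   # set to 0 only on a real client host

run() {
    if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $*"; else "$@"; fi
}

run mount -t ceph "${MON}:/" "$MNT" -o name=admin,secretfile=/etc/ceph/admin.secret
# Recursively list everything; breakage shows up here and in the mds log.
run ls -laR "$MNT"
```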

Group 1:
2018-05-04 12:04:38.004029 7fc81f69a700 -1 log_channel(cluster) log [ERR] : dir 0x10011125556 object missing on disk; some files may be lost (/cephfs/redacted1/path/dir1) 
2018-05-04 12:04:38.028861 7fc81f69a700 -1 log_channel(cluster) log [ERR] : dir 0x1001112bf14 object missing on disk; some files may be lost (/cephfs/redacted1/path/dir2)
2018-05-04 12:04:38.030504 7fc81f69a700 -1 log_channel(cluster) log [ERR] : dir 0x10011131118 object missing on disk; some files may be lost (/cephfs/redacted1/path/dir3) 

Group 2:
2018-05-04 13:24:29.495892 7fc81f69a700 -1 log_channel(cluster) log [ERR] : dir 0x1001102c5f6 object missing on disk; some files may be lost (/cephfs/redacted2/path/dir1)

Some of the paths it complained about appeared empty via ls, although trying to rm [-r] them through the mount failed with an error suggesting files still existed in the directory. We removed the dir objects in the metadata pool that it was still warning about (rados -p metapool rm 10011125556.0000, for example). This cleaned up the errors on that path. We then did the same for Group 2.
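The cleanup step amounted to something like the following (a sketch, not exactly what we ran; the pool name and inode numbers come from the log excerpts above, and I’ve written the fragment suffix as the standard 8-hex-digit form "00000000" for fragment 0):

```shell
#!/bin/sh
# Sketch: remove the dirfrag objects the mds complained about from the
# metadata pool. Dirfrag objects are named <inode-hex>.<frag>, with the
# first (usually only) fragment being 00000000. DRY_RUN guards against
# accidentally deleting metadata objects.
POOL=metapool
DRY_RUN=1   # set to 0 only after double-checking each inode number

remove_dirfrag() {
    ino="$1"
    cmd="rados -p ${POOL} rm ${ino}.00000000"
    if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $cmd"; else $cmd; fi
}

remove_dirfrag 10011125556   # Group 1, dir1
remove_dirfrag 1001102c5f6   # Group 2, dir1
```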

After this, we initiated a recursive scrub with the mds daemon on the root of the filesystem to run over the weekend.

In retrospect, we probably should have run the data-scan steps from the disaster recovery guide before bringing the system online. The cluster is currently healthy (or rather, reporting healthy) and has been for a while.

My understanding here is that we would need something like the cephfs-data-scan steps to recreate metadata, or at least to identify (for cleanup) objects that may have been stranded in the data pool. Is there any way, likely with another tool, to do this on an active cluster? If not, is this something that can be done with some amount of safety on an offline system? (I'm not sure how long it would take; the data pool is ~100 TB with 242 million objects, and downtime is a big pain point for our users with deadlines.)
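For reference, my reading of the guide is that the offline sequence would look roughly like this (a sketch under my assumptions, not something we've run; the data pool name is a placeholder, and the mds daemons would have to be stopped and the fs marked down first):

```shell
#!/bin/sh
# Sketch: offline cephfs-data-scan sequence from the disaster-recovery
# guide. DATA_POOL is a hypothetical placeholder; DRY_RUN only prints
# the commands. The filesystem must be fully offline before running
# these for real.
DATA_POOL=cephfs_data
DRY_RUN=1

run() {
    if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $*"; else "$@"; fi
}

run cephfs-data-scan init
# scan_extents and scan_inodes walk every object in the data pool; with
# ~242M objects, multiple parallel instances can be sharded via
# "--worker_n <i> --worker_m <N>" to cut the wall-clock time.
run cephfs-data-scan scan_extents "$DATA_POOL"
run cephfs-data-scan scan_inodes "$DATA_POOL"
run cephfs-data-scan scan_links
```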

Thanks,

Ryan

> On May 8, 2018, at 5:05 AM, John Spray <jspray@xxxxxxxxxx> wrote:
> 
> On Mon, May 7, 2018 at 8:50 PM, Ryan Leimenstoll
> <rleimens@xxxxxxxxxxxxxx> wrote:
>> Hi All,
>> 
>> We recently experienced a failure with our 12.2.4 cluster running a CephFS
>> instance that resulted in some data loss due to a seemingly problematic OSD
>> blocking IO on its PGs. We restarted the (single active) mds daemon during
>> this, which caused damage due to the journal not having the chance to flush
>> back. We reset the journal, session table, and fs to bring the filesystem
>> online. We then removed some directories/inodes that were causing the
>> cluster to report damaged metadata (and were otherwise visibly broken by
>> navigating the filesystem).
> 
> This may be over-optimistic of me, but is there any chance you kept a
> detailed record of exactly what damage was reported, and what you did
> to the filesystem so far?  It's hard to give any intelligent advice on
> repairing it, when we don't know exactly what was broken, and a bunch
> of unknown repair-ish things have already manipulated the metadata
> behind the scenes.
> 
> John
> 
>> With that, there are now some paths that seem to have been orphaned (which
>> we expected). We did not run the ‘cephfs-data-scan’ tool [0] in the name of
>> getting the system back online ASAP. Now that the filesystem is otherwise
>> stable, can we initiate a scan_links operation with the mds active safely?
>> 
>> [0]
>> http://docs.ceph.com/docs/luminous/cephfs/disaster-recovery/#recovery-from-missing-metadata-objects
>> 
>> Thanks much,
>> Ryan Leimenstoll
>> rleimens@xxxxxxxxxxxxxx
>> University of Maryland Institute for Advanced Computer Studies
>> 
>> 
>> 
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 




