Re: Feedback on docs after MDS damage/journal corruption

On Tue, Oct 11, 2016 at 12:00 PM, Henrik Korkuc <lists@xxxxxxxxx> wrote:
> Hey,
>
> After a bright idea to pause my 10.2.2 Ceph cluster for a minute to see if it
> would speed up backfill, I managed to corrupt my MDS journal (should that
> happen after a cluster pause/unpause, or is it some sort of bug?). I had
> "Overall journal integrity: DAMAGED", etc.

Uh, pausing/unpausing your RADOS cluster should never do anything
apart from pausing IO.  That's DEFINITELY a severe bug if it
corrupted objects!
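
For anyone following along, the pause flag being toggled here and the
journal check that reports "Overall journal integrity" are:

    # pause/unpause all cluster IO
    ceph osd pause
    ceph osd unpause

    # offline inspection of the MDS journal; this prints the
    # "Overall journal integrity: OK/DAMAGED" summary
    cephfs-journal-tool journal inspect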

> I was following http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/
> and have some questions/feedback:

Caveat: This is a difficult area to document, because the repair tools
operate directly on internal on-disk structures.  If I can use a bad
metaphor: it's like being in an auto garage and asking for
documentation about the tools -- the manual for the wrench doesn't
tell you anything about how to fix the car engine.  Similarly, it's
hard to write useful documentation for the repair tools without also
writing a detailed manual on how all the CephFS internals work.

> * It would be great to have some info when ‘snap’ or ‘inode’ should be reset

You would reset these tables if you knew that for some reason they no
longer matched the reality elsewhere in the metadata.
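
For the record, the table resets on that page look like this (run
them only against an offline filesystem; "all" targets every MDS
rank):

    cephfs-table-tool all reset session
    cephfs-table-tool all reset snap
    cephfs-table-tool all reset inode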

> * It is not clear when MDS start should be attempted

You would start the MDS when you believed that you had done all you
could with offline repair.  Everything on the "disaster recovery" page
is about offline tools.
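
i.e. once the offline steps are done, something like the following
(assuming rank 0 is the damaged rank, and whatever init system you
use for the daemon):

    # clear the "damaged" flag so a daemon can take the rank
    ceph mds repaired 0

    # then start an MDS daemon again
    systemctl start ceph-mds@<id>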

> * Can scan_extents/scan_inodes be run after MDS is running?

These are meant only for offline use.  You could in principle run
scan_extents while an MDS was running, as long as no data writes were
going on.  scan_inodes writes directly into the metadata pool, so it
is certainly not safe to run at the same time as an active MDS.
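
For completeness, the offline sequence from the disaster-recovery
page is (substitute your data pool name; scan_extents in particular
can take a long time on large pools):

    cephfs-data-scan scan_extents <data pool>
    cephfs-data-scan scan_inodes <data pool>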

> * "online MDS scrub" is mentioned in docs. Is it scan_extents/scan_inodes or
> some other command?

That refers to the "forward scrub" functionality inside the MDS,
which is invoked with the "scrub_path" or "tag path" admin socket
commands.
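
e.g. via the admin socket (the exact flags supported vary by release;
"mytag" is just an example tag name):

    ceph daemon mds.<id> scrub_path /some/path recursive
    ceph daemon mds.<id> tag path /some/path mytag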

> Now CephFS seems to be working (I have "mds0: Metadata damage detected", but
> scan_extents is currently running); let's see what happens when I finish
> scan_extents/scan_inodes.
>
> Will these actions solve possible orphaned objects in pools? What else
> should I look into?

A full offline scan_extents/scan_inodes run should re-link orphans
into a top-level lost+found directory (from which you can subsequently
delete them when your MDS is back online).
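
So once the MDS is up and the filesystem is mounted somewhere (say
/mnt/cephfs, just as an example), the cleanup is an ordinary client
operation:

    ls /mnt/cephfs/lost+found
    # after salvaging anything you want to keep:
    rm -rf /mnt/cephfs/lost+found/*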

John

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com