These steps correspond pretty closely to the disaster recovery procedure at http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
Were you able to replay the journal manually without issues? IIRC, "cephfs-journal-tool recover_dentries" would run out of memory (OOM) in cases where the MDS does so during its own replay, and this has already been discussed on this list.
April 2, 2019 1:37 AM, "Pickett, Neale T" <neale@xxxxxxxx> wrote:
Here is what I wound up doing to fix this:
- Bring down all MDSes so they stop flapping
- Back up journal (as seen in previous message)
- Apply journal manually
- Reset journal manually
- Clear session table
- Clear other tables (not sure I needed to do this)
- Mark FS down
- Mark the rank 0 MDS as failed
- Reset the FS (yes, I really mean it)
- Restart MDSes
- Finally get some sleep
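
For anyone landing here later, the steps above might look roughly like the sketch below. This is not the exact set of commands from the original message: the tool invocations are taken from the Mimic disaster-recovery page linked above, while the file system name `cephfs`, the backup file name `backup.bin`, the use of `systemctl` to stop and start the MDS daemons, and the `run`/`recover_fs` helpers are assumptions made for illustration. Newer releases may also require a `--rank=<fs>:<rank>` argument to cephfs-journal-tool. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the recovery sequence described above.
# Defaults to a dry run that only prints each command; set DRY_RUN=0 to execute.
# These commands are destructive: read the disaster-recovery docs first.

FS_NAME="${FS_NAME:-cephfs}"   # assumption: name of the affected file system
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_fs() {
    # Bring down all MDSes (assumption: systemd units; repeat on every MDS host)
    run systemctl stop ceph-mds.target
    # Back up the journal before touching it
    run cephfs-journal-tool journal export backup.bin
    # Apply the journal manually, then reset it
    run cephfs-journal-tool event recover_dentries summary
    run cephfs-journal-tool journal reset
    # Clear the session table, then the other tables (may not be needed)
    run cephfs-table-tool all reset session
    run cephfs-table-tool all reset snap
    run cephfs-table-tool all reset inode
    # Mark the FS down and fail rank 0
    run ceph fs set "$FS_NAME" down true
    run ceph mds fail 0
    # Reset the FS
    run ceph fs reset "$FS_NAME" --yes-i-really-mean-it
    # Restart the MDSes
    run systemctl start ceph-mds.target
}

recover_fs
```

Running it as-is prints the command list for review; nothing touches the cluster until DRY_RUN=0 is exported.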