Re: MDS allocates all memory (>500G) replaying, OOM-killed, repeat

These steps correspond closely to the procedure in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
Were you able to replay the journal manually without problems? IIRC, "cephfs-journal-tool event recover_dentries summary" also runs out of memory in cases where the MDS does so during replay, and this has already been discussed on this list.


April 2, 2019 1:37 AM, "Pickett, Neale T" <neale@xxxxxxxx> wrote:

Here is what I wound up doing to fix this:

  • Bring down all MDSes so they stop flapping
  • Back up journal (as seen in previous message)
  • Apply journal manually
  • Reset journal manually
  • Clear session table
  • Clear other tables (not sure I needed to do this)
  • Mark FS down
  • Mark the rank 0 MDS as failed
  • Reset the FS (yes, I really mean it)
  • Restart MDSes
  • Finally get some sleep
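
For reference, the steps above map roughly onto the commands below. This is a sketch based on the disaster-recovery page linked in the reply, not a record of what was actually typed. The file system name "cephfs", the backup file name, the use of rank 0 with --rank, the systemd unit name, and the "down true" form of marking the FS down (older releases use "cluster_down true") are all assumptions; check them against your release before use. The function only prints the plan, so nothing touches the cluster until the lines are reviewed and run by hand.

```shell
#!/bin/sh
# Dry-run sketch: prints one command per step instead of executing anything.
# Arguments are placeholders (assumptions, not taken from the thread):
#   $1 = file system name (default: cephfs)
#   $2 = journal backup file (default: backup.bin)
recovery_plan() {
    fs="${1:-cephfs}"
    backup="${2:-backup.bin}"
    cat <<EOF
# 1. Bring down all MDSes (run on each MDS host; assumes systemd)
systemctl stop ceph-mds.target
# 2. Back up the journal
cephfs-journal-tool --rank=${fs}:0 journal export ${backup}
# 3. Apply the journal manually
cephfs-journal-tool --rank=${fs}:0 event recover_dentries summary
# 4. Reset the journal
cephfs-journal-tool --rank=${fs}:0 journal reset
# 5. Clear the session table
cephfs-table-tool all reset session
# 6. Clear the other tables (may not be needed)
cephfs-table-tool all reset snap
cephfs-table-tool all reset inode
# 7. Mark the FS down (older releases: cluster_down true)
ceph fs set ${fs} down true
# 8. Mark the rank 0 MDS as failed
ceph mds fail 0
# 9. Reset the FS
ceph fs reset ${fs} --yes-i-really-mean-it
# 10. Restart the MDSes (run on each MDS host)
systemctl start ceph-mds.target
EOF
}

recovery_plan "$@"
```

Saved as a script, it can be run with the file system name and a backup path as arguments to print the plan for that file system. Steps 3, 4, 5, 6 and 9 are destructive, so the backup from step 2 should be verified first.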