Hello Ceph Users,

I was hoping to get some advice, or at least some questions answered, about the CephFS disaster recovery process detailed in the docs. My questions are as follows:

- Do all of the steps need to be performed, or can I check the status of the MDS after each one and stop as soon as it recovers?
- What does the journal truncation actually do? The name suggests it truncates part of the journal, but the warnings make it sound like it could delete unexpected data, or even delete the journal entirely.
- Where would I use the data saved by recover_dentries to rebuild the metadata?
- What sort of information would an "expert" need in order to perform a successful disaster recovery?

Beyond those questions, I was hoping for some advice on my situation and on whether I even need disaster recovery. A recent power blip reset the Ceph servers, and they came back barking about the CephFS MDS being unable to start; the status listed up:replay. On further investigation there appeared to be issues with the journal, and the MDS log showed errors during replay. A somewhat abridged log is here (abridged because it repeats the same messages): https://pastebin.com/FkypNkSZ

The main error lines, to my mind, are:

Jan 19 13:28:26 nxpmn01 ceph-mds[313765]: -3> 2022-01-19T13:28:26.091-0500 7f80a0ba7700 -1 log_channel(cluster) log [ERR] : journal replay inotablev mismatch 2 -> 2417
Jan 19 13:28:26 nxpmn01 ceph-mds[313765]: -2> 2022-01-19T13:28:26.091-0500 7f80a0ba7700 -1 log_channel(cluster) log [ERR] : EMetaBlob.replay sessionmap v 1160787 - 1 > table 0

Everything I've found online suggests I may need a journal truncation. I was hoping it wouldn't come to that, though, as I'm not an "expert" as described in the disaster recovery docs. (I've put the command sequence I'm contemplating in a P.S. below.)

Relevant info about my Ceph setup:

- 3 servers running Proxmox 6.4-13 and Ceph 15.2.10
- ceph -s returns:

    cluster:
      id:     642c8584-f642-4043-a43d-a984bbf75603
      health: HEALTH_WARN
              1 filesystem is degraded
              insufficient standby MDS daemons available
              99 daemons have recently crashed

    services:
      mon: 3 daemons, quorum nxpmn01,nxpmn02,nxpmn03 (age 5d)
      mgr: nxpmn02(active, since 9d), standbys: nxpmn03, nxpmn01
      mds: cephfs:1/1 {0=nxpmn01=up:replay(laggy or crashed)}
      osd: 18 osds: 18 up (since 5d), 18 in (since 3w)

    data:
      pools:   5 pools, 209 pgs
      objects: 4.25M objects, 16 TiB
      usage:   23 TiB used, 28 TiB / 51 TiB avail
      pgs:     209 active+clean

- All OSDs are up and in
- To my knowledge the filesystem has only one rank (rank 0)

Thanks
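
P.S. For concreteness, here is my current reading of the relevant tooling from the disaster recovery docs; please correct me if I've misunderstood anything. Before doing anything destructive, I gather the first step is to take a backup of the journal and inspect it for damage (the --rank value below matches my filesystem name "cephfs" and its single rank 0; backup.bin is just the example filename):

    # Export a backup of the journal before attempting any recovery
    cephfs-journal-tool --rank=cephfs:0 journal export backup.bin

    # Report on the journal's integrity without modifying anything
    cephfs-journal-tool --rank=cephfs:0 journal inspect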
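If it does come to truncation, my understanding of the docs is that the sequence would be roughly the following. I'm hesitant to run it without confirmation, since the reset steps are the destructive part:

    # Scan recoverable events in the journal and write the dentries
    # they contain back into the metadata pool
    cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary

    # Truncate the journal, discarding any entries not already written
    # back by recover_dentries
    cephfs-journal-tool --rank=cephfs:0 journal reset

    # Given the "sessionmap" replay error above, I assume the session
    # table would also need to be wiped
    cephfs-table-tool all reset session

Is that roughly right, and given the errors above, is it actually needed in my case?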