Re: Reasonable MDS rejoin time?

It's hard to suggest anything without the logs. Please enable verbose logging with debug_mds=20. Which Ceph version are you running? Do you have logs showing why the MDS crashed?
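
One way to gather that information is sketched below. This is a minimal sketch rather than a definitive procedure: it assumes a Ceph release that has the centralized configuration database (`ceph config set`) and the crash module (`ceph crash ls`), and the helper names `DIAGNOSTIC_COMMANDS`, `restore_command`, and `run_diagnostics` are made up for illustration. Level 20 logging is very verbose, so it is probably worth reverting once the logs are captured.

```python
import shutil
import subprocess

# Commands that collect what was asked for above. Assumption: the cluster
# supports the centralized config database and has the crash module enabled.
DIAGNOSTIC_COMMANDS = [
    ["ceph", "versions"],                                 # Ceph version per daemon type
    ["ceph", "config", "set", "mds", "debug_mds", "20"],  # verbose MDS logging
    ["ceph", "fs", "status"],                             # current state of each MDS rank
    ["ceph", "crash", "ls"],                              # recorded daemon crashes, if any
]


def restore_command():
    """Return the command that removes the debug_mds override again."""
    return ["ceph", "config", "rm", "mds", "debug_mds"]


def run_diagnostics(commands=DIAGNOSTIC_COMMANDS):
    """Run each command and return its output keyed by the command line.

    Returns an empty dict when no `ceph` binary is found on PATH.
    """
    if shutil.which("ceph") is None:
        return {}
    results = {}
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        # Keep stderr on failure so the error is visible in the result.
        results[" ".join(cmd)] = proc.stdout if proc.returncode == 0 else proc.stderr
    return results
```

Calling `run_diagnostics()` on a node with an admin keyring returns the raw output of each command; running `restore_command()` afterwards drops the logging override so the MDS falls back to its default debug level.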

On 16/05/22 11:20, Felix Lee wrote:
Dear all,
We currently run 7 active MDS daemons (multi-active), with another 7 in standby-replay.
We thought this would cover most disasters, and so far it has. But something went wrong this time; here is the story.

One MDS crashed and its standby-replay took over, but got stuck in the resolve state. Then two other MDS daemons (ranks 0 and 5) received tons of slow requests, and my colleague restarted them, expecting their standby-replay daemons to take over immediately (in hindsight this was probably wrong, or at least unnecessary). As a result, three ranks ended up in the resolve state.

Meanwhile, I noticed that the first failed rank (rank 2) had abnormal memory usage and kept crashing. After a couple of restarts its memory usage returned to normal, and then those three MDS daemons entered the rejoin state. They have now been in rejoin for three days and counting. No significant error messages show up even with "debug_mds 10", so we have no idea when it will finish or whether it is making progress at all.

So I am wondering: how do we check the progress or status of an MDS rejoin to confirm it is proceeding normally? And how can we estimate the rejoin time, and perhaps shorten it? We always need to give our users an estimate of the recovery time.


Thanks & best regards,
Felix Lee


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


