Re: MDS stuck in "up:replay"

Hi,

Thanks for the idea!

I tried it immediately, but the MDS daemons are still in up:replay. They haven't crashed so far, but that usually takes a few minutes.

So no effect so far. :-(

Cheers,
Thomas

On 22.02.23 17:58, Patrick Donnelly wrote:
> On Wed, Jan 25, 2023 at 3:36 PM Thomas Widhalm <widhalmt@xxxxxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> Sorry for the delay. As I told Venky directly, there seems to be a
>> problem with DMARC handling of the Ceph users list. So it was blocked by
>> the company I work for.
>>
>> So I'm writing from my personal e-mail address now.
>>
>> Did I miss something?
>>
>> Venky, you said that as soon as the underlying issue is solved, my
>> filesystems should come up again. Is there anything I can do to help
>> with solving it? Or do I need to wait for the bug to be fixed and then
>> upgrade my Ceph while CephFS is still broken?
>>
>> Both MDS have been counting up seq numbers for days now. That
>> really puzzles me because at least one of them hadn't seen changes for
>> weeks before the crash.
>
> It is likely that the MDS is not able to communicate with the OSDs if
> it's stuck in up:replay. Use:
>
> ceph config set mds debug_ms 5
> ceph config set mds debug_mds 10
>
> and
>
> ceph fs fail X
> ceph fs set X joinable true
>
> to get fresh logs from the MDS to see what's going on with the messages
> to the OSDs.
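
For reference, the sequence suggested above could be wrapped in a small script roughly like the one below. This is only a sketch under a few assumptions: FS_NAME stands in for the "X" placeholder, the run helper and the DRY_RUN switch are hypothetical additions that print the commands instead of executing them, and the reset_mds_debug step (dropping the debug overrides with "ceph config rm") is not part of the original advice. Where the MDS log ends up depends on how the cluster is deployed.

```shell
#!/bin/sh
# Sketch: raise MDS debug logging, fail the file system and make it
# joinable again so the MDS daemons restart and write fresh logs.
# FS_NAME replaces the "X" placeholder; DRY_RUN=1 (the default here)
# only prints each command. Both are assumptions for illustration.

FS_NAME="${FS_NAME:-X}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# The four commands from the message, in the order given.
collect_mds_logs() {
    run ceph config set mds debug_ms 5
    run ceph config set mds debug_mds 10
    run ceph fs fail "$FS_NAME"
    run ceph fs set "$FS_NAME" joinable true
}

# Not in the original advice: remove the debug overrides afterwards
# so the MDS daemons fall back to their default log levels.
reset_mds_debug() {
    run ceph config rm mds debug_ms
    run ceph config rm mds debug_mds
}

collect_mds_logs
```

Running it with DRY_RUN=0 and FS_NAME set to the real file system name would execute the commands; calling reset_mds_debug once the logs are collected keeps the log volume down.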

