monitor not joining quorum

Hi,

One of our monitor VMs was rebooted and is not rejoining the quorum (the quorum consists of 3 monitors). While the monitor service (ceph1) is running on this VM, the Ceph cluster becomes unreachable. In the monitor log on the ceph3 VM I can see many messages like the following:


2021-10-19 17:50:19.555 7fe49e912700  0 log_channel(audit) log [DBG] : from='client.? 10.13.68.11:0/1846917599' entity='client.admin' cmd=[{"prefix": "osd blacklist ls"}]: dispatch
2021-10-19 17:50:20.255 7fe4a1117700  1 mon.ceph3@1(leader).paxos(paxos updating c 95374479..95375018) accept timeout, calling fresh election
2021-10-19 17:50:20.255 7fe49e912700  0 log_channel(cluster) log [INF] : mon.ceph3 calling monitor election
2021-10-19 17:50:20.255 7fe49e912700  1 mon.ceph3@1(electing).elector(42748) init, last seen epoch 42748
2021-10-19 17:50:20.263 7fe49e912700 -1 mon.ceph3@1(electing) e4 failed to get devid for : fallback method has serial ''but no model
2021-10-19 17:50:21.491 7fe49b90c700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:23.567 7fe49b90c700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:23.771 7fe49b90c700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:24.175 7fe49c90e700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:24.979 7fe49c90e700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:25.223 7fe49c90e700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:25.263 7fe4a1117700  1 mon.ceph3@1(electing).elector(42749) init, last seen epoch 42749, mid-election, bumping
2021-10-19 17:50:25.271 7fe49c90e700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
2021-10-19 17:50:25.279 7fe4a1117700 -1 mon.ceph3@1(electing) e4 failed to get devid for : fallback method has serial ''but no model
2021-10-19 17:50:25.487 7fe49c90e700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id
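
In case it helps anyone reading a longer excerpt of this kind, here is a minimal sketch of how such monitor log lines could be tallied by message type and how the election epochs could be pulled out. It is not part of Ceph; the function names `tally` and `election_epochs` are made up for illustration, and the line layout (date, time, thread id, level, message) is only assumed from the excerpt above.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# "<date> <time> <thread id> <level> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+)\s+(?P<msg>.*)$"
)


def tally(lines):
    """Count log entries per message, with every run of digits collapsed
    to 'N' so that entries differing only in epoch or version group together."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a log entry
        counts[re.sub(r"\d+", "N", m.group("msg"))] += 1
    return counts


def election_epochs(lines):
    """Return the elector epochs in the order they appear in the log."""
    return [int(e) for line in lines
            for e in re.findall(r"\.elector\((\d+)\)", line)]


if __name__ == "__main__":
    # Three entries copied from the excerpt above.
    sample = [
        "2021-10-19 17:50:20.255 7fe49e912700  1 mon.ceph3@1(electing).elector(42748) init, last seen epoch 42748",
        "2021-10-19 17:50:21.491 7fe49b90c700  1 mon.ceph3@1(electing) e4 handle_auth_request failed to assign global_id",
        "2021-10-19 17:50:25.263 7fe4a1117700  1 mon.ceph3@1(electing).elector(42749) init, last seen epoch 42749, mid-election, bumping",
    ]
    for msg, n in tally(sample).most_common():
        print(n, msg)
    print("election epochs:", election_epochs(sample))
```

Collapsing the digits is a deliberate simplification: it also hides the monitor rank and map epoch, which is fine for a rough count but not for anything more detailed.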


NTP is running on all nodes in the cluster and the clocks are in sync.

Any help would be appreciated.

thx!
