We just upgraded our cluster from Luminous to Nautilus, and after a few days one of our MDS servers started logging the following:

2021-03-28 18:06:32.304 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 16
2021-03-28 18:06:32.304 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s
2021-03-28 18:06:32.308 7f57c8809700 5 mds.beacon.sun-gcs01-mds02 received beacon reply up:standby seq 16 rtt 0.00400001
2021-03-28 18:06:36.308 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 17
2021-03-28 18:06:36.308 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s
2021-03-28 18:06:36.308 7f57c8809700 5 mds.beacon.sun-gcs01-mds02 received beacon reply up:standby seq 17 rtt 0
2021-03-28 18:06:37.788 7f57c900a700 0 auth: could not find secret_id=34586
2021-03-28 18:06:37.788 7f57c900a700 0 cephx: verify_authorizer could not get service secret for service mds secret_id=34586
2021-03-28 18:06:37.788 7f57c6004700 5 mds.sun-gcs01-mds02 ms_handle_reset on v2:10.65.101.13:46566/0
2021-03-28 18:06:40.308 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 18
2021-03-28 18:06:40.308 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s
2021-03-28 18:06:40.308 7f57c8809700 5 mds.beacon.sun-gcs01-mds02 received beacon reply up:standby seq 18 rtt 0
2021-03-28 18:06:44.304 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 19
2021-03-28 18:06:44.304 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s

I've tried removing the /var/lib/ceph/mds/ directory and fetching the key again. I've also removed the key and generated a new one, and I've checked the clocks on all the nodes. From what I can tell, everything looks good.

We did have an issue where the monitor cluster fell over and would not boot.
To recover, we reduced the monitors to a single monitor, disabled cephx, pulled it off the network, and restarted the service a few times, which allowed it to come up. We then expanded back to three mons and re-enabled cephx, and everything had been fine until these messages appeared.

No other services seem to be affected, and the MDS itself appears to work fine despite these messages. We would still like to figure out how to resolve this.

Thank you,
Robert LeBlanc

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1