Hi -

We keep getting errors like these on specific OSDs with Nautilus (14.2.16):

2021-01-29 06:14:19.174 7fbeaab92c00 -1 osd.8 12568359 unable to obtain rotating service keys; retrying
2021-01-29 06:14:49.173 7fbeaab92c00  0 monclient: wait_auth_rotating timed out after 30
2021-01-29 06:14:49.173 7fbeaab92c00 -1 osd.8 12568359 unable to obtain rotating service keys; retrying
2021-01-29 06:15:19.173 7fbeaab92c00  0 monclient: wait_auth_rotating timed out after 30
2021-01-29 06:15:19.173 7fbeaab92c00 -1 osd.8 12568359 unable to obtain rotating service keys; retrying
2021-01-29 06:15:49.174 7fbeaab92c00  0 monclient: wait_auth_rotating timed out after 30
2021-01-29 06:15:49.174 7fbeaab92c00 -1 osd.8 12568359 unable to obtain rotating service keys; retrying
2021-01-29 06:15:49.174 7fbeaab92c00 -1 osd.8 12568359 init wait_auth_rotating timed out

From googling it seems like it could be a variety of things. We do think time is in sync. It is particularly perplexing because a single OSD will hit this error while all the other OSDs on the same node are fine.

It looks exactly like this tracker issue: https://tracker.ceph.com/issues/17170
Stopping the managers and restarting the mons fixes it temporarily.

Per this old thread, we do have msgr2 enabled: https://www.spinics.net/lists/ceph-users/msg60631.html

This blog post seems to point to storage slowness being the root cause in their environment: http://www.florentflament.com/blog/ceph-monitor-status-switching-due-to-slow-ios.html

Any advice for sorting out what is causing this?

Thanks,
Will
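
In case it helps narrow things down, this is roughly what we plan to gather the next time it happens (osd.8 and mon.a below are just placeholders for whichever daemons are affected, and the chrony/ntp check depends on what the node actually runs):

    # do the mons agree on time (clock skew is reported here)?
    ceph time-sync-status
    # OS clock on the affected node (chrony here; ntpq -p if the node runs ntpd)
    chronyc tracking

    # overall cluster/mon health, and whether anything reports slow ops
    ceph -s
    ceph health detail

    # confirm v2: addresses are actually published for the mons (msgr2)
    ceph mon dump

    # on the node with the stuck OSD: is the daemon far enough along to have
    # an admin socket, and what does it think it is doing?
    ceph daemon osd.8 status

    # turn up monclient/auth logging on the stuck OSD via its admin socket
    # (it may not be registered with the cluster yet, so "ceph tell" won't work)
    ceph daemon osd.8 config set debug_monc 20
    ceph daemon osd.8 config set debug_auth 20

    # check the mons for slow/blocked ops, per the slow-storage theory
    ceph daemon mon.a ops

Our thinking is that if the debug_monc/debug_auth output shows the OSD repeatedly asking for rotating keys and the mon never answering, that would at least tell us whether to dig on the mon side or the OSD side.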