As oddly as it drifted away, it came back. Next time, should there be a next time, I will snag logs as suggested by Sascha.
The window for all this, in local time, was: 9:02 am - disassociated; 11:20 pm - associated. No changes were made, though I did reboot the mon02 host at 1 pm. No other network or host issues were observed in the rest of the cluster or at the site.
Thank you for your replies. I'll gather better logging next time.
peter
From: Brad Hubbard <bhubbard@xxxxxxxxxx>
Date: Wednesday, January 8, 2020 at 6:21 PM
To: Peter Eisch <peter.eisch@xxxxxxxxxxxxxxx>
Cc: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: monitor ghosted
On Thu, Jan 9, 2020 at 5:48 AM Peter Eisch <peter.eisch@xxxxxxxxxxxxxxx> wrote:
Hi,
This morning one of my three monitor hosts got booted from the Nautilus 14.2.4 cluster and it won’t rejoin. There haven’t been any changes or events at this site at all. The conf file is unchanged and the same as on the other two monitors. The host is also running the MDS and MGR apps without any issue. The ceph-mon log shows this repeating:
2020-01-08 13:33:29.403 7fec1a736700 1 mon.cephmon02@1(probing) e7 handle_auth_request failed to assign global_id
2020-01-08 13:33:29.433 7fec1a736700 1 mon.cephmon02@1(probing) e7 handle_auth_request failed to assign global_id
2020-01-08 13:33:29.541 7fec1a736700 1 mon.cephmon02@1(probing) e7 handle_auth_request failed to assign global_id
...
Try gathering a log with debug_mon 20. That should provide more detail about why AuthMonitor::_assign_global_id() didn't return an ID.
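A log gathered at debug_mon 20 tends to be large, so it can help to first pin down when the repeating message starts and stops and which state the monitor reports. One common way to raise the level on a running monitor is through its admin socket (for example `ceph daemon mon.cephmon02 config set debug_mon 20/20`). The following is only a rough sketch, not a Ceph tool: the helper name `summarize`, the regular expression, and the `SAMPLE` list are illustrative assumptions based on the three log lines quoted above.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout, inferred only from the lines quoted in this thread:
# "<date> <time> <thread> <level> mon.<name>@<rank>(<state>) e<epoch> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+)\s+"
    r"mon\.(?P<name>[^@\s]+)@(?P<rank>-?\d+)\((?P<state>[^)]*)\)\s+"
    r"e(?P<epoch>\d+)\s+(?P<msg>.*)$"
)


def summarize(lines, needle="failed to assign global_id"):
    """Count lines whose message contains `needle`; report time span and states."""
    states = Counter()
    first = last = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or needle not in m.group("msg"):
            continue  # skip unrelated or differently formatted lines
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        first = ts if first is None or ts < first else first
        last = ts if last is None or ts > last else last
        states[m.group("state")] += 1
    return {
        "count": sum(states.values()),
        "first": first,
        "last": last,
        "states": dict(states),
    }


# The three lines quoted in this thread, used as sample input.
SAMPLE = [
    "2020-01-08 13:33:29.403 7fec1a736700 1 mon.cephmon02@1(probing) e7 handle_auth_request failed to assign global_id",
    "2020-01-08 13:33:29.433 7fec1a736700 1 mon.cephmon02@1(probing) e7 handle_auth_request failed to assign global_id",
    "2020-01-08 13:33:29.541 7fec1a736700 1 mon.cephmon02@1(probing) e7 handle_auth_request failed to assign global_id",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run it against a real log, pass an open file object in place of `SAMPLE`; the log path depends on the installation and is not given in this thread.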
There is nothing in the logs of the two remaining/healthy monitors. What is the best practice for getting this host back into the cluster?
peter
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
--
Cheers,
Brad
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com