On Tue, 14 Jan 2014, GuangYang wrote:
> Hi ceph-users and ceph-devel,
> I came across an issue after restarting the monitors of the cluster: authentication fails, which prevents running any ceph command.
>
> After we did some maintenance work, I restarted an OSD. However, I found that the OSD would not join the cluster automatically after being restarted, though a TCP dump showed it had already sent a message to the monitor asking to be added to the cluster.
>
> So I suspected there might be some issue with the monitors and restarted them one by one (3 in total). However, after restarting the monitors, every ceph command fails with an authentication timeout:
>
> 2014-01-14 12:00:30.499397 7fc7f195e700 0 monclient(hunting): authenticate timed out after 300
> 2014-01-14 12:00:30.499440 7fc7f195e700 0 librados: client.admin authentication error (110) Connection timed out
> Error connecting to cluster: Error
>
> Any idea why such an error happened (restarting an OSD results in the same error)?
>
> I am thinking the authentication information is persisted on the mon's local disk; is there a chance that data got corrupted?

That sounds unlikely, but you're right that the core problem is with the mons. What does

    ceph daemon mon.`hostname` mon_status

say? Perhaps they are not forming a quorum and that is what is preventing authentication.

sage
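
One way to read the mon_status output when checking for quorum is sketched below. It is a minimal sketch, not part of the original reply: the helper name mon_quorum_hint is hypothetical, and it assumes the output is JSON containing a single "state" field whose value is one of the usual monitor states (leader, peon, probing, electing, synchronizing). A monitor reporting leader or peon is in quorum; the other states suggest it is still trying to form or join one.

```shell
#!/bin/sh
# Hypothetical helper (not a ceph tool): reads mon_status JSON on stdin
# and reports whether the "state" field indicates quorum membership.
# Assumes "state" appears once in the output, as a quoted string.
mon_quorum_hint() {
    state=$(sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)
    case "$state" in
        leader|peon)
            # Monitor has joined a quorum.
            echo "in quorum ($state)" ;;
        probing|electing|synchronizing)
            # Monitor is still looking for peers, voting, or catching up.
            echo "not in quorum ($state)" ;;
        "")
            echo "no state field found" ;;
        *)
            echo "unrecognized state ($state)" ;;
    esac
}

# Usage, run on each monitor host:
#   ceph daemon mon.`hostname` mon_status | mon_quorum_hint
```

Because ceph daemon talks to the monitor's local admin socket, this query should still work while client authentication against the cluster is timing out.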