I still have not had any luck figuring out what is causing the authentication failure, so in order to get the cluster back, I tried:
1. stop all daemons (mon & osd)
2. change the configuration to disable cephx
3. start mon daemons (3 in total)
4. start the OSD daemons one by one (a rough sketch of these steps is below)
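For reference, this is roughly what those steps looked like (a sketch only; I am assuming the stock sysvinit ceph script here, so the exact service commands may differ on other init systems):

# ceph.conf, [global] section -- step 2, turn cephx off
auth cluster required = none
auth service required = none
auth client required = none

# step 1: stop mon and osd daemons on each node
sudo service ceph stop osd
sudo service ceph stop mon

# step 3: start the three monitors; step 4: start OSDs one at a time, e.g.
sudo service ceph start mon
sudo service ceph start osd.22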
After finishing step 3, the cluster is reachable again ('ceph -s' gives results):
-bash-4.1$ sudo ceph -s
cluster b9cb3ea9-e1de-48b4-9e86-6921e2c537d2
health HEALTH_WARN 2797 pgs degraded; 107 pgs down; 7503 pgs peering; 917 pgs recovering; 6079 pgs recovery_wait; 2957 pgs stale; 7771 pgs stuck inactive; 2957 pgs stuck stale; 16567 pgs stuck unclean; recovery 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%); 2 near full osd(s); 57/751 in osds are down; noout,nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
monmap e1: 3 mons at {osd151=10.194.0.68:6789/0,osd152=10.193.207.130:6789/0,osd153=10.193.207.131:6789/0}, election epoch 106022, quorum 0,1,2 osd151,osd152,osd153
osdmap e134893: 781 osds: 694 up, 751 in
pgmap v2388518: 22203 pgs: 26 inactive, 14 active, 79 stale+active+recovering, 5020 active+clean, 242 stale, 4352 active+recovery_wait, 616 stale+active+clean, 177 active+recovering+degraded, 6714 peering, 925 stale+active+recovery_wait, 86 down+peering, 1547 active+degraded, 32 stale+active+recovering+degraded, 648 stale+peering, 21 stale+down+peering, 239 stale+active+degraded, 651 active+recovery_wait+degraded, 30 remapped+peering, 151 stale+active+recovery_wait+degraded, 4 stale+remapped+peering, 629 active+recovering; 79656 GB data, 363 TB used, 697 TB / 1061 TB avail; 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%)
mdsmap e1: 0/0/1 up
(at this point, all OSDs should be down).
When I tried to start an OSD daemon, the startup script hung; the hanging process is:
root 80497 80496 0 08:18 pts/0 00:00:00 python /usr/bin/ceph --name=osd.22 --keyring=/var/lib/ceph/osd/ceph-22/keyring osd crush create-or-move -- 22 0.40 root=default host=osd173
When I straced the startup script, I got the following trace (process 75873 is the ceph process shown above); it fails on a futex and then goes into an infinite select loop:
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
Any idea what might trigger this?
======= STRACE (PARTIAL) ==========
[pid 75873] futex(0xf707a0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 75878] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f5da6529000
[pid 75878] munmap(0x7f5da6529000, 28143616) = 0
[pid 75878] munmap(0x7f5dac000000, 38965248) = 0
[pid 75878] mprotect(0x7f5da8000000, 135168, PROT_READ|PROT_WRITE) = 0
[pid 75878] futex(0xf707a0, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 75873] <... futex resumed> ) = 0
[pid 75873] futex(0xdd3cb0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 75878] futex(0xdd3cb0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 75873] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 75878] <... futex resumed> ) = 0
[pid 75873] select(0, NULL, NULL, NULL, {0, 1000} <unfinished ...>
[pid 75878] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
[pid 75878] mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f5dadb28000
[pid 75878] mprotect(0x7f5dadb28000, 4096, PROT_NONE) = 0
[ ... some entries omitted ... ]
[pid 75873] select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
[pid 75873] select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
[pid 75873] select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
[pid 75873] select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
[pid 75873] select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
[pid 75873] select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
[pid 75873] select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
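In case it helps with debugging, re-running the hanging command by hand with monitor-client logging turned up might show where it blocks (the debug flags below are only my guess at what would be useful; osd.22/osd173 are taken from the process listing above):

sudo ceph --name=osd.22 --keyring=/var/lib/ceph/osd/ceph-22/keyring \
    --debug-ms 1 --debug-monc 10 --debug-auth 10 \
    osd crush create-or-move -- 22 0.40 root=default host=osd173

That should at least show whether the client is stuck in the initial monitor handshake or somewhere later.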
Thanks,
Guang
On Jan 15, 2014, at 5:54 AM, Guang <yguang11@xxxxxxxxxxx> wrote:
Thanks Sage.
-bash-4.1$ sudo ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok mon_status
{ "name": "osd151",
"rank": 2,
"state": "electing",
"election_epoch": 85469,
"quorum": [],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "b9cb3ea9-e1de-48b4-9e86-6921e2c537d2",
"modified": "0.000000",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "osd152",
"addr": "10.193.207.130:6789\/0"},
{ "rank": 1,
"name": "osd153",
"addr": "10.193.207.131:6789\/0"},
{ "rank": 2,
"name": "osd151",
"addr": "10.194.0.68:6789\/0"}]}}
And:
-bash-4.1$ sudo ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok quorum_status
{ "election_epoch": 85480,
"quorum": [
0,
1,
2],
"quorum_names": [
"osd151",
"osd152",
"osd153"],
"quorum_leader_name": "osd152",
"monmap": { "epoch": 1,
"fsid": "b9cb3ea9-e1de-48b4-9e86-6921e2c537d2",
"modified": "0.000000",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "osd152",
"addr": "10.193.207.130:6789\/0"},
{ "rank": 1,
"name": "osd153",
"addr": "10.193.207.131:6789\/0"},
{ "rank": 2,
"name": "osd151",
"addr": "10.194.0.68:6789\/0"}]}}
From the above status, the election has finished and a leader has been elected.
Thanks,
Guang
On Jan 14, 2014, at 10:55 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
On Tue, 14 Jan 2014, GuangYang wrote:
Hi ceph-users and ceph-devel,
I came across an issue after restarting the monitors of the cluster: authentication fails, which prevents running any ceph command.
After we did some maintenance work, I restarted an OSD; however, I found that the OSD would not rejoin the cluster automatically after being restarted, even though a TCP dump showed it had already sent a message to the monitor asking to be added back into the cluster.
So I suspected there might be an issue with the monitors and restarted them one by one (3 in total); however, after restarting the monitors, every ceph command fails with an authentication timeout:
2014-01-14 12:00:30.499397 7fc7f195e700 0 monclient(hunting): authenticate timed out after 300
2014-01-14 12:00:30.499440 7fc7f195e700 0 librados: client.admin authentication error (110) Connection timed out
Error connecting to cluster: Error
Any idea why this error happened (restarting an OSD results in the same error)?
I am thinking the authentication information is persisted on the mons' local disks; is there a chance that data got corrupted?
That sounds unlikely, but you're right that the core problem is with the
mons. What does
ceph daemon mon.`hostname` mon_status
say? Perhaps they are not forming a quorum and that is what is preventing
authentication.
sage