It's odd: the cluster seems to be somewhat working. I can't bring the down OSDs back online, but the un-restarted nodes still work.

ceph -w hangs.
ceph --admin-daemon /var/run/ceph/ceph-mon.FOO.asok mon_status hangs, and there is nothing in /var/log/ceph/*

Mon03 output

==== 57+0+0 (2052948678 0 0) 0x3fa7240 con 0x42c7580
2013-06-10 15:56:17.781158 7f91279d3700 20 mon.3@2(probing) e1 have connection
2013-06-10 15:56:17.781163 7f91279d3700 5 mon.3@2(probing) e1 waitlisting message auth(proto 0 27 bytes epoch 1) v1
2013-06-10 15:56:17.923463 7f91279d3700 10 mon.3@2(probing) e1 ms_handle_reset 0x51429a0 10.198.141.36:6801/8552
2013-06-10 15:56:17.962012 7f9124db5700 1 -- 10.198.141.203:6789/0 >> :/0 pipe(0x31d3780 sd=45 :6789 s=0 pgs=0 cs=0 l=0).accept sd=45 10.198.141.39:54206/0
2013-06-10 15:56:17.962212 7f9124db5700 10 mon.3@2(probing) e1 ms_verify_authorizer 10.198.141.39:6801/7360 osd protocol 0
2013-06-10 15:56:17.962584 7f91279d3700 1 -- 10.198.141.203:6789/0 <== osd.73 10.198.141.39:6801/7360 1 ==== auth(proto 0 27 bytes epoch 1) v1 ==== 57+0+0 (2243780706 0 0) 0x3fa7480 con 0x42c79a0
2013-06-10 15:56:17.962609 7f91279d3700 20 mon.3@2(probing) e1 have connection

Mon2 output (lots of the same)

2013-06-10 15:56:56.501807 7fb506fc2700 1 mon.2@1(electing) e1 discarding message auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.501826 7fb506fc2700 1 mon.2@1(electing) e1 discarding message auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.501847 7fb506fc2700 1 mon.2@1(electing) e1 discarding message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.501865 7fb506fc2700 1 mon.2@1(electing) e1 discarding message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.561414 7fb506fc2700 0 log [INF] : mon.2@1 won leader election with quorum 1,2

Mon01 output (lots of the same)

2013-06-10 15:56:29.456421 7fb8de0b2700 10 mon.1@0(synchronizing sync( requester state start )) e1 ms_verify_authorizer 10.198.141.32:6800/16748 osd protocol 0
2013-06-10 15:56:29.585180 7fb8ddfb1700 1 -- 10.198.141.201:6789/0 >> :/0 pipe(0x315ba00 sd=259 :6789 s=0 pgs=0 cs=0 l=0).accept sd=259 10.198.141.35:51166/0
2013-06-10 15:56:29.585483 7fb8ddfb1700 10 mon.1@0(synchronizing sync( requester state start )) e1 ms_verify_authorizer 10.198.141.35:6801/9214 osd protocol 0
2013-06-10 15:56:29.658574 7fb8dd3a5700 1 -- 10.198.141.201:6789/0 >> :/0 pipe(0x3198280 sd=747 :6789 s=0 pgs=0 cs=0 l=0).accept sd=747 10.198.141.32:49135/0
2013-06-10 15:56:29.658867 7fb8dd3a5700 10 mon.1@0(synchronizing sync( requester state start )) e1 ms_verify_authorizer 10.198.141.32:6801/17221 osd protocol 0
2013-06-10 15:56:29.787631 7fb8dd0a2700 1 -- 10.198.141.201:6789/0 >> :/0 pipe(0x3198a00 sd=361 :6789 s=0 pgs=0 cs=0 l=0).accept sd=361 10.198.141.32:49136/0
2013-06-10 15:56:29.787893 7fb8dd0a2700 10 mon.1@0(synchronizing sync( requester state start )) e1 ms_verify_authorizer 10.198.141.32:6803/18346 osd protocol 0
2013-06-10 15:56:30.025106 7fb8e02d4700 1 -- 10.198.141.201:6789/0 >> :/0 pipe(0x3198000 sd=556 :6789 s=0 pgs=0 cs=0 l=0).accept sd=556 10.198.141.25:40773/0
2013-06-10 15:56:30.025391 7fb8e02d4700 10 mon.1@0(synchronizing sync( requester state start )) e1 ms_verify_authorizer 10.198.141.25:6801/12417 osd protocol 0

Nelson Jeppesen
Disney Technology Solutions and Services
Phone 206-588-5001
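
P.S. For anyone skimming the monitor logs above, here is a minimal sketch of a tally of which state each monitor reports (probing, electing, synchronizing). It is not part of Ceph: the function name summarize_mon_states and the regular expression are my own, and they assume the "mon.NAME@RANK(state)" tag format seen in these log lines.

```python
import re
from collections import Counter

# Matches the monitor tag in lines such as "mon.3@2(probing) e1 ...".
# Assumption: the state is the first word inside the parentheses.
MON_TAG = re.compile(r"\bmon\.(\w+)@(-?\d+)\((\w+)")


def summarize_mon_states(lines):
    """Count log lines per (monitor name, state) pair."""
    counts = Counter()
    for line in lines:
        match = MON_TAG.search(line)
        if match:
            name, _rank, state = match.groups()
            counts[(name, state)] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the logs in this message.
    sample = [
        "2013-06-10 15:56:17.781158 7f91279d3700 20 mon.3@2(probing) e1 have connection",
        "2013-06-10 15:56:56.501807 7fb506fc2700 1 mon.2@1(electing) e1 discarding message "
        "auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere",
        "2013-06-10 15:56:29.456421 7fb8de0b2700 10 mon.1@0(synchronizing sync( requester "
        "state start )) e1 ms_verify_authorizer 10.198.141.32:6800/16748 osd protocol 0",
    ]
    for (name, state), n in sorted(summarize_mon_states(sample).items()):
        print(f"mon.{name} {state}: {n}")
```

Lines without a parenthesised state, such as the "won leader election" message, are skipped, so the tally reflects only lines where the monitor prints its current state.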
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com