Hi,

On a freshly created 4-node cluster I'm struggling to get the 4th node set up correctly. ceph-deploy is unable to create the OSDs on it, and when I log in to the node and run `ceph -s` manually (after copying the client.admin keyring) with debug parameters, it hangs and loops over mon_command({"prefix": "get_command_descriptions"} v 0).

I'm not sure what else to try to find out why this is happening. It seems to be able to talk to the monitors, since it looks like it is authenticating, and the same command runs fine on the first 3 nodes, which are running monitors, but it just hangs on the node that isn't.

Thanks in advance for any help!

-------------- next part --------------
root@ceph4:~# ceph -s --debug-ms=5 --debug-client=5 --debug-mon=10
2014-08-21 14:45:32.689379 7ff622841700 1 -- :/0 messenger.start
2014-08-21 14:45:32.691284 7ff622841700 1 -- :/1007607 --> 192.168.78.13:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x7ff61c024980 con 0x7ff61c024530
2014-08-21 14:45:32.692075 7ff61a7fc700 1 -- 192.168.78.14:0/1007607 learned my addr 192.168.78.14:0/1007607
2014-08-21 14:45:32.693174 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 1 ==== mon_map v1 ==== 485+0+0 (2066881705 0 0) 0x7ff610000bd0 con 0x7ff61c024530
2014-08-21 14:45:32.693383 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (3596119886 0 0) 0x7ff610001080 con 0x7ff61c024530
2014-08-21 14:45:32.693691 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- ?+0 0x7ff604001680 con 0x7ff61c024530
2014-08-21 14:45:32.694549 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 206+0+0 (1790499909 0 0) 0x7ff610001080 con 0x7ff61c024530
2014-08-21 14:45:32.694750 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- ?+0 0x7ff604003810 con 0x7ff61c024530
2014-08-21 14:45:32.695641 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 393+0+0 (350251809 0 0) 0x7ff6100008c0 con 0x7ff61c024530
2014-08-21 14:45:32.695780 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7ff61c020c20 con 0x7ff61c024530
2014-08-21 14:45:32.696051 7ff622841700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x7ff61c025200 con 0x7ff61c024530
2014-08-21 14:45:32.696079 7ff622841700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x7ff61c0257a0 con 0x7ff61c024530
2014-08-21 14:45:32.696324 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 5 ==== mon_map v1 ==== 485+0+0 (2066881705 0 0) 0x7ff6100012f0 con 0x7ff61c024530
2014-08-21 14:45:32.696422 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1427523647 0 0) 0x7ff610001590 con 0x7ff61c024530
2014-08-21 14:45:32.696834 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 7 ==== osd_map(46..46 src has 1..46) v3 ==== 7172+0+0 (2083907578 0 0) 0x7ff6100008c0 con 0x7ff61c024530
2014-08-21 14:45:32.697095 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 8 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1427523647 0 0) 0x7ff610002fd0 con 0x7ff61c024530
2014-08-21 14:45:32.704621 7ff622841700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- mon_command({"prefix": "get_command_descriptions"} v 0) v1 -- ?+0 0x7ff61c025c10 con 0x7ff61c024530
2014-08-21 14:45:32.900195 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 9 ==== osd_map(46..46 src has 1..46) v3 ==== 7172+0+0 (2083907578 0 0) 0x7ff6100008c0 con 0x7ff61c024530
2014-08-21 14:45:32.900265 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 10 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1427523647 0 0) 0x7ff610002fd0 con 0x7ff61c024530
2014-08-21 14:46:05.691726 7ff61b7fe700 1 -- 192.168.78.14:0/1007607 mark_down 0x7ff61c024530 -- 0x7ff61c0242c0
2014-08-21 14:46:05.691818 7ff61a6fb700 2 -- 192.168.78.14:0/1007607 >> 192.168.78.13:6789/0 pipe(0x7ff61c0242c0 sd=3 :60918 s=4 pgs=174 cs=1 l=1 c=0x7ff61c024530).fault (0) Success
2014-08-21 14:46:05.691913 7ff61b7fe700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.12:6789/0 -- auth(proto 0 30 bytes epoch 1) v1 -- ?+0 0x7ff608001ba0 con 0x7ff608001760
2014-08-21 14:46:05.693707 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.1 192.168.78.12:6789/0 1 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (2330663482 0 0) 0x7ff610001220 con 0x7ff608001760
2014-08-21 14:46:05.693982 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.12:6789/0 -- auth(proto 2 128 bytes epoch 0) v1 -- ?+0 0x7ff604007520 con 0x7ff608001760
2014-08-21 14:46:05.694986 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.1 192.168.78.12:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 225+0+0 (187198672 0 0) 0x7ff610001220 con 0x7ff608001760
2014-08-21 14:46:05.695220 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.12:6789/0 -- mon_subscribe({monmap=2+}) v2 -- ?+0 0x7ff608003160 con 0x7ff608001760
2014-08-21 14:46:05.695257 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.12:6789/0 -- mon_command({"prefix": "get_command_descriptions"} v 0) v1 -- ?+0 0x7ff6040079e0 con 0x7ff608001760
2014-08-21 14:46:05.695936 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.1 192.168.78.12:6789/0 3 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1427523647 0 0) 0x7ff6100009b0 con 0x7ff608001760
2014-08-21 14:46:41.692573 7ff61b7fe700 1 -- 192.168.78.14:0/1007607 mark_down 0x7ff608001760 -- 0x7ff6080014f0
2014-08-21 14:46:41.692669 7ff61a6fb700 2 -- 192.168.78.14:0/1007607 >> 192.168.78.12:6789/0 pipe(0x7ff6080014f0 sd=4 :43412 s=4 pgs=230 cs=1 l=1 c=0x7ff608001760).fault (0) Success
2014-08-21 14:46:41.692748 7ff61b7fe700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- auth(proto 0 30 bytes epoch 1) v1 -- ?+0 0x7ff608002bd0 con 0x7ff6080029a0
2014-08-21 14:46:42.693001 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 1 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (295781237 0 0) 0x7ff610000970 con 0x7ff6080029a0
2014-08-21 14:46:42.693262 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- auth(proto 2 128 bytes epoch 0) v1 -- ?+0 0x7ff604008050 con 0x7ff6080029a0
2014-08-21 14:46:42.694361 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 225+0+0 (2163676846 0 0) 0x7ff6100013a0 con 0x7ff6080029a0
2014-08-21 14:46:42.694533 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- mon_subscribe({monmap=2+}) v2 -- ?+0 0x7ff6080035f0 con 0x7ff6080029a0
2014-08-21 14:46:42.694565 7ff620885700 1 -- 192.168.78.14:0/1007607 --> 192.168.78.13:6789/0 -- mon_command({"prefix": "get_command_descriptions"} v 0) v1 -- ?+0 0x7ff6040085b0 con 0x7ff6080029a0
2014-08-21 14:46:42.695297 7ff620885700 1 -- 192.168.78.14:0/1007607 <== mon.2 192.168.78.13:6789/0 3 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1427523647 0 0) 0x7ff6100009c0 con 0x7ff6080029a0
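
In case it helps anyone reading the trace, here is a rough Python sketch (not part of Ceph; the function names `tally` and `unanswered_commands` are made up for illustration) that counts the messages sent to and received from each monitor address in a `--debug-ms` trace like the one above. It assumes the reply to a mon_command would be logged as `mon_command_ack`, which is what I would expect but have not confirmed for this version. The `sample` string is two entries copied from the trace.

```python
import re
from collections import Counter

# Sent entries look like:      "--> <addr> -- <msg_type>..."
# Received entries look like:  "<== mon.N <addr> <seq> ==== <msg_type>..."
SENT = re.compile(r"--> (\S+) -- (\w+)")
RECV = re.compile(r"<== \S+ (\S+) \d+ ==== (\w+)")


def tally(trace):
    """Count (address, message type) pairs sent and received in a trace."""
    sent = Counter(SENT.findall(trace))
    recv = Counter(RECV.findall(trace))
    return sent, recv


def unanswered_commands(trace):
    """Per monitor address: mon_command messages sent minus acks received.

    Assumes the reply is logged as "mon_command_ack" (an assumption).
    """
    sent, recv = tally(trace)
    result = {}
    for (addr, kind), count in sent.items():
        if kind == "mon_command":
            result[addr] = count - recv[(addr, "mon_command_ack")]
    return result


# Two entries copied from the trace above, joined with a space.
sample = (
    '2014-08-21 14:45:32.704621 7ff622841700 1 -- 192.168.78.14:0/1007607 '
    '--> 192.168.78.13:6789/0 -- mon_command({"prefix": '
    '"get_command_descriptions"} v 0) v1 -- ?+0 0x7ff61c025c10 con '
    '0x7ff61c024530 '
    '2014-08-21 14:45:32.900195 7ff620885700 1 -- 192.168.78.14:0/1007607 '
    '<== mon.2 192.168.78.13:6789/0 9 ==== osd_map(46..46 src has 1..46) v3 '
    '==== 7172+0+0 (2083907578 0 0) 0x7ff6100008c0 con 0x7ff61c024530'
)

if __name__ == "__main__":
    print(unanswered_commands(sample))
```

The patterns use `finditer`-style matching over the whole text rather than splitting on newlines, so the sketch should work whether or not the log's line breaks survived the mail archive. On the trace above, every mon_command is followed only by mon_subscribe_ack or osd_map replies before the connection is marked down, which is the loop I'm describing.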