I have set up a test bench with 3 x MON + 2 x OSD, each on a different host. I've written nothing to the cluster (yet). I'm running ceph 0.61.2 (cuttlefish).

I want to discover what happens if I move an OSD from one host to another, simulating the effect of moving a working hard drive from a dead host to a live host, which I believe should work. So I stopped osd.0 on one host and copied /var/lib/ceph/osd/ceph-0 to another host using scp. My understanding is that starting osd.0 on the destination host with 'service ceph start osd.0' should rewrite the crush map and everything should be fine.

In fact what happened was:

root@ceph6:~# service ceph start osd.0
=== osd.0 ===
create-or-move updating item id 0 name 'osd.0' weight 0.05 at location {host=ceph6,root=default} to crush map
Starting Ceph osd.0 on ceph6...
starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
...
root@ceph6:~# ceph health
HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean; 1/2 in osds are down

osd.0 was not running on the new host, due to the abort shown in the log file below.

Should this work?
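For reference, the steps described above amount to roughly the following sketch, run on the source host. The helper names `run` and `move_osd`, the `-r` flag, the push direction of the copy, and the root login on the destination are illustrative assumptions, not a record of the exact commands used. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: reproduces the stop / copy / start sequence described above.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: move OSD <id> from this host to <dest>.
move_osd() {
    id=$1
    dest=$2
    dir=/var/lib/ceph/osd/ceph-$id

    # Stop the OSD on the source host.
    run service ceph stop "osd.$id"
    # Copy the OSD data directory (assumed: recursive scp, pushed as root).
    run scp -r "$dir" "root@$dest:/var/lib/ceph/osd/"
    # Start the OSD on the destination host.
    run ssh "root@$dest" service ceph start "osd.$id"
}

# Dry run for the case described in this post: osd.0 moved to ceph6.
move_osd 0 ceph6
```

Setting DRY_RUN=0 would execute the commands for real, which is only sensible on a disposable test cluster.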
-- 
Alex Bligh

2013-05-18 17:03:00.345129 7fa408dbb780 0 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-osd, pid 3398
2013-05-18 17:03:00.676611 7fa408dbb780 -1 filestore(/var/lib/ceph/osd/ceph-0) limited size xattrs -- filestore_xattr_use_omap enabled
2013-05-18 17:03:00.891267 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is supported and appears to work
2013-05-18 17:03:00.891314 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-05-18 17:03:00.891533 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount did NOT detect btrfs
2013-05-18 17:03:01.373741 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-05-18 17:03:01.374175 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount found snaps <>
2013-05-18 17:03:02.023315 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: btrfs not detected
2013-05-18 17:03:02.024992 7fa408dbb780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2013-05-18 17:03:02.025372 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 21: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-05-18 17:03:02.025580 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 21: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-05-18 17:03:02.027454 7fa408dbb780 1 journal close /var/lib/ceph/osd/ceph-0/journal
2013-05-18 17:03:02.302070 7fa408dbb780 -1 filestore(/var/lib/ceph/osd/ceph-0) limited size xattrs -- filestore_xattr_use_omap enabled
2013-05-18 17:03:02.361438 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is supported and appears to work
2013-05-18 17:03:02.361508 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-05-18 17:03:02.361755 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount did NOT detect btrfs
2013-05-18 17:03:02.424915 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-05-18 17:03:02.425107 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount found snaps <>
2013-05-18 17:03:02.519006 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: btrfs not detected
2013-05-18 17:03:02.520446 7fa408dbb780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2013-05-18 17:03:02.520507 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 29: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-05-18 17:03:02.520625 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 29: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-05-18 17:03:02.522371 7fa408dbb780 0 osd.0 24 crush map has features 33816576, adjusting msgr requires for clients
2013-05-18 17:03:02.522419 7fa408dbb780 0 osd.0 24 crush map has features 33816576, adjusting msgr requires for osds
2013-05-18 17:03:02.533617 7fa408dbb780 -1 *** Caught signal (Aborted) **
 in thread 7fa408dbb780

 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
 1: /usr/bin/ceph-osd() [0x79087a]
 2: (()+0xfcb0) [0x7fa408254cb0]
 3: (gsignal()+0x35) [0x7fa406a0d425]
 4: (abort()+0x17b) [0x7fa406a10b8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fa40735f69d]
 6: (()+0xb5846) [0x7fa40735d846]
 7: (()+0xb5873) [0x7fa40735d873]
 8: (()+0xb596e) [0x7fa40735d96e]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x127) [0x841227]
 10: (PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&, ceph::buffer::list*)+0x112) [0x6b6262]
 11: (OSD::load_pgs()+0x17f3) [0x63f803]
 12: (OSD::init()+0xf9d) [0x641e4d]
 13: (main()+0x2351) [0x574831]
 14: (__libc_start_main()+0xed) [0x7fa4069f876d]
 15: /usr/bin/ceph-osd() [0x576edd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-67> 2013-05-18 17:03:00.227657 7fa408dbb780 5 asok(0x198a000) register_command perfcounters_dump hook 0x197f010
-66> 2013-05-18 17:03:00.227764 7fa408dbb780 5 asok(0x198a000) register_command 1 hook 0x197f010
-65> 2013-05-18 17:03:00.227768 7fa408dbb780 5 asok(0x198a000) register_command perf dump hook 0x197f010
-64> 2013-05-18 17:03:00.228067 7fa408dbb780 5 asok(0x198a000) register_command perfcounters_schema hook 0x197f010
-63> 2013-05-18 17:03:00.228348 7fa408dbb780 5 asok(0x198a000) register_command 2 hook 0x197f010
-62> 2013-05-18 17:03:00.228383 7fa408dbb780 5 asok(0x198a000) register_command perf schema hook 0x197f010
-61> 2013-05-18 17:03:00.228421 7fa408dbb780 5 asok(0x198a000) register_command config show hook 0x197f010
-60> 2013-05-18 17:03:00.228427 7fa408dbb780 5 asok(0x198a000) register_command config set hook 0x197f010
-59> 2013-05-18 17:03:00.228456 7fa408dbb780 5 asok(0x198a000) register_command log flush hook 0x197f010
-58> 2013-05-18 17:03:00.228460 7fa408dbb780 5 asok(0x198a000) register_command log dump hook 0x197f010
-57> 2013-05-18 17:03:00.228462 7fa408dbb780 5 asok(0x198a000) register_command log reopen hook 0x197f010
-56> 2013-05-18 17:03:00.345129 7fa408dbb780 0 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-osd, pid 3398
-55> 2013-05-18 17:03:00.351826 7fa408dbb780 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6800/3398 need_addr=1
-54> 2013-05-18 17:03:00.351869 7fa408dbb780 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6801/3398 need_addr=1
-53> 2013-05-18 17:03:00.351881 7fa408dbb780 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6802/3398 need_addr=1
-52> 2013-05-18 17:03:00.352944 7fa408dbb780 1 finished global_init_daemonize
-51> 2013-05-18 17:03:00.356999 7fa408dbb780 5 asok(0x198a000) init /var/run/ceph/ceph-osd.0.asok
-50> 2013-05-18 17:03:00.357073 7fa408dbb780 5 asok(0x198a000) bind_and_listen /var/run/ceph/ceph-osd.0.asok
-49> 2013-05-18 17:03:00.357111 7fa408dbb780 5 asok(0x198a000) register_command 0 hook 0x197e0b0
-48> 2013-05-18 17:03:00.357118 7fa408dbb780 5 asok(0x198a000) register_command version hook 0x197e0b0
-47> 2013-05-18 17:03:00.357124 7fa408dbb780 5 asok(0x198a000) register_command git_version hook 0x197e0b0
-46> 2013-05-18 17:03:00.357128 7fa408dbb780 5 asok(0x198a000) register_command help hook 0x197f0d0
-45> 2013-05-18 17:03:00.357218 7fa404a42700 5 asok(0x198a000) entry start
-44> 2013-05-18 17:03:00.676611 7fa408dbb780 -1 filestore(/var/lib/ceph/osd/ceph-0) limited size xattrs -- filestore_xattr_use_omap enabled
-43> 2013-05-18 17:03:00.891267 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is supported and appears to work
-42> 2013-05-18 17:03:00.891314 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
-41> 2013-05-18 17:03:00.891533 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount did NOT detect btrfs
-40> 2013-05-18 17:03:01.373741 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount syncfs(2) syscall fully supported (by glibc and kernel)
-39> 2013-05-18 17:03:01.374175 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount found snaps <>
-38> 2013-05-18 17:03:02.023315 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: btrfs not detected
-37> 2013-05-18 17:03:02.024912 7fa408dbb780 2 journal open /var/lib/ceph/osd/ceph-0/journal fsid af7f76fd-ccf6-42c2-99ac-0afbe255b1ef fs_op_seq 2320
-36> 2013-05-18 17:03:02.024992 7fa408dbb780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
-35> 2013-05-18 17:03:02.025372 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 21: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
-34> 2013-05-18 17:03:02.025503 7fa408dbb780 2 journal read_entry 9531392 : seq 2320 479 bytes
-33> 2013-05-18 17:03:02.025525 7fa408dbb780 2 journal No further valid entries found, journal is most likely valid
-32> 2013-05-18 17:03:02.025541 7fa408dbb780 2 journal No further valid entries found, journal is most likely valid
-31> 2013-05-18 17:03:02.025550 7fa408dbb780 3 journal journal_replay: end of journal, done.
-30> 2013-05-18 17:03:02.025580 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 21: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
-29> 2013-05-18 17:03:02.027063 7fa401a3c700 1 FileStore::op_tp worker finish
-28> 2013-05-18 17:03:02.027141 7fa40223d700 1 FileStore::op_tp worker finish
-27> 2013-05-18 17:03:02.027454 7fa408dbb780 1 journal close /var/lib/ceph/osd/ceph-0/journal
-26> 2013-05-18 17:03:02.028543 7fa408dbb780 10 monclient(hunting): build_initial_monmap
-25> 2013-05-18 17:03:02.073714 7fa408dbb780 5 adding auth protocol: cephx
-24> 2013-05-18 17:03:02.073741 7fa408dbb780 5 adding auth protocol: cephx
-23> 2013-05-18 17:03:02.074176 7fa408dbb780 1 -- 0.0.0.0:6800/3398 messenger.start
-22> 2013-05-18 17:03:02.074296 7fa408dbb780 1 -- :/0 messenger.start
-21> 2013-05-18 17:03:02.074343 7fa408dbb780 1 -- 0.0.0.0:6802/3398 messenger.start
-20> 2013-05-18 17:03:02.074424 7fa408dbb780 1 -- 0.0.0.0:6801/3398 messenger.start
-19> 2013-05-18 17:03:02.074997 7fa408dbb780 2 osd.0 0 mounting /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
-18> 2013-05-18 17:03:02.302070 7fa408dbb780 -1 filestore(/var/lib/ceph/osd/ceph-0) limited size xattrs -- filestore_xattr_use_omap enabled
-17> 2013-05-18 17:03:02.361438 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is supported and appears to work
-16> 2013-05-18 17:03:02.361508 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
-15> 2013-05-18 17:03:02.361755 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount did NOT detect btrfs
-14> 2013-05-18 17:03:02.424915 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount syncfs(2) syscall fully supported (by glibc and kernel)
-13> 2013-05-18 17:03:02.425107 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount found snaps <>
-12> 2013-05-18 17:03:02.519006 7fa408dbb780 0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: btrfs not detected
-11> 2013-05-18 17:03:02.520383 7fa408dbb780 2 journal open /var/lib/ceph/osd/ceph-0/journal fsid af7f76fd-ccf6-42c2-99ac-0afbe255b1ef fs_op_seq 2320
-10> 2013-05-18 17:03:02.520446 7fa408dbb780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
 -9> 2013-05-18 17:03:02.520507 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 29: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
 -8> 2013-05-18 17:03:02.520565 7fa408dbb780 2 journal read_entry 9531392 : seq 2320 479 bytes
 -7> 2013-05-18 17:03:02.520583 7fa408dbb780 2 journal No further valid entries found, journal is most likely valid
 -6> 2013-05-18 17:03:02.520596 7fa408dbb780 2 journal No further valid entries found, journal is most likely valid
 -5> 2013-05-18 17:03:02.520604 7fa408dbb780 3 journal journal_replay: end of journal, done.
 -4> 2013-05-18 17:03:02.520625 7fa408dbb780 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 29: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
 -3> 2013-05-18 17:03:02.521575 7fa408dbb780 2 osd.0 0 boot
 -2> 2013-05-18 17:03:02.522371 7fa408dbb780 0 osd.0 24 crush map has features 33816576, adjusting msgr requires for clients
 -1> 2013-05-18 17:03:02.522419 7fa408dbb780 0 osd.0 24 crush map has features 33816576, adjusting msgr requires for osds
  0> 2013-05-18 17:03:02.533617 7fa408dbb780 -1 *** Caught signal (Aborted) **
 in thread 7fa408dbb780

 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
 1: /usr/bin/ceph-osd() [0x79087a]
 2: (()+0xfcb0) [0x7fa408254cb0]
 3: (gsignal()+0x35) [0x7fa406a0d425]
 4: (abort()+0x17b) [0x7fa406a10b8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fa40735f69d]
 6: (()+0xb5846) [0x7fa40735d846]
 7: (()+0xb5873) [0x7fa40735d873]
 8: (()+0xb596e) [0x7fa40735d96e]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x127) [0x841227]
 10: (PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&, ceph::buffer::list*)+0x112) [0x6b6262]
 11: (OSD::load_pgs()+0x17f3) [0x63f803]
 12: (OSD::init()+0xf9d) [0x641e4d]
 13: (main()+0x2351) [0x574831]
 14: (__libc_start_main()+0xed) [0x7fa4069f876d]
 15: /usr/bin/ceph-osd() [0x576edd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   0/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 hadoop
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent 10000
  max_new 1000
  log_file /var/log/ceph/ceph-osd.0.log
--- end dump of recent events ---