On Wed, Jan 10, 2018 at 8:57 AM, Jens-U. Mozdzen <jmozdzen@xxxxxx> wrote:
> Dear *,
>
> has anybody successfully migrated Filestore OSDs to Bluestore OSDs while
> keeping the OSD number? There have been a number of messages on the list
> reporting problems, and my experience is the same. (Removing the existing
> OSD and creating a new one does work for me.)
>
> I'm working on a Ceph 12.2.2 cluster and tried following
> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd
> - this basically says:
>
> 1. destroy the old OSD
> 2. zap the disk
> 3. prepare the new OSD
> 4. activate the new OSD
>
> I never got step 4 to complete. The closest I got was by doing the
> following steps (assuming OSD ID "999" on /dev/sdzz):
>
> 1. Stop the old OSD via systemd (osd-node # systemctl stop
> ceph-osd@999.service)
>
> 2. Unmount the old OSD (osd-node # umount /var/lib/ceph/osd/ceph-999)
>
> 3a. If the old OSD was Bluestore with LVM, manually clean up the old
> OSD's volume group
>
> 3b. Zap the block device (osd-node # ceph-volume lvm zap /dev/sdzz)
>
> 4. Destroy the old OSD (osd-node # ceph osd destroy 999
> --yes-i-really-mean-it)
>
> 5. Create a new OSD entry (osd-node # ceph osd new $(cat
> /var/lib/ceph/osd/ceph-999/fsid) 999)

Steps 5 and 6 are problematic if you are going to try ceph-volume later
on, which takes care of doing this for you.

> 6. Add the OSD secret to Ceph authentication (osd-node # ceph auth add
> osd.999 mgr 'allow profile osd' osd 'allow *' mon 'allow profile osd' -i
> /var/lib/ceph/osd/ceph-999/keyring)
>
> 7. Prepare the new OSD (osd-node # ceph-volume lvm prepare --bluestore
> --osd-id 999 --data /dev/sdzz)

You are going to hit a bug in ceph-volume that prevents you from
specifying the OSD id directly if that ID has been destroyed.
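With those corrections folded in, the whole replacement flow can be
sketched as a short script. This is only a sketch under assumptions (OSD
id 999 on /dev/sdzz, a Luminous ceph-volume with the lvm subcommand); it
deliberately skips `ceph osd new` and `ceph auth add` and does not pass
--osd-id:

```shell
# Sketch of the in-place replacement flow discussed above, folding in
# the advice from this thread: no `ceph osd new`, no `ceph auth add`,
# and no --osd-id. Assumed example values: OSD id 999, device /dev/sdzz.
# Set DRY_RUN=echo to print the commands instead of running them.
replace_osd() {
    osd_id=$1
    dev=$2
    run=${DRY_RUN:-}

    $run systemctl stop "ceph-osd@${osd_id}.service"
    $run umount "/var/lib/ceph/osd/ceph-${osd_id}"
    $run ceph-volume lvm zap "$dev"
    $run ceph osd destroy "$osd_id" --yes-i-really-mean-it

    # ceph-volume creates the new OSD entry and its auth key itself;
    # the id you get back is whatever the cluster hands out next.
    $run ceph-volume lvm prepare --bluestore --data "$dev"
    $run ceph-volume lvm activate --all
}
```

Running `DRY_RUN=echo replace_osd 999 /dev/sdzz` only prints the
commands, which is handy for review before touching a live cluster.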
See http://tracker.ceph.com/issues/22642

In order for this to work, you need to make sure that the ID has really
been destroyed, and avoid passing --osd-id to ceph-volume. The caveat is
that you will get whatever ID is available next in the cluster.

> but ceph-osd keeps complaining "osdmap says I am destroyed, exiting" on
> "osd-node # systemctl start ceph-osd@999.service".
>
> At first I felt I was hitting http://tracker.ceph.com/issues/21023
> (BlueStore OSDs marked as destroyed in the OSD map after a v12.1.1 to
> v12.1.4 upgrade). But I was already using the "ceph osd new" command,
> which didn't help.
>
> Some hours of sleep later, I matched the issued commands to the osdmap
> changes and the ceph-osd log messages, which revealed something strange:
>
> - after issuing "ceph osd destroy", osdmap lists the OSD as
> "autoout,destroyed,exists" (no surprise here)
> - once I issued "ceph osd new", osdmap lists the OSD as
> "autoout,exists,new"
> - starting ceph-osd after "ceph osd new" reports "osdmap says I am
> destroyed, exiting"
>
> I can see in the ceph-osd log that it is relating to an *old* osdmap
> epoch, roughly 45 minutes old by then.
>
> This got me curious and I dug through the OSD log file, checking the
> epoch numbers during start-up:
>
> I took some detours, so there's more than two failed starts in the OSD
> log file ;) :
>
> --- cut here ---
> # first of multiple attempts, before "ceph auth add ..."
> # no actual epoch referenced, as login failed due to missing auth
> 2018-01-10 00:00:02.173983 7f5cf1c89d00 0 osd.999 0 crush map has features
> 288232575208783872, adjusting msgr requires for clients
> 2018-01-10 00:00:02.173990 7f5cf1c89d00 0 osd.999 0 crush map has features
> 288232575208783872 was 8705, adjusting msgr requires for mons
> 2018-01-10 00:00:02.173994 7f5cf1c89d00 0 osd.999 0 crush map has features
> 288232575208783872, adjusting msgr requires for osds
> 2018-01-10 00:00:02.174046 7f5cf1c89d00 0 osd.999 0 load_pgs
> 2018-01-10 00:00:02.174051 7f5cf1c89d00 0 osd.999 0 load_pgs opened 0 pgs
> 2018-01-10 00:00:02.174055 7f5cf1c89d00 0 osd.999 0 using weightedpriority
> op queue with priority op cut off at 64.
> 2018-01-10 00:00:02.174891 7f5cf1c89d00 -1 osd.999 0 log_to_monitors
> {default=true}
> 2018-01-10 00:00:02.177479 7f5cf1c89d00 -1 osd.999 0 init authentication
> failed: (1) Operation not permitted
>
> # after "ceph auth ..."
> # note the different epochs below? BTW, 110587 is the current epoch at that
> time and osd.999 is marked destroyed there
> # 109892: much too old to offer any details
> # 110587: modified 2018-01-09 23:43:13.202381
>
> 2018-01-10 00:08:00.945507 7fc55905bd00 0 osd.999 0 crush map has features
> 288232575208783872, adjusting msgr requires for clients
> 2018-01-10 00:08:00.945514 7fc55905bd00 0 osd.999 0 crush map has features
> 288232575208783872 was 8705, adjusting msgr requires for mons
> 2018-01-10 00:08:00.945521 7fc55905bd00 0 osd.999 0 crush map has features
> 288232575208783872, adjusting msgr requires for osds
> 2018-01-10 00:08:00.945588 7fc55905bd00 0 osd.999 0 load_pgs
> 2018-01-10 00:08:00.945594 7fc55905bd00 0 osd.999 0 load_pgs opened 0 pgs
> 2018-01-10 00:08:00.945599 7fc55905bd00 0 osd.999 0 using weightedpriority
> op queue with priority op cut off at 64.
> 2018-01-10 00:08:00.946544 7fc55905bd00 -1 osd.999 0 log_to_monitors
> {default=true}
> 2018-01-10 00:08:00.951720 7fc55905bd00 0 osd.999 0 done with init,
> starting boot process
> 2018-01-10 00:08:00.952225 7fc54160a700 -1 osd.999 0 waiting for initial
> osdmap
> 2018-01-10 00:08:00.970644 7fc546614700 0 osd.999 109892 crush map has
> features 288232610642264064, adjusting msgr requires for clients
> 2018-01-10 00:08:00.970653 7fc546614700 0 osd.999 109892 crush map has
> features 288232610642264064 was 288232575208792577, adjusting msgr requires
> for mons
> 2018-01-10 00:08:00.970660 7fc546614700 0 osd.999 109892 crush map has
> features 1008808551021559808, adjusting msgr requires for osds
> 2018-01-10 00:08:01.349602 7fc546614700 -1 osd.999 110587 osdmap says I am
> destroyed, exiting
>
> # another try
> # it is now using epoch 110587 for everything. But that one is off by one at
> that time already:
> # 110587: modified 2018-01-09 23:43:13.202381
> # 110588: modified 2018-01-10 00:12:55.271913
>
> # but both 110587 and 110588 have osd.999 as "destroyed", so never mind.
> 2018-01-10 00:13:04.332026 7f408d5a4d00 0 osd.999 110587 crush map has
> features 288232610642264064, adjusting msgr requires for clients
> 2018-01-10 00:13:04.332037 7f408d5a4d00 0 osd.999 110587 crush map has
> features 288232610642264064 was 8705, adjusting msgr requires for mons
> 2018-01-10 00:13:04.332043 7f408d5a4d00 0 osd.999 110587 crush map has
> features 1008808551021559808, adjusting msgr requires for osds
> 2018-01-10 00:13:04.332092 7f408d5a4d00 0 osd.999 110587 load_pgs
> 2018-01-10 00:13:04.332096 7f408d5a4d00 0 osd.999 110587 load_pgs opened 0
> pgs
> 2018-01-10 00:13:04.332100 7f408d5a4d00 0 osd.999 110587 using
> weightedpriority op queue with priority op cut off at 64.
> 2018-01-10 00:13:04.332990 7f408d5a4d00 -1 osd.999 110587 log_to_monitors
> {default=true}
> 2018-01-10 00:13:06.026628 7f408d5a4d00 0 osd.999 110587 done with init,
> starting boot process
> 2018-01-10 00:13:06.027627 7f4075352700 -1 osd.999 110587 osdmap says I am
> destroyed, exiting
>
> # the attempt after using "ceph osd new", which created epoch 110591 as the
> first with osd.999 as autoout,exists,new
> # But ceph-osd still uses 110587.
> # 110587: modified 2018-01-09 23:43:13.202381
> # 110591: modified 2018-01-10 00:30:44.850078
>
> 2018-01-10 00:31:15.453871 7f1c57c58d00 0 osd.999 110587 crush map has
> features 288232610642264064, adjusting msgr requires for clients
> 2018-01-10 00:31:15.453882 7f1c57c58d00 0 osd.999 110587 crush map has
> features 288232610642264064 was 8705, adjusting msgr requires for mons
> 2018-01-10 00:31:15.453887 7f1c57c58d00 0 osd.999 110587 crush map has
> features 1008808551021559808, adjusting msgr requires for osds
> 2018-01-10 00:31:15.453940 7f1c57c58d00 0 osd.999 110587 load_pgs
> 2018-01-10 00:31:15.453945 7f1c57c58d00 0 osd.999 110587 load_pgs opened 0
> pgs
> 2018-01-10 00:31:15.453952 7f1c57c58d00 0 osd.999 110587 using
> weightedpriority op queue with priority op cut off at 64.
> 2018-01-10 00:31:15.454862 7f1c57c58d00 -1 osd.999 110587 log_to_monitors
> {default=true}
> 2018-01-10 00:31:15.520533 7f1c57c58d00 0 osd.999 110587 done with init,
> starting boot process
> 2018-01-10 00:31:15.521278 7f1c40207700 -1 osd.999 110587 osdmap says I am
> destroyed, exiting
> --- cut here ---
>
> So why is ceph-osd referring to an old osdmap, while newer ones have been
> available for some time already?
>
> And am I right to believe that *if* ceph-osd had checked the then-current
> osdmap, it would have started successfully (once I did the "ceph osd new"
> that's not mentioned in the docs)?
>
> Is the documented procedure (from the "master" HTML docs) correct, or
> should the "ceph auth" and "ceph osd new" steps get added?
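As a sanity check between attempts, it helps to confirm which flags the
*current* osdmap records for the OSD, rather than trusting whichever epoch
the daemon happens to log. A minimal sketch follows; the sample line is
made up to stand in for live `ceph osd dump` output, and the field
position of the flags is an assumption about the dump format:

```shell
# Pull the state flags (e.g. "autoout,destroyed,exists") for one OSD out
# of `ceph osd dump`-style output. The flags are assumed to be the
# second-to-last whitespace-separated field, just before the OSD uuid.
osd_state() {
    grep "^osd\.$1 " | awk '{ print $(NF - 1) }'
}

# Hypothetical sample line; on a live cluster, pipe `ceph osd dump` in
# instead: ceph osd dump | osd_state 999
sample='osd.999 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - autoout,destroyed,exists 1fca0e06-0000-0000-0000-000000000000'
echo "$sample" | osd_state 999   # -> autoout,destroyed,exists
```

If this still prints a state containing "destroyed" right before a start
attempt, the "osdmap says I am destroyed, exiting" failure is expected no
matter which epoch ceph-osd reports.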
>
> Regards,
> Jens
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com