Dear *,
has anybody been successful in migrating Filestore OSDs to Bluestore
OSDs while keeping the OSD number? There have been a number of messages
on the list reporting problems, and my experience is the same. (Removing
the existing OSD and creating a new one does work for me.)
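For reference, by "removing and creating a new one" I mean something
roughly along these lines, using OSD ID "999" on /dev/sdzz as a
placeholder like in the rest of this mail; the replacement OSD then
simply gets whatever ID the cluster hands out:

osd-node # systemctl stop ceph-osd@999.service
osd-node # umount /var/lib/ceph/osd/ceph-999
osd-node # ceph osd purge 999 --yes-i-really-mean-it
osd-node # ceph-volume lvm zap /dev/sdzz
osd-node # ceph-volume lvm create --bluestore --data /dev/sdzz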
I'm working on a Ceph 12.2.2 cluster and tried following
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd
- this basically says (my reading of the actual commands follows after
the list):
1. destroy old OSD
2. zap the disk
3. prepare the new OSD
4. activate the new OSD
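As far as I read that page, the corresponding commands would be
something like this (same placeholders as above; the activate step
wants the new OSD's fsid, which e.g. "ceph-volume lvm list" shows):

osd-node # ceph osd destroy 999 --yes-i-really-mean-it
osd-node # ceph-volume lvm zap /dev/sdzz
osd-node # ceph-volume lvm prepare --bluestore --osd-id 999 --data /dev/sdzz
osd-node # ceph-volume lvm activate 999 <new OSD fsid>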
I never got step 4 to complete. The closest I got was with the
following steps (assuming OSD ID "999" on /dev/sdzz; the commands are
repeated as one transcript after the list):
1. Stop the old OSD via systemd (osd-node # systemctl stop
ceph-osd@999.service)
2. umount the old OSD (osd-node # umount /var/lib/ceph/osd/ceph-999)
3a. if the old OSD was Bluestore with LVM, manually clean up the old
OSD's volume group
3b. zap the block device (osd-node # ceph-volume lvm zap /dev/sdzz)
4. destroy the old OSD (osd-node # ceph osd destroy 999
--yes-i-really-mean-it)
5. create a new OSD entry (osd-node # ceph osd new $(cat
/var/lib/ceph/osd/ceph-999/fsid) 999)
6. add the OSD secret to Ceph authentication (osd-node # ceph auth add
osd.999 mgr 'allow profile osd' osd 'allow *' mon 'allow profile osd'
-i /var/lib/ceph/osd/ceph-999/keyring)
7. prepare the new OSD (osd-node # ceph-volume lvm prepare --bluestore
--osd-id 999 --data /dev/sdzz)
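Put together as one transcript (with the manual VG cleanup from step 3a
done beforehand where applicable), that is:

osd-node # systemctl stop ceph-osd@999.service
osd-node # umount /var/lib/ceph/osd/ceph-999
osd-node # ceph-volume lvm zap /dev/sdzz
osd-node # ceph osd destroy 999 --yes-i-really-mean-it
osd-node # ceph osd new $(cat /var/lib/ceph/osd/ceph-999/fsid) 999
osd-node # ceph auth add osd.999 mgr 'allow profile osd' osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-999/keyring
osd-node # ceph-volume lvm prepare --bluestore --osd-id 999 --data /dev/sdzz
osd-node # systemctl start ceph-osd@999.service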
But ceph-osd keeps complaining "osdmap says I am destroyed, exiting"
on "osd-node # systemctl start ceph-osd@999.service".
At first I thought I was hitting http://tracker.ceph.com/issues/21023
(BlueStore-OSDs marked as destroyed in OSD-map after v12.1.1 to
v12.1.4 upgrade). But I was already using the "ceph osd new" command,
which didn't help.
Some hours of sleep later, I matched the issued commands against the
osdmap changes and the ceph-osd log messages, which revealed something
strange:
- from issuing "ceph osd destroy", osdmap lists the OSD as
"autoout,destroyed,exists" (no surprise here)
- once I issued "ceph osd new", osdmap lists the OSD as "autoout,exists,new"
- starting ceph-osd after "ceph osd new" reports "osdmap says I am
destroyed, exiting"
I can see in the ceph-osd log that it is referring to an *old* osdmap
epoch, roughly 45 minutes old by then.
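(For reference, the state flags and "modified" time of a given osdmap
epoch can be checked with something like

osd-node # ceph osd dump 110587 | grep -E '^epoch |^modified |^osd\.999 '

which prints the map header lines and the osd.999 entry for that epoch.)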
This got me curious, so I dug through the OSD log file, checking the
epoch numbers during start-up. I took some detours, so there are more
than two failed starts in the log ;) :
--- cut here ---
# first of multiple attempts, before "ceph auth add ..."
# no actual epoch referenced, as login failed due to missing auth
2018-01-10 00:00:02.173983 7f5cf1c89d00 0 osd.999 0 crush map has
features 288232575208783872, adjusting msgr requires for clients
2018-01-10 00:00:02.173990 7f5cf1c89d00 0 osd.999 0 crush map has
features 288232575208783872 was 8705, adjusting msgr requires for mons
2018-01-10 00:00:02.173994 7f5cf1c89d00 0 osd.999 0 crush map has
features 288232575208783872, adjusting msgr requires for osds
2018-01-10 00:00:02.174046 7f5cf1c89d00 0 osd.999 0 load_pgs
2018-01-10 00:00:02.174051 7f5cf1c89d00 0 osd.999 0 load_pgs opened 0 pgs
2018-01-10 00:00:02.174055 7f5cf1c89d00 0 osd.999 0 using
weightedpriority op queue with priority op cut off at 64.
2018-01-10 00:00:02.174891 7f5cf1c89d00 -1 osd.999 0 log_to_monitors
{default=true}
2018-01-10 00:00:02.177479 7f5cf1c89d00 -1 osd.999 0 init
authentication failed: (1) Operation not permitted
# after "ceph auth ..."
# note the different epochs below? BTW, 110587 is the current epoch at
that time and osd.999 is marked destroyed there
# 109892: much too old to offer any details
# 110587: modified 2018-01-09 23:43:13.202381
2018-01-10 00:08:00.945507 7fc55905bd00 0 osd.999 0 crush map has
features 288232575208783872, adjusting msgr requires for clients
2018-01-10 00:08:00.945514 7fc55905bd00 0 osd.999 0 crush map has
features 288232575208783872 was 8705, adjusting msgr requires for mons
2018-01-10 00:08:00.945521 7fc55905bd00 0 osd.999 0 crush map has
features 288232575208783872, adjusting msgr requires for osds
2018-01-10 00:08:00.945588 7fc55905bd00 0 osd.999 0 load_pgs
2018-01-10 00:08:00.945594 7fc55905bd00 0 osd.999 0 load_pgs opened 0 pgs
2018-01-10 00:08:00.945599 7fc55905bd00 0 osd.999 0 using
weightedpriority op queue with priority op cut off at 64.
2018-01-10 00:08:00.946544 7fc55905bd00 -1 osd.999 0 log_to_monitors
{default=true}
2018-01-10 00:08:00.951720 7fc55905bd00 0 osd.999 0 done with init,
starting boot process
2018-01-10 00:08:00.952225 7fc54160a700 -1 osd.999 0 waiting for
initial osdmap
2018-01-10 00:08:00.970644 7fc546614700 0 osd.999 109892 crush map
has features 288232610642264064, adjusting msgr requires for clients
2018-01-10 00:08:00.970653 7fc546614700 0 osd.999 109892 crush map
has features 288232610642264064 was 288232575208792577, adjusting msgr
requires for mons
2018-01-10 00:08:00.970660 7fc546614700 0 osd.999 109892 crush map
has features 1008808551021559808, adjusting msgr requires for osds
2018-01-10 00:08:01.349602 7fc546614700 -1 osd.999 110587 osdmap says
I am destroyed, exiting
# another try
# it is now using epoch 110587 for everything. But that one is off by
one at that time already:
# 110587: modified 2018-01-09 23:43:13.202381
# 110588: modified 2018-01-10 00:12:55.271913
# but both 110587 and 110588 have osd.999 as "destroyed", so never mind.
2018-01-10 00:13:04.332026 7f408d5a4d00 0 osd.999 110587 crush map
has features 288232610642264064, adjusting msgr requires for clients
2018-01-10 00:13:04.332037 7f408d5a4d00 0 osd.999 110587 crush map
has features 288232610642264064 was 8705, adjusting msgr requires for
mons
2018-01-10 00:13:04.332043 7f408d5a4d00 0 osd.999 110587 crush map
has features 1008808551021559808, adjusting msgr requires for osds
2018-01-10 00:13:04.332092 7f408d5a4d00 0 osd.999 110587 load_pgs
2018-01-10 00:13:04.332096 7f408d5a4d00 0 osd.999 110587 load_pgs
opened 0 pgs
2018-01-10 00:13:04.332100 7f408d5a4d00 0 osd.999 110587 using
weightedpriority op queue with priority op cut off at 64.
2018-01-10 00:13:04.332990 7f408d5a4d00 -1 osd.999 110587
log_to_monitors {default=true}
2018-01-10 00:13:06.026628 7f408d5a4d00 0 osd.999 110587 done with
init, starting boot process
2018-01-10 00:13:06.027627 7f4075352700 -1 osd.999 110587 osdmap says
I am destroyed, exiting
# the attempt after using "ceph osd new", which created epoch 110591
as the first with osd.999 as autoout,exists,new
# But ceph-osd still uses 110587.
# 110587: modified 2018-01-09 23:43:13.202381
# 110591: modified 2018-01-10 00:30:44.850078
2018-01-10 00:31:15.453871 7f1c57c58d00 0 osd.999 110587 crush map
has features 288232610642264064, adjusting msgr requires for clients
2018-01-10 00:31:15.453882 7f1c57c58d00 0 osd.999 110587 crush map
has features 288232610642264064 was 8705, adjusting msgr requires for
mons
2018-01-10 00:31:15.453887 7f1c57c58d00 0 osd.999 110587 crush map
has features 1008808551021559808, adjusting msgr requires for osds
2018-01-10 00:31:15.453940 7f1c57c58d00 0 osd.999 110587 load_pgs
2018-01-10 00:31:15.453945 7f1c57c58d00 0 osd.999 110587 load_pgs
opened 0 pgs
2018-01-10 00:31:15.453952 7f1c57c58d00 0 osd.999 110587 using
weightedpriority op queue with priority op cut off at 64.
2018-01-10 00:31:15.454862 7f1c57c58d00 -1 osd.999 110587
log_to_monitors {default=true}
2018-01-10 00:31:15.520533 7f1c57c58d00 0 osd.999 110587 done with
init, starting boot process
2018-01-10 00:31:15.521278 7f1c40207700 -1 osd.999 110587 osdmap says
I am destroyed, exiting
--- cut here ---
So why is ceph-osd referring to an old osdmap, while newer ones have
been available for quite some time already?
And am I right to believe that *if* ceph-osd had checked the
then-current osdmap, it would have started successfully (once I had
issued the "ceph osd new" that's not mentioned in the docs)?
Is the documented procedure (from the "master" HTML docs) correct, or
should the "ceph auth" and "ceph osd new" steps be added?
Regards,
Jens