On 01/26/2018 06:37 PM, David Majchrzak wrote:
Ran:
ceph auth del osd.0
ceph auth del osd.6
ceph auth del osd.7
ceph osd rm osd.0
ceph osd rm osd.6
ceph osd rm osd.7
which seems to have removed them.
Did you destroy the OSD prior to running ceph-volume?
$ ceph osd destroy 6
After you've done that you can use ceph-volume to re-create the OSD.
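Roughly like this (the device path here is just a placeholder, and the --yes-i-really-mean-it flag is from memory, so double-check it on your version):

ceph osd destroy 6 --yes-i-really-mean-it
ceph-volume lvm zap /dev/sdX          # only if the device still needs wiping
ceph-volume lvm create --bluestore --data /dev/sdX --osd-id 6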
Wido
Thanks for the help Reed!
Kind Regards,
David Majchrzak
On 26 Jan 2018, at 18:32, David Majchrzak <david@xxxxxxxxxx> wrote:
Thanks that helped!
Since I had already "halfway" created an LVM volume, I wanted to start
from the beginning and zap it.
Tried to zap the raw device, but that failed since --destroy doesn't seem
to be available in 12.2.2:
http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
root@int1:~# ceph-volume lvm zap /dev/sdc --destroy
usage: ceph-volume lvm zap [-h] [DEVICE]
ceph-volume lvm zap: error: unrecognized arguments: --destroy
So I zapped it with the VG/LV path instead:
ceph-volume lvm zap
/dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
However, I couldn't run create on it since the LV was still there.
So I zapped it with sgdisk and ran dmsetup remove. After that I was
able to create it again.
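For reference, the manual cleanup was roughly the following (a sketch; /dev/sdc is my device, and the mapper name has to be taken from dmsetup ls):

sgdisk --zap-all /dev/sdc
dmsetup ls            # find the ceph VG / osd-block LV mapping
dmsetup remove <mapper name from the output above>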
However, each "ceph-volume lvm create" run that failed still
successfully added an osd to the crush map ;)
So I've got this now:
root@int1:~# ceph osd df tree
ID CLASS WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
-1       2.60959        - 2672G   1101G  1570G 41.24 1.00   - root default
-2       0.87320        -  894G    369G   524G 41.36 1.00   -     host int1
 3   ssd 0.43660  1.00000  447G    358G 90295M 80.27 1.95 301         osd.3
 8   ssd 0.43660  1.00000  447G  11273M   436G  2.46 0.06  19         osd.8
-3       0.86819        -  888G    366G   522G 41.26 1.00   -     host int2
 1   ssd 0.43159  1.00000  441G    167G   274G 37.95 0.92 147         osd.1
 4   ssd 0.43660  1.00000  447G    199G   247G 44.54 1.08 173         osd.4
-4       0.86819        -  888G    365G   523G 41.09 1.00   -     host int3
 2   ssd 0.43159  1.00000  441G    193G   248G 43.71 1.06 174         osd.2
 5   ssd 0.43660  1.00000  447G    172G   274G 38.51 0.93 146         osd.5
 0             0        0     0       0      0     0    0   0 osd.0
 6             0        0     0       0      0     0    0   0 osd.6
 7             0        0     0       0      0     0    0   0 osd.7
I guess I can just remove them from crush and auth, and then rm them?
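If so, something like this per stray ID, reusing Reed's removal commands (a sketch; crush remove will probably just report that the item doesn't exist, since these never made it into a host bucket):

for ID in 0 6 7; do
  ceph osd crush remove osd.$ID
  ceph auth del osd.$ID
  ceph osd rm osd.$ID
done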
Kind Regards,
David Majchrzak
On 26 Jan 2018, at 18:09, Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
This is the exact issue that I ran into when starting my bluestore
conversion journey.
See my thread here:
https://www.spinics.net/lists/ceph-users/msg41802.html
Specifying --osd-id causes it to fail.
Below are my steps for OSD replace/migrate from filestore to bluestore.
BIG caveat here: I am doing destructive replacement, in that I am not
allowing my objects to be migrated off of the OSD I’m replacing before
nuking it.
With 8TB drives it just takes way too long, and I trust my failure
domains and other hardware to get me through the backfills.
So instead of 1) reading data off and writing it elsewhere, 2)
remove/re-add, 3) reading the data elsewhere and writing it back on, I am
taking step one out and trusting my two other copies of the objects. Just
wanted to clarify my steps.
I also set norecover and norebalance flags immediately prior to
running these commands so that it doesn’t try to start moving data
unnecessarily. Then when done, remove those flags, and let it backfill.
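The flag handling itself is just the following, wrapped around the steps below (a sketch, assuming cluster-wide flags are acceptable for the duration):

ceph osd set norecover
ceph osd set norebalance
# ... replace the OSD(s) as per the steps below ...
ceph osd unset norecover
ceph osd unset norebalance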
systemctl stop ceph-osd@$ID.service
ceph-osd -i $ID --flush-journal
umount /var/lib/ceph/osd/ceph-$ID
ceph-volume lvm zap /dev/$ID
ceph osd crush remove osd.$ID
ceph auth del osd.$ID
ceph osd rm osd.$ID
ceph-volume lvm create --bluestore --data /dev/$DATA --block.db
/dev/$NVME
So essentially I fully remove the OSD from crush and the osdmap, and
when I add the OSD back, like I would a new OSD, it fills in the
numeric gap with the $ID it had before.
Hope this is helpful.
Been working well for me so far, doing 3 OSDs at a time (half of a
failure domain).
Reed
On Jan 26, 2018, at 10:01 AM, David <david@xxxxxxxxxx> wrote:
Hi!
On luminous 12.2.2
I'm migrating some OSDs from filestore to bluestore using the
"simple" method as described in docs:
http://docs.ceph.com/docs/master/rados/operations/bluestore-migration/#convert-existing-osds
Mark out and Replace.
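For context, that replace procedure boils down to roughly the following (my condensed paraphrase of the doc page, with $ID/$DEVICE as placeholders; the doc numbers the steps more finely):

ceph osd out $ID
# wait for the data to migrate off; ceph osd safe-to-destroy $ID should eventually report OK
systemctl stop ceph-osd@$ID
ceph-volume lvm zap $DEVICE
ceph osd destroy $ID --yes-i-really-mean-it
ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID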
However, at step 9: ceph-volume create --bluestore --data $DEVICE
--osd-id $ID
it seems to create the bluestore OSD, but it fails to authenticate with
the old osd-id's auth.
(the command in the docs is also missing the lvm or simple subcommand)
I think it's related to this:
http://tracker.ceph.com/issues/22642
# ceph-volume lvm create --bluestore --data /dev/sdc --osd-id 0
Running command: sudo vgcreate --force --yes
ceph-efad7df8-721d-43d8-8d02-449406e70b90 /dev/sdc
stderr: WARNING: lvmetad is running but disabled. Restart lvmetad
before enabling it!
stdout: Physical volume "/dev/sdc" successfully created
stdout: Volume group "ceph-efad7df8-721d-43d8-8d02-449406e70b90"
successfully created
Running command: sudo lvcreate --yes -l 100%FREE -n
osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
ceph-efad7df8-721d-43d8-8d02-449406e70b90
stderr: WARNING: lvmetad is running but disabled. Restart lvmetad
before enabling it!
stdout: Logical volume
"osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9" created.
Running command: sudo mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
Running command: chown -R ceph:ceph /dev/dm-4
Running command: sudo ln -s
/dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
/var/lib/ceph/osd/ceph-0/block
Running command: sudo ceph --cluster ceph --name
client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o
/var/lib/ceph/osd/ceph-0/activate.monmap
stderr: got monmap epoch 2
Running command: ceph-authtool /var/lib/ceph/osd/ceph-0/keyring
--create-keyring --name osd.0 --add-key XXXXXXXX
stdout: creating /var/lib/ceph/osd/ceph-0/keyring
stdout: added entity osd.0 auth auth(auid = 18446744073709551615
key= XXXXXXXX with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
Running command: sudo ceph-osd --cluster ceph --osd-objectstore
bluestore --mkfs -i 0 --monmap
/var/lib/ceph/osd/ceph-0/activate.monmap --key
**************************************** --osd-data
/var/lib/ceph/osd/ceph-0/ --osd-uuid
138ce507-f28a-45bf-814c-7fa124a9d9b9 --setuser ceph --setgroup ceph
stderr: 2018-01-26 14:59:10.039549 7fd7ef951cc0 -1
bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable
to decode label at offset 102: buffer::malformed_input: void
bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode
past end of struct encoding
stderr: 2018-01-26 14:59:10.039744 7fd7ef951cc0 -1
bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable
to decode label at offset 102: buffer::malformed_input: void
bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode
past end of struct encoding
stderr: 2018-01-26 14:59:10.039925 7fd7ef951cc0 -1
bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable
to decode label at offset 102: buffer::malformed_input: void
bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode
past end of struct encoding
stderr: 2018-01-26 14:59:10.039984 7fd7ef951cc0 -1
bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
stderr: 2018-01-26 14:59:11.359951 7fd7ef951cc0 -1 key XXXXXXXX
stderr: 2018-01-26 14:59:11.888476 7fd7ef951cc0 -1 created object
store /var/lib/ceph/osd/ceph-0/ for osd.0 fsid
efad7df8-721d-43d8-8d02-449406e70b90
Running command: sudo ceph-bluestore-tool --cluster=ceph
prime-osd-dir --dev
/dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
--path /var/lib/ceph/osd/ceph-0
Running command: sudo ln -snf
/dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
/var/lib/ceph/osd/ceph-0/block
Running command: chown -R ceph:ceph /dev/dm-4
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
Running command: sudo systemctl enable
ceph-volume@lvm-0-138ce507-f28a-45bf-814c-7fa124a9d9b9
stderr: Created symlink from
/etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-138ce507-f28a-45bf-814c-7fa124a9d9b9.service
to /lib/systemd/system/ceph-volume@.service.
Running command: sudo systemctl start ceph-osd@0
ceph-osd.0.log shows:
2018-01-26 15:09:07.379039 7f545d3b9cc0 4 rocksdb:
[/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2859] Recovered from manifest
file:db/MANIFEST-000095 succeeded,manifest_file_number is 95,
next_file_number is 97, last_sequence is 21, log_number is
0,prev_log_number is 0,max_column_family is 0
2018-01-26 15:09:07.379046 7f545d3b9cc0 4 rocksdb:
[/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log
number is 94
2018-01-26 15:09:07.379087 7f545d3b9cc0 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1516979347379083, "job": 1, "event":
"recovery_started", "log_files": [96]}
2018-01-26 15:09:07.379091 7f545d3b9cc0 4 rocksdb:
[/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:482] Recovering log #96 mode 0
2018-01-26 15:09:07.379102 7f545d3b9cc0 4 rocksdb:
[/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2395] Creating manifest 98
2018-01-26 15:09:07.380466 7f545d3b9cc0 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1516979347380463, "job": 1, "event":
"recovery_finished"}
2018-01-26 15:09:07.381331 7f545d3b9cc0 4 rocksdb:
[/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:1063] DB pointer 0x556ecb8c3000
2018-01-26 15:09:07.381353 7f545d3b9cc0 1
bluestore(/var/lib/ceph/osd/ceph-0) _open_db opened rocksdb path db
options
compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
2018-01-26 15:09:07.381616 7f545d3b9cc0 1 freelist init
2018-01-26 15:09:07.381660 7f545d3b9cc0 1
bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation
metadata
2018-01-26 15:09:07.381679 7f545d3b9cc0 1
bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 447 G in 1
extents
2018-01-26 15:09:07.382077 7f545d3b9cc0 0 _get_class not permitted
to load kvs
2018-01-26 15:09:07.382309 7f545d3b9cc0 0 <cls>
/build/ceph-12.2.2/src/cls/cephfs/cls_cephfs.cc:197: loading cephfs
2018-01-26 15:09:07.382583 7f545d3b9cc0 0 _get_class not permitted
to load sdk
2018-01-26 15:09:07.382827 7f545d3b9cc0 0 <cls>
/build/ceph-12.2.2/src/cls/hello/cls_hello.cc:296: loading cls_hello
2018-01-26 15:09:07.385755 7f545d3b9cc0 0 _get_class not permitted
to load lua
2018-01-26 15:09:07.386073 7f545d3b9cc0 0 osd.0 0 crush map has
features 288232575208783872, adjusting msgr requires for clients
2018-01-26 15:09:07.386078 7f545d3b9cc0 0 osd.0 0 crush map has
features 288232575208783872 was 8705, adjusting msgr requires for mons
2018-01-26 15:09:07.386079 7f545d3b9cc0 0 osd.0 0 crush map has
features 288232575208783872, adjusting msgr requires for osds
2018-01-26 15:09:07.386132 7f545d3b9cc0 0 osd.0 0 load_pgs
2018-01-26 15:09:07.386134 7f545d3b9cc0 0 osd.0 0 load_pgs opened 0 pgs
2018-01-26 15:09:07.386137 7f545d3b9cc0 0 osd.0 0 using
weightedpriority op queue with priority op cut off at 64.
2018-01-26 15:09:07.386580 7f545d3b9cc0 -1 osd.0 0 log_to_monitors
{default=true}
2018-01-26 15:09:07.388077 7f545d3b9cc0 -1 osd.0 0 init
authentication failed: (1) Operation not permitted
The old osd is still there.
# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS    REWEIGHT PRI-AFF
-1       2.60458 root default
-2       0.86819     host int1
 0   ssd 0.43159         osd.0  destroyed        0 1.00000
 3   ssd 0.43660         osd.3  up         1.00000 1.00000
-3       0.86819     host int2
 1   ssd 0.43159         osd.1  up         1.00000 1.00000
 4   ssd 0.43660         osd.4  up         1.00000 1.00000
-4       0.86819     host int3
 2   ssd 0.43159         osd.2  up         1.00000 1.00000
 5   ssd 0.43660         osd.5  up         1.00000 1.00000
What's the best course of action? Purging osd.0, zapping the device
again and creating without --osd-id set?
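i.e. roughly this (a sketch of what I have in mind; ceph osd purge should collapse the crush remove / auth del / osd rm steps, and the zap targets the LV that got created above):

ceph osd purge 0 --yes-i-really-mean-it
ceph-volume lvm zap /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
ceph-volume lvm create --bluestore --data /dev/sdc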
Kind Regards,
David Majchrzak
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com