On Fri, Jan 26, 2018 at 5:00 PM, Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
> Bit late for this to be helpful, but instead of zapping the lvm labels, you
> could alternatively destroy the lvm volume by hand.
>
> lvremove -f <volume_group>/<logical_volume>
> vgremove <volume_group>
> pvremove /dev/ceph-device (should wipe labels)
>
> Then you should be able to run 'ceph-volume lvm zap /dev/sdX' and retry the
> 'ceph-volume lvm create' command (sans --osd-id flag) and it should run as
> well.
>
> This info will hopefully be useful for those not as well versed with lvm as
> I am/was at the time I needed this info.
>
> Reed
>
> On Jan 26, 2018, at 11:32 AM, David Majchrzak <david@xxxxxxxxxx> wrote:
>
> Thanks, that helped!
>
> Since I had already "halfway" created an lvm volume, I wanted to start from
> the beginning and zap it.
>
> I tried to zap the raw device, but that failed since --destroy doesn't seem
> to be in 12.2.2:
>
> http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
>
> root@int1:~# ceph-volume lvm zap /dev/sdc --destroy
> usage: ceph-volume lvm zap [-h] [DEVICE]
> ceph-volume lvm zap: error: unrecognized arguments: --destroy
>
> So I zapped it with the vg/lv instead:
> ceph-volume lvm zap
> /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
>
> However, I couldn't run create on it since the LVM was already there.
> So I zapped it with sgdisk and ran dmsetup remove. After that I was able to
> create it again.
>
> However - each failed "ceph-volume lvm create" run still successfully added
> an osd to the crush map ;)

You are hitting a few known issues here, and some missing features that are
already in master (planned for Mimic) but not in Luminous:

* --osd-id cannot be used currently; just make sure the OSD is destroyed so
  that ceph-volume can pick the next available ID (not fixed yet:
  http://tracker.ceph.com/issues/22642)
* When creating an OSD fails, the ID is still created and left over (fixed
  with: http://tracker.ceph.com/issues/22704)
* --destroy helps with full removal of logical volumes and their groups when
  zapping (fixed, only in master: http://tracker.ceph.com/issues/22653)

To recap, borrowing from all the good suggestions here:

* Don't use --osd-id, just let ceph-volume grab the next one available
* Ensure that the ID is fully removed, including the auth
* If deploying the OSD fails, repeat the manual OSD removal, and remove the
  vg/lv by hand with: `sudo vgremove <vg>` (this will remove the lv and pv
  associated with it)
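Putting those pieces together, a rough sketch of that manual cleanup (with
$ID standing in for the leftover OSD id, <volume_group> for the ceph-* vg on
the device, and /dev/sdX for the device itself, so adjust the names to your
setup) would look like:

ceph osd crush remove osd.$ID
ceph auth del osd.$ID
ceph osd rm osd.$ID
sudo vgremove <volume_group>    # also removes the lv and pv (no --destroy in 12.2.2)
ceph-volume lvm zap /dev/sdX
ceph-volume lvm create --bluestore --data /dev/sdX

After that, ceph-volume should pick up the freed ID as the next one
available, without --osd-id.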
>
> So I've got this now:
>
> root@int1:~# ceph osd df tree
> ID CLASS WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
> -1       2.60959        -  2672G  1101G  1570G 41.24 1.00   - root default
> -2       0.87320        -   894G   369G   524G 41.36 1.00   -     host int1
>  3   ssd 0.43660  1.00000   447G   358G 90295M 80.27 1.95 301         osd.3
>  8   ssd 0.43660  1.00000   447G 11273M   436G  2.46 0.06  19         osd.8
> -3       0.86819        -   888G   366G   522G 41.26 1.00   -     host int2
>  1   ssd 0.43159  1.00000   441G   167G   274G 37.95 0.92 147         osd.1
>  4   ssd 0.43660  1.00000   447G   199G   247G 44.54 1.08 173         osd.4
> -4       0.86819        -   888G   365G   523G 41.09 1.00   -     host int3
>  2   ssd 0.43159  1.00000   441G   193G   248G 43.71 1.06 174         osd.2
>  5   ssd 0.43660  1.00000   447G   172G   274G 38.51 0.93 146         osd.5
>  0             0        0      0      0      0     0    0   0         osd.0
>  6             0        0      0      0      0     0    0   0         osd.6
>  7             0        0      0      0      0     0    0   0         osd.7
>
> I guess I can just remove them from crush, auth and rm them?
>
> Kind Regards,
>
> David Majchrzak
>
> On Jan 26, 2018, at 18:09, Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
>
> This is the exact issue that I ran into when starting my bluestore
> conversion journey.
>
> See my thread here: https://www.spinics.net/lists/ceph-users/msg41802.html
>
> Specifying --osd-id causes it to fail.
>
> Below are my steps for OSD replace/migrate from filestore to bluestore.
>
> BIG caveat here in that I am doing a destructive replacement, in that I am
> not allowing my objects to be migrated off of the OSD I'm replacing before
> nuking it.
> With 8TB drives it just takes way too long, and I trust my failure domains
> and other hardware to get me through the backfills.
> So instead of 1) reading data off, writing data elsewhere, 2) remove/re-add,
> 3) reading data elsewhere, writing back on, I am taking step one out and
> trusting my two other copies of the objects. Just wanted to clarify my
> steps.
>
> I also set the norecover and norebalance flags immediately prior to running
> these commands so that it doesn't try to start moving data unnecessarily.
> Then, when done, remove those flags and let it backfill.
>
> systemctl stop ceph-osd@$ID.service
> ceph-osd -i $ID --flush-journal
> umount /var/lib/ceph/osd/ceph-$ID
> ceph-volume lvm zap /dev/$ID
> ceph osd crush remove osd.$ID
> ceph auth del osd.$ID
> ceph osd rm osd.$ID
> ceph-volume lvm create --bluestore --data /dev/$DATA --block.db /dev/$NVME
>
> So essentially I fully remove the OSD from crush and the osdmap, and when I
> add the OSD back, like I would a new OSD, it fills in the numeric gap with
> the $ID it had before.
>
> Hope this is helpful.
> Been working well for me so far, doing 3 OSDs at a time (half of a failure
> domain).
>
> Reed
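For anyone following along, setting and clearing the flags Reed mentions is
just the usual cluster flag toggles, roughly:

ceph osd set norecover
ceph osd set norebalance
# ... do the OSD replacement here ...
ceph osd unset norebalance
ceph osd unset norecover

so nothing starts moving data until the new OSD is back in place.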
> On Jan 26, 2018, at 10:01 AM, David <david@xxxxxxxxxx> wrote:
>
> Hi!
>
> On luminous 12.2.2
>
> I'm migrating some OSDs from filestore to bluestore using the "simple"
> method as described in the docs:
> http://docs.ceph.com/docs/master/rados/operations/bluestore-migration/#convert-existing-osds
> Mark out and Replace.
>
> However, at step 9: ceph-volume create --bluestore --data $DEVICE --osd-id $ID
> it seems to create the bluestore OSD, but it fails to authenticate with the
> old osd-id auth.
> (the command above is also missing lvm or simple)
>
> I think it's related to this:
> http://tracker.ceph.com/issues/22642
>
> # ceph-volume lvm create --bluestore --data /dev/sdc --osd-id 0
> Running command: sudo vgcreate --force --yes ceph-efad7df8-721d-43d8-8d02-449406e70b90 /dev/sdc
> stderr: WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
> stdout: Physical volume "/dev/sdc" successfully created
> stdout: Volume group "ceph-efad7df8-721d-43d8-8d02-449406e70b90" successfully created
> Running command: sudo lvcreate --yes -l 100%FREE -n osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 ceph-efad7df8-721d-43d8-8d02-449406e70b90
> stderr: WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
> stdout: Logical volume "osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9" created.
> Running command: sudo mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
> Running command: chown -R ceph:ceph /dev/dm-4
> Running command: sudo ln -s /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 /var/lib/ceph/osd/ceph-0/block
> Running command: sudo ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
> stderr: got monmap epoch 2
> Running command: ceph-authtool /var/lib/ceph/osd/ceph-0/keyring --create-keyring --name osd.0 --add-key XXXXXXXX
> stdout: creating /var/lib/ceph/osd/ceph-0/keyring
> stdout: added entity osd.0 auth auth(auid = 18446744073709551615 key= XXXXXXXX with 0 caps)
> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
> Running command: sudo ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --key **************************************** --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 138ce507-f28a-45bf-814c-7fa124a9d9b9 --setuser ceph --setgroup ceph
> stderr: 2018-01-26 14:59:10.039549 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
> stderr: 2018-01-26 14:59:10.039744 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
> stderr: 2018-01-26 14:59:10.039925 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
> stderr: 2018-01-26 14:59:10.039984 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
> stderr: 2018-01-26 14:59:11.359951 7fd7ef951cc0 -1 key XXXXXXXX
> stderr: 2018-01-26 14:59:11.888476 7fd7ef951cc0 -1 created object store /var/lib/ceph/osd/ceph-0/ for osd.0 fsid efad7df8-721d-43d8-8d02-449406e70b90
> Running command: sudo ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 --path /var/lib/ceph/osd/ceph-0
> Running command: sudo ln -snf /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 /var/lib/ceph/osd/ceph-0/block
> Running command: chown -R ceph:ceph /dev/dm-4
> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
> Running command: sudo systemctl enable ceph-volume@lvm-0-138ce507-f28a-45bf-814c-7fa124a9d9b9
> stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-138ce507-f28a-45bf-814c-7fa124a9d9b9.service to /lib/systemd/system/ceph-volume@.service.
> Running command: sudo systemctl start ceph-osd@0
>
> ceph-osd.0.log shows:
>
> 2018-01-26 15:09:07.379039 7f545d3b9cc0 4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2859] Recovered from manifest file:db/MANIFEST-000095 succeeded,manifest_file_number is 95, next_file_number is 97, last_sequence is 21, log_number is 0,prev_log_number is 0,max_column_family is 0
>
> 2018-01-26 15:09:07.379046 7f545d3b9cc0 4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log number is 94
>
> 2018-01-26 15:09:07.379087 7f545d3b9cc0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516979347379083, "job": 1, "event": "recovery_started", "log_files": [96]}
> 2018-01-26 15:09:07.379091 7f545d3b9cc0 4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:482] Recovering log #96 mode 0
> 2018-01-26 15:09:07.379102 7f545d3b9cc0 4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2395] Creating manifest 98
>
> 2018-01-26 15:09:07.380466 7f545d3b9cc0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516979347380463, "job": 1, "event": "recovery_finished"}
> 2018-01-26 15:09:07.381331 7f545d3b9cc0 4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:1063] DB pointer 0x556ecb8c3000
> 2018-01-26 15:09:07.381353 7f545d3b9cc0 1 bluestore(/var/lib/ceph/osd/ceph-0) _open_db opened rocksdb path db options compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
> 2018-01-26 15:09:07.381616 7f545d3b9cc0 1 freelist init
> 2018-01-26 15:09:07.381660 7f545d3b9cc0 1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation metadata
> 2018-01-26 15:09:07.381679 7f545d3b9cc0 1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 447 G in 1 extents
> 2018-01-26 15:09:07.382077 7f545d3b9cc0 0 _get_class not permitted to load kvs
> 2018-01-26 15:09:07.382309 7f545d3b9cc0 0 <cls> /build/ceph-12.2.2/src/cls/cephfs/cls_cephfs.cc:197: loading cephfs
> 2018-01-26 15:09:07.382583 7f545d3b9cc0 0 _get_class not permitted to load sdk
> 2018-01-26 15:09:07.382827 7f545d3b9cc0 0 <cls> /build/ceph-12.2.2/src/cls/hello/cls_hello.cc:296: loading cls_hello
> 2018-01-26 15:09:07.385755 7f545d3b9cc0 0 _get_class not permitted to load lua
> 2018-01-26 15:09:07.386073 7f545d3b9cc0 0 osd.0 0 crush map has features 288232575208783872, adjusting msgr requires for clients
> 2018-01-26 15:09:07.386078 7f545d3b9cc0 0 osd.0 0 crush map has features 288232575208783872 was 8705, adjusting msgr requires for mons
> 2018-01-26 15:09:07.386079 7f545d3b9cc0 0 osd.0 0 crush map has features 288232575208783872, adjusting msgr requires for osds
> 2018-01-26 15:09:07.386132 7f545d3b9cc0 0 osd.0 0 load_pgs
> 2018-01-26 15:09:07.386134 7f545d3b9cc0 0 osd.0 0 load_pgs opened 0 pgs
> 2018-01-26 15:09:07.386137 7f545d3b9cc0 0 osd.0 0 using weightedpriority op queue with priority op cut off at 64.
> 2018-01-26 15:09:07.386580 7f545d3b9cc0 -1 osd.0 0 log_to_monitors {default=true}
> 2018-01-26 15:09:07.388077 7f545d3b9cc0 -1 osd.0 0 init authentication failed: (1) Operation not permitted
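That "init authentication failed: (1) Operation not permitted" is the new
OSD coming up with a freshly generated key while the cluster still holds the
old osd.0 auth entry. One way to see the mismatch, assuming the default
paths, is to compare:

ceph auth get osd.0
sudo cat /var/lib/ceph/osd/ceph-0/keyring

Removing the old entry (ceph auth del osd.0) before re-creating the OSD
avoids this.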
>
> The old osd is still there.
>
> # ceph osd tree
> ID CLASS WEIGHT  TYPE NAME     STATUS    REWEIGHT PRI-AFF
> -1       2.60458 root default
> -2       0.86819     host int1
>  0   ssd 0.43159         osd.0 destroyed        0 1.00000
>  3   ssd 0.43660         osd.3 up         1.00000 1.00000
> -3       0.86819     host int2
>  1   ssd 0.43159         osd.1 up         1.00000 1.00000
>  4   ssd 0.43660         osd.4 up         1.00000 1.00000
> -4       0.86819     host int3
>  2   ssd 0.43159         osd.2 up         1.00000 1.00000
>  5   ssd 0.43660         osd.5 up         1.00000 1.00000
>
> What's the best course of action? Purging osd.0, zapping the device again
> and creating without --osd-id set?
>
> Kind Regards,
>
> David Majchrzak

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com