Hi Len,

Indeed, this is not possible with ceph-ansible. One option would be to do it manually with `ceph-volume lvm migrate`.
(Note that this can be tedious given that it requires a lot of manual operations, especially on clusters with a large number of OSDs.)

Initial setup:

```
# cat group_vars/all
---
devices:
  - /dev/sdb
dedicated_devices:
  - /dev/sda
```

```
[root@osd0 ~]# lsblk
NAME                                                                                                   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda                                                                                                      8:0    0  50G  0 disk
`-ceph--8d085f45--939c--4a65--a577--d21fa146d7d6-osd--db--cd34400d--daf2--450f--97d9--d561e7a43d1a     252:1    0  50G  0 lvm
sdb                                                                                                      8:16   0  50G  0 disk
`-ceph--4c77295c--28a5--440a--9561--b9dc4c814e36-osd--block--70fd3b96--7bb2--4ae3--a0f8--4d18748186f9  252:0    0  50G  0 lvm
sdc                                                                                                      8:32   0  50G  0 disk
sdd                                                                                                      8:48   0  50G  0 disk
vda                                                                                                    253:0    0  11G  0 disk
`-vda1                                                                                                 253:1    0  10G  0 part /
```

```
[root@osd0 ~]# lvs
  LV                                             VG                                         Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-70fd3b96-7bb2-4ae3-a0f8-4d18748186f9 ceph-4c77295c-28a5-440a-9561-b9dc4c814e36  -wi-ao---- <50.00g
  osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a    ceph-8d085f45-939c-4a65-a577-d21fa146d7d6  -wi-ao---- <50.00g
[root@osd0 ~]# vgs
  VG                                         #PV #LV #SN Attr   VSize   VFree
  ceph-4c77295c-28a5-440a-9561-b9dc4c814e36    1   1   0 wz--n- <50.00g    0
  ceph-8d085f45-939c-4a65-a577-d21fa146d7d6    1   1   0 wz--n- <50.00g    0
```

Create a tmp LV on your new device:

```
[root@osd0 ~]# pvcreate /dev/sdd
  Physical volume "/dev/sdd" successfully created.
[root@osd0 ~]# vgcreate vg_db_tmp /dev/sdd
  Volume group "vg_db_tmp" successfully created
[root@osd0 ~]# lvcreate -n db-sdb -l 100%FREE vg_db_tmp
  Logical volume "db-sdb" created.
```

Stop your OSD:

```
[root@osd0 ~]# systemctl stop ceph-osd@0
```

Migrate the DB to the tmp LV:

```
[root@osd0 ~]# ceph-volume lvm migrate --osd-id 0 --osd-fsid 70fd3b96-7bb2-4ae3-a0f8-4d18748186f9 --from db --target vg_db_tmp/db-sdb
--> Migrate to new, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-0/block.db'] Target: /dev/vg_db_tmp/db-sdb
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block.db
Running command: /bin/chown -R ceph:ceph /dev/dm-2
--> Migration successful.
```

Remove the old LV:

```
[root@osd0 ~]# lvremove /dev/ceph-8d085f45-939c-4a65-a577-d21fa146d7d6/osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a
Do you really want to remove active logical volume ceph-8d085f45-939c-4a65-a577-d21fa146d7d6/osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a? [y/n]: y
  Logical volume "osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a" successfully removed.
```

Recreate a smaller LV. In my simplified case, I want to go from 1 to 2 DB devices, which means the old LV has to be resized down to 1/2 of the device:

```
[root@osd0 ~]# lvcreate -n osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a -l 50%FREE ceph-8d085f45-939c-4a65-a577-d21fa146d7d6
  Logical volume "osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a" created.
```

Migrate the DB to the new LV:

```
[root@osd0 ~]# ceph-volume lvm migrate --osd-id 0 --osd-fsid 70fd3b96-7bb2-4ae3-a0f8-4d18748186f9 --from db --target ceph-8d085f45-939c-4a65-a577-d21fa146d7d6/osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a
--> Migrate to new, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-0/block.db'] Target: /dev/ceph-8d085f45-939c-4a65-a577-d21fa146d7d6/osd-db-cd34400d-daf2-450f-97d9-d561e7a43d1a
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block.db
Running command: /bin/chown -R ceph:ceph /dev/dm-1
--> Migration successful.
```
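Optionally, before restarting the OSD, you can sanity-check that osd.0 is now associated with the recreated LV. This is just a quick sketch based on the example above; `ceph-volume lvm list` reports the db device for each OSD from the LV tags:

```
[root@osd0 ~]# ceph-volume lvm list                       # check the [db] entry reported for osd.0
[root@osd0 ~]# ls -l /var/lib/ceph/osd/ceph-0/block.db    # the block.db symlink in the OSD directory
```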
Restart the OSD:

```
[root@osd0 ~]# systemctl start ceph-osd@0
```

Remove the tmp LV/VG/PV:

```
[root@osd0 ~]# lvremove /dev/vg_db_tmp/db-sdb
Do you really want to remove active logical volume vg_db_tmp/db-sdb? [y/n]: y
[root@osd0 ~]# vgremove vg_db_tmp
  Volume group "vg_db_tmp" successfully removed
[root@osd0 ~]# pvremove /dev/sdd
  Labels on physical volume "/dev/sdd" successfully wiped.
```

Add the new OSD (this should be done by re-running the playbook):

```
[root@osd0 ~]# ceph-volume lvm batch --bluestore --yes /dev/sdb /dev/sdc --db-devices /dev/sda
--> passed data devices: 2 physical, 0 LVM
--> relative data size: 1.0
--> passed block_db devices: 1 physical, 0 LVM
... omitted output ...
--> ceph-volume lvm activate successful for osd ID: 1
--> ceph-volume lvm create successful for: /dev/sdc
[root@osd0 ~]#
```

New lsblk output:

```
[root@osd0 ~]# lsblk
NAME                                                                                                   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda                                                                                                      8:0    0  50G  0 disk
|-ceph--8d085f45--939c--4a65--a577--d21fa146d7d6-osd--db--cd34400d--daf2--450f--97d9--d561e7a43d1a     252:0    0  25G  0 lvm
`-ceph--8d085f45--939c--4a65--a577--d21fa146d7d6-osd--db--bb30e5aa--a634--4c52--8b99--a222c03c18e3     252:3    0  25G  0 lvm
sdb                                                                                                      8:16   0  50G  0 disk
`-ceph--4c77295c--28a5--440a--9561--b9dc4c814e36-osd--block--70fd3b96--7bb2--4ae3--a0f8--4d18748186f9  252:1    0  50G  0 lvm
sdc                                                                                                      8:32   0  50G  0 disk
`-ceph--5255bfbb--f133--4954--aaa8--35e2643ed491-osd--block--9e67ea46--2409--45f8--83e1--f66a42a6d9d0  252:2    0  50G  0 lvm
sdd                                                                                                      8:48   0  50G  0 disk
vda                                                                                                    253:0    0  11G  0 disk
`-vda1                                                                                                 253:1    0  10G  0 part /
```

If you plan to re-run the playbook, do not forget to update your group_vars to reflect the new topology:

```
# cat group_vars/all
---
devices:
  - /dev/sdb
  - /dev/sdc
dedicated_devices:
  - /dev/sda
```

You might want to set some OSD flags (noout, etc.) in order to avoid unnecessary data migration while the OSD is down.
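For example (a quick sketch; these are the standard cluster-wide flag commands and can be run from any node with an admin keyring):

```
[root@osd0 ~]# ceph osd set noout      # set before stopping the OSD so it is not marked out while down
[root@osd0 ~]# # ... run the migration steps above ...
[root@osd0 ~]# ceph osd unset noout    # unset once the OSD is back up and the cluster is healthy again
```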
Regards,

On Tue, 17 Jan 2023 at 18:39, Len Kimms <len.kimms@xxxxxxxxxxxxxxx> wrote:

> Hello all,
>
> we’ve set up a new Ceph cluster with a number of nodes which are all
> identically configured.
> There is one device vda which should act as WAL device for all other
> devices. Additionally, there are four other devices vdb, vdc, vdd, vde
> which use vda as WAL.
> The whole cluster was set up using ceph-ansible (branch stable-7.0) and
> Ceph version 17.2.0.
> Device configuration in osds.yml looks as follows:
> devices: [/dev/vdb, /dev/vdc, /dev/vdd, /dev/vde]
> bluestore_wal_devices: [/dev/vda]
> As expected vda contains four logical volumes for WAL each 1/4 of the
> overall vda disk size (‘ceph-ansible/group_vars/all.yml’ has default
> ‘block_db_size: -1’).
>
> After the initial setup, we’ve added an additional device vdf which should
> become a new OSD. The new OSD should use vda for WAL as well. This means
> the previous four WAL LVs have to be resized down to 1/5 and a new LV has
> to be added.
>
> Is it possible to retroactively add a new device to an already provisioned
> WAL device?
>
> We suspect that this is not possible because the ceph-bluestore-tool does
> not provide any way to shrink an existing BlueFS device. Only expanding is
> currently possible (https://docs.ceph.com/en/quincy/man/8/ceph-bluestore-tool/).
> Simply adding the new device to the devices list and rerunning the
> playbook does nothing. And so does only setting “devices: [/dev/vdf]” and
> “bluestore_wal_devices: [/dev/vda]”. In both cases vda is rejected because
> “Insufficient space (<10 extents) on vgs” which makes sense because vda is
> already fully used by the previous four OSD WALs.
>
> Thanks for the help and kind regards.
>
> Additional notes:
> - We’re testing pre-production on an emulated cluster hence the device
>   names vdx and unusually small device sizes.
> - The output of `lsblk` after the initial setup looks as follows:
> ```
> vda                                                                                                    252:0    0   8G  0 disk
> ├─ceph--36607c7f--e51c--452e--a44a--225d8d0b0aa8-osd--wal--3677c354--8d7d--4db9--a2b7--68aeb8248d40    253:2    0   2G  0 lvm
> ├─ceph--36607c7f--e51c--452e--a44a--225d8d0b0aa8-osd--wal--52d71122--b573--4077--9633--968c178612fd    253:4    0   2G  0 lvm
> ├─ceph--36607c7f--e51c--452e--a44a--225d8d0b0aa8-osd--wal--2d7eb467--cfb1--4a00--8a45--273932036599    253:6    0   2G  0 lvm
> └─ceph--36607c7f--e51c--452e--a44a--225d8d0b0aa8-osd--wal--d7b13b79--219c--4002--9e92--370dff7a5376    253:8    0   2G  0 lvm
> vdb                                                                                                    252:16   0   8G  0 disk
> └─ceph--49ddaa8b--5d8f--4267--85f9--5cac608ce53d-osd--block--861a53c7--ee57--4c5f--9546--1dd7cb0185ef  253:1    0   8G  0 lvm
> vdc                                                                                                    252:32   0   5G  0 disk
> └─ceph--1ed9ee91--e071--4ea4--9703--d56d84d9ae0a-osd--block--8aacb66a--e29b--4b7a--8ad5--a9fb1f81c6d6  253:3    0   5G  0 lvm
> vdd                                                                                                    252:48   0   5G  0 disk
> └─ceph--554cdd8b--e722--41a9--8f64--c09c857cc0dc-osd--block--4dee3e1b--b50d--4154--b2ff--80cadb67e2a0  253:5    0   5G  0 lvm
> vde                                                                                                    252:64   0   5G  0 disk
> └─ceph--5d58de32--ca55--4895--8ac7--af94ee07672e-osd--block--3f563f40--0c1e--4cca--9325--d9534cceb711  253:7    0   5G  0 lvm
> vdf                                                                                                    252:80   0   5G  0 disk
> ```
> - Ceph status is happy and healthy:
> ```
>   cluster:
>     id:     ff043ce8-xxxx-xxxx-xxxx-e98d073c9d09
>     health: HEALTH_WARN
>             mons are allowing insecure global_id reclaim
>
>   services:
>     mon: 3 daemons, quorum baloo-1,baloo-2,baloo-3 (age 13m)
>     mgr: baloo-2(active, since 5m), standbys: baloo-3, baloo-1
>     mds: 1/1 daemons up, 1 standby
>     osd: 24 osds: 24 up (since 4m), 24 in (since 5m)
>     rgw: 1 daemon active (1 hosts, 1 zones)
>
>   data:
>     volumes: 1/1 healthy
>     pools:   7 pools, 177 pgs
>     objects: 213 objects, 584 KiB
>     usage:   98 MiB used, 138 GiB / 138 GiB avail
>     pgs:     177 active+clean
> ```

--
Guillaume Abrioux
Senior Software Engineer