This thread is off in left field and needs to be
brought back to how things work.
While multiple OSDs can use the same device for db/wal
partitions, each OSD needs its own partition: osd.0 could
use nvme0n1p1, osd.2 could use nvme0n1p2, and so on. You
cannot use the same partition for more than one OSD.
Ceph-volume will not create the db/wal partitions for you;
you need to create them yourself before handing them to the
OSD. There is also no need to put a filesystem on top of the
wal/db partition. That is wasted overhead that will only
slow things down.
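As a sketch of what that looks like (using the same devices
as the example further down), each data disk gets paired
with its own raw db partition:

    /dev/sdb (data) + /dev/nvme0n1p2 (block.db) -> one OSD
    /dev/sdc (data) + /dev/nvme0n1p3 (block.db) -> another OSD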
Back to the original email.
> Or do I need to use osd-db=/dev/nvme0n1p2 for data=...,
> osd-db=/dev/nvme0n1p3 for data=... and so on?
This is what you need to do, but as noted above, you need to
create the partitions for --block-db yourself. You talked
about having a 10GB partition for this, but the general
recommendation for block.db partitions is 10GB per 1TB of
OSD. If your OSD is a 4TB disk, you should be looking closer
to a 40GB block.db partition. If the block.db partition is
too small, then once it fills up the metadata will spill
over onto the data volume and slow things down.
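If you want to size that per disk rather than eyeball it,
here is a rough sketch assuming the 10GB-per-1TB rule (about
1% of the data device) and the same /dev/sdb and /dev/sdc
disks used below:

for hdd in /dev/sd{b..c}; do
    bytes=$(blockdev --getsize64 $hdd)
    echo "$hdd: ~$(( bytes / 100 / 1024 / 1024 / 1024 )) GiB suggested for block.db"
done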
> And just to make sure - if I specify "--osd-db", I don't need
> to set "--osd-wal" as well, since the WAL will end up on the
> DB partition automatically, correct?
This is correct. The wal will automatically be placed on
the db if not otherwise specified.
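If you want to verify that after the OSDs exist, ceph-volume
can show where everything ended up; if no separate wal device
is listed for an OSD, the wal is living on the db:

ceph-volume lvm list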
I don't use ceph-deploy, but the process for creating the
OSDs should be something like this. After the OSDs are
created, it is a good idea to make sure that each OSD is not
referring to its db partition by its /dev/nvme0n1p2 style
device name, as those names can change across reboots if you
have multiple NVMe devices.
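For reference, the stable identifier to use is the PARTUUID
of the db partition; you can look it up with blkid (the
partition name here is just an example):

blkid -s PARTUUID -o value /dev/nvme0n1p2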
# Make sure the disks are clean and ready to use as an OSD
for hdd in /dev/sd{b..c}; do
    ceph-volume lvm zap $hdd --destroy
done
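# (Optional) sanity check that the disks are now empty before continuing
lsblk /dev/sd{b..c}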
# Create the nvme db partitions (assuming a 10G db for a 1TB OSD)
for partition in {2..3}; do
    sgdisk -n ${partition}:0:+10G -c ${partition}:'ceph db' /dev/nvme0n1
done
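# (Optional) print the partition table to confirm the new 'ceph db' partitions
sgdisk -p /dev/nvme0n1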
# Create the OSD
echo "/dev/sdb /dev/nvme0n1p2
/dev/sdc /dev/nvme0n1p3" | while read hdd db; do
    ceph-volume lvm create --bluestore --data $hdd --block.db $db
done
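# (Optional) confirm the new OSDs exist and are up before changing symlinks
ceph osd tree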
# Fix the OSDs to look for the block.db partition by UUID instead of
# its device name.
for db in /var/lib/ceph/osd/*/block.db; do
    dev=$(readlink $db | grep -Eo 'nvme[[:digit:]]+n[[:digit:]]+p[[:digit:]]+' || echo false)
    if [[ "$dev" != false ]]; then
        uuid=$(ls -l /dev/disk/by-partuuid/ | awk '/'${dev}'$/ {print $9}')
        ln -sf /dev/disk/by-partuuid/$uuid $db
    fi
done
systemctl restart ceph-osd.target
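# Confirm the block.db symlinks now resolve via by-partuuid and the OSDs came back up
ls -l /var/lib/ceph/osd/*/block.db
ceph -s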