On Fri, Nov 06, 2020 at 10:15:52AM -0000, victorhooi@xxxxxxxxx wrote:
I'm building a new 4-node Proxmox/Ceph cluster, to hold disk images for our VMs. (Ceph version is 15.2.5).
Each node has 6 x NVMe SSDs (4TB), and 1 x Optane drive (960GB).
CPU is AMD Rome 7442, so there should be plenty of CPU capacity to spare.
My aim is to create 4 x OSDs per NVMe SSD (to make more effective use of the NVMe performance) and use the Optane drive to store the WAL/DB partitions for each OSD (i.e. a total of 24 x 35 GB WAL/DB partitions).
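(24 x 35 GB = 840 GB, which should fit comfortably within the ~900 GB usable on the Optane.)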
However, I am struggling to get the right ceph-volume command to achieve this.
Thanks to a very kind Redditor, I was able to get close:
/dev/nvme0n1 is an Optane device (900GB).
/dev/nvme2n1 is an Intel NVMe SSD (4TB).
```
# ceph-volume lvm batch --osds-per-device 4 /dev/nvme2n1 --db-devices /dev/nvme0n1
Total OSDs: 4
Solid State VG:
Targets: block.db Total size: 893.00 GB
Total LVs: 16 Size per LV: 223.25 GB
Devices: /dev/nvme0n1
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
[data] /dev/nvme2n1 931.25 GB 25.0%
[block.db] vg: vg/lv 223.25 GB 25%
----------------------------------------------------------------------------------------------------
[data] /dev/nvme2n1 931.25 GB 25.0%
[block.db] vg: vg/lv 223.25 GB 25%
----------------------------------------------------------------------------------------------------
[data] /dev/nvme2n1 931.25 GB 25.0%
[block.db] vg: vg/lv 223.25 GB 25%
----------------------------------------------------------------------------------------------------
[data] /dev/nvme2n1 931.25 GB 25.0%
[block.db] vg: vg/lv 223.25 GB 25%
--> The above OSDs would be created if the operation continues
--> do you want to proceed? (yes/no)
```
This does split up the NVMe disk into 4 OSDs, and it creates WAL/DB partitions on the Optane drive - however, it creates 4 x 223 GB partitions on the Optane (whereas I want 35 GB partitions).
Is there any way to specify the WAL/DB partition size in the above?
And can it be done in such a way that you can run successive ceph-volume commands to add the OSDs and WAL/DB partitions for each NVMe disk?
Is there a particular reason you want to run ceph-volume multiple times? The
batch subcommand can handle it all in one go, without the need to explicitly
specify any sizes as another reply proposed (though that approach will work nicely too).
Something like this should get you there:
ceph-volume lvm batch --osds-per-device 4 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 --db-devices /dev/nvme0n1
This of course makes assumptions about the device names; adjust accordingly.
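If you do want to go device by device with an explicit DB size, something along
these lines should work (a sketch, untested; check ceph-volume lvm batch --help
on your release for the exact size format --block-db-size accepts):
```
# One data device at a time, with an explicit 35 GB DB volume per OSD (sketch):
ceph-volume lvm batch --osds-per-device 4 --block-db-size 35G \
    /dev/nvme2n1 --db-devices /dev/nvme0n1
```
Appending --report first prints the planned layout without creating anything.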
Another option to size the volumes on the Optane drive would be to rely on the
*slots arguments of the batch subcommand. See either ceph-volume lvm batch --help
or https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/#implicit-sizing
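With the slots approach the whole node could be done in one invocation, roughly
like this (again a sketch with assumed device names; --block-db-slots divides
each DB device into that many equally sized slots):
```
# 6 data devices x 4 OSDs = 24 OSDs, so ask for 24 DB slots on the Optane (sketch):
ceph-volume lvm batch --osds-per-device 4 --block-db-slots 24 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 \
    --db-devices /dev/nvme0n1
```
24 slots on a ~893 GB device works out to roughly 37 GB per DB volume, close to
the 35 GB you were aiming for.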
(Or if there's an easier way to achieve the above layout, please let me know).
That being said - I also just saw this ceph-users thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/3Y6DEJCF7ZMXJL2NRLXUUEX76W7PPYXK/
It talks there about "osd op num shards" and "osd op num threads per shard" - is there some way to set those to achieve performance similar to, say, 4 x OSDs per NVMe drive, but with only 1 x OSD per NVMe? Has anybody done any testing/benchmarking on this they can share?
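From what I can tell those would be set via ceph config, something like the sketch
below (the values are just placeholders, not recommendations, and as far as I know
the OSDs need a restart to pick them up):
```
# Sketch only - placeholder values, tune based on your own benchmarks:
ceph config set osd osd_op_num_shards_ssd 16
ceph config set osd osd_op_num_threads_per_shard_ssd 2
```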
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx