Re: How to use ceph-volume to create multiple OSDs per NVMe disk, and with fixed WAL/DB partition on another device?

On Fri, Nov 06, 2020 at 10:15:52AM -0000, victorhooi@xxxxxxxxx wrote:
I'm building a new 4-node Proxmox/Ceph cluster to hold disk images for our VMs (Ceph version is 15.2.5).

Each node has 6 x NVMe SSDs (4TB), and 1 x Optane drive (960GB).

CPU is AMD Rome 7442, so there should be plenty of CPU capacity to spare.

My aim is to create 4 x OSDs per NVMe SSD (to make more effective use of the NVMe performance) and to use the Optane drive to store the WAL/DB partition for each OSD (i.e. a total of 24 x 35 GB WAL/DB partitions).

However, I am struggling to get the right ceph-volume command to achieve this.

Thanks to a very kind Redditor, I was able to get close:

/dev/nvme0n1 is an Optane device (900GB).

/dev/nvme2n1 is an Intel NVMe SSD (4TB).

```
# ceph-volume lvm batch --osds-per-device 4 /dev/nvme2n1 --db-devices /dev/nvme0n1

Total OSDs: 4

Solid State VG:
 Targets:   block.db                  Total size: 893.00 GB
 Total LVs: 16                        Size per LV: 223.25 GB
 Devices:   /dev/nvme0n1

 Type            Path                                                    LV Size         % of device
----------------------------------------------------------------------------------------------------
 [data]          /dev/nvme2n1                                            931.25 GB       25.0%
 [block.db]      vg: vg/lv                                               223.25 GB       25%
----------------------------------------------------------------------------------------------------
 [data]          /dev/nvme2n1                                            931.25 GB       25.0%
 [block.db]      vg: vg/lv                                               223.25 GB       25%
----------------------------------------------------------------------------------------------------
 [data]          /dev/nvme2n1                                            931.25 GB       25.0%
 [block.db]      vg: vg/lv                                               223.25 GB       25%
----------------------------------------------------------------------------------------------------
 [data]          /dev/nvme2n1                                            931.25 GB       25.0%
 [block.db]      vg: vg/lv                                               223.25 GB       25%
--> The above OSDs would be created if the operation continues
--> do you want to proceed? (yes/no)
```

This does split the NVMe disk into 4 OSDs and creates the WAL/DB partitions on the Optane drive; however, it creates 4 x 223 GB partitions on the Optane, whereas I want 35 GB partitions.

Is there any way to specify the WAL/DB partition size in the above?

And can it be done such that you can run successive ceph-volume commands to add the OSDs and WAL/DB partitions for each NVMe disk?

Is there a particular reason you want to run ceph-volume multiple times? The batch subcommand can handle this in one go, without the need to explicitly specify any sizes as another reply proposed (though that approach will also work nicely).

Something like this should get you there:

```
ceph-volume lvm batch --osds-per-device 4 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 --db-devices /dev/nvme0n1
```

This of course makes assumptions about the device names; adjust accordingly.
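
For completeness, if you do prefer to pin the DB size explicitly (as the other reply suggested), something along these lines should do it. This is only a sketch: the device names and the 35G value are assumptions based on your description, so check ceph-volume lvm batch --help on your release before running it.

```
# Explicit sizing: fix each block.db LV at 35 GB on the Optane device.
# Device names are assumptions -- substitute your actual NVMe devices.
ceph-volume lvm batch --osds-per-device 4 \
    --block-db-size 35G \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 \
    --db-devices /dev/nvme0n1
```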

Another option to size the volumes on the Optane drive would be to rely on the *slots arguments of the batch subcommand. See either ceph-volume lvm batch --help or https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/#implicit-sizing
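
As a rough sketch of the slot-based route (again, device names are assumptions; see the linked docs or --help for the exact option names in your release):

```
# Implicit sizing: divide the Optane into 24 equal block.db slots,
# one per OSD (6 data devices x 4 OSDs each), i.e. roughly 37 GB per LV on a 900 GB device.
ceph-volume lvm batch --osds-per-device 4 \
    --block-db-slots 24 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 \
    --db-devices /dev/nvme0n1
```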


(Or if there's an easier way to achieve the above layout, please let me know).

That being said - I also just saw this ceph-users thread:

https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/3Y6DEJCF7ZMXJL2NRLXUUEX76W7PPYXK/

That thread talks about "osd op num shards" and "osd op num threads per shard". Is there some way to set those to achieve performance similar to, say, 4 x OSDs per NVMe drive, but with only 1 x OSD per NVMe? Has anybody done any testing/benchmarking on this that they can share?
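
For reference, those appear to be ordinary OSD config options, so if anyone wants to experiment, something like the following should set them cluster-wide. The values here are purely illustrative, not a recommendation from that thread.

```
# Illustrative only -- the values are not a recommendation.
# There are also _ssd / _hdd specific variants of these options.
ceph config set osd osd_op_num_shards 16
ceph config set osd osd_op_num_threads_per_shard 2

# The OSDs need to be restarted to pick up the new sharding.
systemctl restart ceph-osd.target
```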
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
