NVMe disk - size

Kristof Coucke <kristof.coucke@xxxxxxxxx> · Fri, 15 Nov 2019 15:16:27 +0100

Hi all,

We’ve configured a Ceph cluster with 10 nodes, each having 13 large disks (14TB) and 2 NVMe disks (1,6TB).
The idea was to use the NVMe as “fast device”…
The recommendations I’ve read in the online documentation, state that the db block device should be around 4%~5% of the slow device. So, the block.db should be somewhere between 600GB and 700GB as a best practice.
However… I was thinking to only reserve 200GB per OSD as fast device… Which is 1/3 of the recommendation…

I’ve tested in the labs, and it does work fine with even very small devices (the spillover does its job).
Though, before taking the system in to production, I would like to verify that no issues arise.

Is it recommended to still use it as a block.db, or is it recommended to only use it as a WAL device?
Should I just split the NVMe in three and only configure 3 OSDs to use the system? (This would mean that the performace shall be degraded to the speed of the slowest device…)

The initial cluster is +1PB and we’re planning to expand it again with 1PB in the near future to migrate our data.
We’ll only use the system thru the RGW (No CephFS, nor block device), and we’ll store “a lot” of small files on it… (Millions of files a day)

The reason I’m asking it, is that I’ve been able to break the test system (long story), causing OSDs to fail as they ran out of space… Expanding the disks (the block DB device as well as the main block device) failed with the ceph-bluestore-tool…

Thanks for your answer!

Kristof
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com