> Having said that, for a storage cluster where write performance is expected to be the main bottleneck, I would be hesitant to use drives that only have 1 DWPD endurance, since Ceph has fairly high write amplification factors. If you use 3-fold replication, this cluster might only be able to handle a few TB of writes per day without wearing out the drives prematurely.
>
>> Hi Experts,
>> I would like to know whether separating the WAL/DB brings a significant write performance improvement in a Ceph cluster with all-SSD OSDs. I have a cluster with 40 SSDs (Samsung PM1643 1.8 TB enterprise SSDs): 10 storage nodes, each with 4 OSDs. Can I get better write IOPS and throughput if I add one NVMe drive per node and put the WAL/DB on it? Would this separation give a meaningful performance improvement or not?
>> My Ceph cluster is the block storage back end for OpenStack Cinder in a public cloud service.

My zwei pfennig:

* IMHO the performance delta with external WAL+DB is going to be limited. NVMe WAL+DB would deliver lower write latency up to a point, but throughput is still going to be limited by the SAS HBA / bulk OSD drives. You also have the hassle of managing OSDs that span devices: when replacing a failed OSD, properly handling the shared device can be tricky. With your very small number of nodes and drives, the blast radius of one failing would be really large.

* Do you have the libvirt / librbd client-side cache disabled?

* I’ve run 3R clusters in a similar role, backing libvirt / librbd clients and using SATA SSDs. We were mostly able to sustain an average write latency <= 5 ms, though a couple of times we had to expand a cluster for IOPS before capacity. The crappy HBAs in use were part of the bottleneck.

This sort of thing is one of the inputs to the SNIA TCO calculator.
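
For what it's worth, the endurance concern quoted at the top can be put into rough numbers. What follows is only a back-of-envelope sketch: the drive count (40), capacity (1.8 TB), 1 DWPD rating and 3-fold replication come from the thread, while the function name and the write amplification values are assumptions picked for illustration, not measurements from any real cluster.

```python
def sustainable_client_writes_tb_per_day(
    num_drives: int,
    drive_capacity_tb: float,
    dwpd: float,
    replication: int,
    write_amplification: float,
) -> float:
    """Estimate the client write volume (TB/day) that stays within rated endurance.

    Hypothetical helper for illustration only. It assumes writes are spread
    evenly across all drives and that write amplification is one constant factor.
    """
    if replication < 1 or write_amplification <= 0:
        raise ValueError("replication must be >= 1 and write_amplification > 0")
    # Total bytes the drives may absorb per day at their rated endurance.
    raw_budget_tb = num_drives * drive_capacity_tb * dwpd
    # Each client write lands on `replication` OSDs and is amplified on each of them.
    return raw_budget_tb / (replication * write_amplification)


# Figures from the thread: 40 drives, 1.8 TB each, 1 DWPD, 3-fold replication.
# The amplification factors below are assumed values, not measured ones.
for assumed_waf in (2.0, 4.0, 8.0):
    budget = sustainable_client_writes_tb_per_day(40, 1.8, 1.0, 3, assumed_waf)
    print(f"assumed write amplification {assumed_waf:>3}: ~{budget:.1f} TB/day of client writes")
```

With an assumed amplification factor of 4, the estimate comes out at roughly 6 TB of client writes per day for the whole cluster, which is in line with the "few TB per day" figure quoted above. The real factor depends heavily on I/O size and workload, so treat the result as an order-of-magnitude check only.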