Re: Ceph All-SSD Cluster & WAL/DB Separation


Sent prematurely.

I meant to add that after ~3 years of service, the 1 DWPD drives in the clusters I mentioned had mostly burned less than 10% of their rated endurance.

Required endurance is in part a function of how long you expect the drives to last.
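To put rough numbers on the "a few TB of writes per day" point quoted below, here is a back-of-the-envelope sketch for this specific cluster. The write-amplification factor and warranty assumptions are illustrative only; the real BlueStore amplification depends heavily on I/O size and workload:

```python
# Back-of-the-envelope endurance budget for a 40 x 1.8 TB, 1 DWPD cluster.
# All figures are illustrative assumptions, not measurements.

num_drives = 40
drive_tb = 1.8          # Samsung PM1643 1.8 TB
dwpd = 1.0              # rated drive writes per day
replication = 3         # 3-fold replication
bluestore_wa = 2.0      # ASSUMED BlueStore write amplification (workload-dependent)

# Raw device-level write budget across the whole cluster, per day:
raw_budget_tb = num_drives * drive_tb * dwpd    # 72 TB/day at the device level

# Client-visible write budget after replication and write amplification:
client_budget_tb = raw_budget_tb / (replication * bluestore_wa)

print(f"~{client_budget_tb:.0f} TB/day of client writes")   # ~12 TB/day
```

Staying well under that figure is what leaves endurance headroom like the <10% burn described above.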

>> Having said that, for a storage cluster where write performance is expected to be the main bottleneck, I would be hesitant to use drives with only 1 DWPD of endurance, since Ceph has fairly high write amplification factors. With 3-fold replication, this cluster might only be able to handle a few TB of client writes per day without wearing out the drives prematurely.
> 
>> 
>>> Hi Experts,
>>> 
>>> I am trying to find out whether significant write-performance improvements are achievable by separating the WAL/DB in a Ceph cluster whose OSDs are all SSDs. I have a cluster with 40 SSDs (Samsung PM1643 1.8 TB enterprise SSDs): 10 storage nodes, each with 4 OSDs. Can I get better write IOPS and throughput if I add one NVMe drive per node and place the WAL/DB on it? Would this separation yield a meaningful performance improvement?
>>> 
>>> My Ceph cluster is the block-storage back-end for OpenStack Cinder in a public cloud service.
> 
> 
> My two cents:
> 
> * IMHO the performance delta from an external WAL+DB is going to be limited.  An NVMe WAL+DB would deliver lower write latency up to a point, but throughput is still going to be limited by the SAS HBA and the bulk OSD drives.  You also take on the hassle of managing OSDs that span devices: when replacing a failed OSD, correctly handling the shared device can be tricky.  And with your fairly small number of nodes and drives, the blast radius of one NVMe device failing (taking down every OSD whose WAL/DB lives on it) would be large.
> 
> * Do you have the libvirt / librbd client-side cache disabled?
> 
> * I’ve run 3R clusters in a similar role, backing libvirt / librbd clients and using SATA SSDs.  We were mostly able to sustain an average write latency <= 5 ms, though a couple of times we had to expand a cluster for IOPS before we needed the capacity.  The crappy HBAs in use were part of the bottleneck.  This sort of thing is one of the inputs to the SNIA TCO calculator.
> 
> 
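For reference, if the original poster does go this route, the DB (and, implicitly, the WAL) is placed on the NVMe device at OSD creation time with ceph-volume. A sketch only; the device paths below are placeholders for your actual hardware, and you would carve one partition or LV per OSD out of the shared NVMe device:

```shell
# Sketch: create a BlueStore OSD with its data on a SAS SSD and its
# DB on an NVMe partition. Device paths are placeholders.
# When only --block.db is given, ceph-volume co-locates the WAL
# inside the DB device; a separate --block.wal is rarely needed.
ceph-volume lvm create --bluestore \
    --data /dev/sdb \
    --block.db /dev/nvme0n1p1
```

Note that this shared-device layout is exactly what makes the failed-OSD replacement mentioned above tricky.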

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
