Hi community,

I'm currently facing a significant issue with my Ceph cluster. The cluster consists of 10 nodes, each equipped with 6 x 960GB SSDs used for block.db and 18 x 12TB HDDs used for data, with 2x10Gbps bonded networking for the public and cluster networks. I am using a 4+2 erasure-coded pool for RBD.

When one node becomes unavailable, the cluster starts recovery, and soon afterwards slow ops are logged against the disks, impacting the entire cluster. After that, additional nodes are marked as failed. Could this be caused by the performance of the SSDs and HDDs? When I check disk I/O with iostat, disk utilization reaches 80-90%.

Is combining HDDs with SAS SSDs in Ceph a choice that leads to poor performance? My Ceph cluster has a bandwidth of about 1.9GB/s.

Thanks, and I hope someone can help me.
----------------------------------------------------------------------------
Email: tranphong079@xxxxxxxxx
Skype: tranphong079
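
P.S. In case it is useful, this is roughly how I have been collecting the numbers above (osd.12 is just a placeholder for one of the OSDs that reported slow ops, not a specific daemon):

    # overall health and which OSDs report slow ops
    ceph -s
    ceph health detail

    # per-disk utilization and latency, extended stats every 5 seconds
    iostat -x 5

    # inspect in-flight and historic slow ops on a suspect OSD
    ceph daemon osd.12 dump_ops_in_flight
    ceph daemon osd.12 dump_historic_slow_ops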
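
P.P.S. To reduce the impact of recovery on client I/O I am considering the throttles below. This is only a sketch of what I plan to try, and it assumes these options take effect on my release; I understand that with the mClock scheduler (Quincy and later) the profile may take precedence over the individual recovery options, so please correct me if this is the wrong direction:

    # limit backfill/recovery concurrency per OSD
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1

    # add a small sleep between recovery ops on HDD-backed OSDs
    ceph config set osd osd_recovery_sleep_hdd 0.1

    # with the mClock scheduler, prefer client I/O over recovery
    ceph config set osd osd_mclock_profile high_client_ops

Would it also be advisable to temporarily set the nodown flag (ceph osd set nodown) while overloaded OSDs are flapping during recovery, and unset it afterwards?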