The Ceph version is 17.2.7.

• OSDs are a mix of SSD and HDD, with DB/WAL colocated on the same OSDs.
• SSDs are used for the metadata and index pools with replication 3.
• HDDs store the data pool using EC 4+2.

Interestingly, the same issue has appeared on another cluster where DB/WAL is placed on NVMe disks, but the pool distribution is the same: meta and index on SSDs, and data on HDDs.

It seems to be network-related: I've checked the interfaces and found no obvious hardware or connectivity issues, yet we're still seeing a high number of retransmissions and duplicate packets on the network.

Let me know if you have any insights or suggestions.

On Mon, Mar 3, 2025 at 12:36 Stefan Kooman <stefan@xxxxxx> wrote:

> On 01-03-2025 15:10, Ramin Najjarbashi wrote:
> > Hi
> >
> > We are currently facing severe latency issues in our Ceph cluster,
> > particularly affecting read and write operations. At times, write
> > operations completely stall, leading to significant service degradation.
> > Below is a detailed breakdown of the issue, our observations, and the
> > mitigation steps we have taken so far. We would greatly appreciate any
> > insights or suggestions.
>
> What ceph version?
>
> How are OSDs provisioned (WAL+DB, single OSD, etc.). Type of disks.
>
> Gr. Stefan

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
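As a side note on quantifying the retransmissions mentioned in the reply: one rough per-host check (a sketch, not anything Ceph-specific) is to compute RetransSegs/OutSegs from the Linux kernel's TCP counters in /proc/net/snmp. The counters are cumulative since boot, so the absolute ratio is only indicative; sampling twice and diffing gives the current rate.

```shell
#!/bin/sh
# Rough TCP retransmission check for an OSD host (Linux /proc interface).
# RetransSegs/OutSegs gives a cumulative-since-boot retransmit ratio;
# sample twice over an interval and diff the counters to see the live rate.
awk '
  /^Tcp:/ {
    if (!hdr) {            # first "Tcp:" line holds the column names
      for (i = 1; i <= NF; i++) col[$i] = i
      hdr = 1
    } else {               # second "Tcp:" line holds the values
      out = $(col["OutSegs"])
      ret = $(col["RetransSegs"])
    }
  }
  END {
    printf "OutSegs=%d RetransSegs=%d retrans_ratio=%.4f%%\n",
           out, ret, (out > 0 ? 100 * ret / out : 0)
  }' /proc/net/snmp
```

Running this on each OSD host (public and cluster network interfaces alike) helps tell whether the retransmissions correlate with the hosts showing slow ops, or are uniform across the fleet.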