> it's not Ceph but the network

It's almost always the network ;-)

Ramin: This reminds me of an outage we had at CERN caused by routing /
ECMP / a faulty line card.

One of the main symptoms of that is high TCP retransmits on the Ceph
nodes. Basically, OSDs keep many connections open with each other, with
different src/dst port combinations. If your cluster has OSD hosts
connected across routers, then you're likely using ECMP, and each
connection's src/dst ip/port combination takes a different path
(different routers, different line cards). Then what happens is that if
one line card is faulty -- which is often difficult to alert on -- some
of the connections will work, but some will not. This is visible in the
host retransmit counters, and it causes OSDs to flap up and down or
other badness.

One quick way to diagnose whether this is the root cause here is to use
netcat to try to connect between two Ceph hosts using a range of source
ports. E.g., assuming you can ssh from one OSD host to another, do this
from one Ceph host:

echo {20000..20050} | xargs -t -n1 -I{} nc -z -p {} <other ceph osd> 22

If all your network paths are okay, you'll get something like the output
in the PS. If some paths are broken, you'll get errors!

Hope that helps.

-- dan

PS:

bash-5.2$ echo {20000..20050} | xargs -t -n1 -I{} nc -z -p {} 192.168.1.248 22
nc -z -p 20000 192.168.1.248 22
Connection to 192.168.1.248 port 22 [tcp/ssh] succeeded!
nc -z -p 20001 192.168.1.248 22
Connection to 192.168.1.248 port 22 [tcp/ssh] succeeded!
nc -z -p 20002 192.168.1.248 22
Connection to 192.168.1.248 port 22 [tcp/ssh] succeeded!
nc -z -p 20003 192.168.1.248 22
Connection to 192.168.1.248 port 22 [tcp/ssh] succeeded!
nc -z -p 20004 192.168.1.248 22
Connection to 192.168.1.248 port 22 [tcp/ssh] succeeded!
nc -z -p 20005 192.168.1.248 22
Connection to 192.168.1.248 port 22 [tcp/ssh] succeeded!
...

--
Dan van der Ster
Ceph Executive Council | CTO @ CLYSO
Try our Ceph Analyzer -- https://analyzer.clyso.com/
https://clyso.com | dan.vanderster@xxxxxxxxx


On Tue, Mar 4, 2025 at 12:08 AM Eugen Block <eblock@xxxxxx> wrote:
>
> A few years ago, one of our customers complained about latency issues.
> We investigated, and the only real evidence we found was also high
> retransmit values, so we recommended that they have their network team
> look into it. For months they refused to do anything, until they hired
> another company to investigate the network. It was a network issue;
> basically all the cabling was replaced. I don't recall anymore whether
> switches and other components were replaced as well, but it was
> definitely resolved after that. So if you ask me, I'd say it's not Ceph
> but the network. ;-)
>
> Zitat von Ramin Najjarbashi <ramin.najarbashi@xxxxxxxxx>:
>
> > The Ceph version is 17.2.7.
> >
> > • OSDs are a mix of SSD and HDD, with DB/WAL colocated on the same OSDs.
> > • SSDs are used for the metadata and index pools with replication 3.
> > • HDDs store the data pool using EC 4+2.
> >
> > Interestingly, the same issue has appeared on another cluster where the
> > DB/WAL is placed on NVMe disks, but the pool distribution is the same:
> > meta and index on SSDs, and data on HDDs.
> >
> > It seems to be network-related: I've checked the interfaces, and there
> > are no obvious hardware or connectivity issues. However, we're still
> > seeing a high number of retransmissions and duplicate packets on the
> > network.
> >
> > Let me know if you have any insights or suggestions.
> >
> > On Mon, Mar 3, 2025 at 12:36 Stefan Kooman <stefan@xxxxxx> wrote:
> >
> >> On 01-03-2025 15:10, Ramin Najjarbashi wrote:
> >> > Hi
> >> > We are currently facing severe latency issues in our Ceph cluster,
> >> > particularly affecting read and write operations. At times, write
> >> > operations completely stall, leading to significant service degradation.
> >> > Below is a detailed breakdown of the issue, our observations, and the
> >> > mitigation steps we have taken so far. We would greatly appreciate any
> >> > insights or suggestions.
> >>
> >> What Ceph version?
> >>
> >> How are the OSDs provisioned (WAL+DB, single OSD, etc.)? What type of disks?
> >>
> >> Gr. Stefan

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
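
A slightly more automated variant of Dan's source-port sweep, as a minimal
sketch (not from the thread above): it loops over a list of peer OSD hosts
-- assumed here to live in a hypothetical ./osd_hosts.txt file, one host per
line -- and prints only the src-port/host combinations that fail, which is
handier when you need to check many ECMP paths at once. It probes port 22
(sshd) as in Dan's example; substitute any TCP port known to be listening
on the peers.

#!/usr/bin/env bash
# Sketch: sweep a range of source ports against every peer OSD host and
# report the combinations that fail. With ECMP, each src/dst port pair can
# hash onto a different path, so a failure here points at one broken path
# even when most other connections still work.
# Assumptions (not from the original thread): peers are listed in
# ./osd_hosts.txt, and a TCP listener exists on $DST_PORT (22/sshd here).

DST_PORT=22                    # any TCP service known to listen on the peers
SRC_PORTS=$(seq 20000 20050)   # same source-port range as the example above

while read -r host; do
  for p in $SRC_PORTS; do
    # -z: just probe, don't send data; -w 2: give up after 2 seconds.
    # stdin/stderr are redirected so nc can't consume the host list or
    # clutter the output with per-connection success messages.
    if ! nc -z -w 2 -p "$p" "$host" "$DST_PORT" </dev/null 2>/dev/null; then
      echo "FAIL: src port $p -> $host:$DST_PORT"
    fi
  done
done < ./osd_hosts.txt

If this prints nothing, all tested src-port/host combinations connected; a
cluster of failures on one host (or on a subset of source ports) is the
pattern Dan describes for a single bad ECMP path.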