Ditto, I had a bad optic on a 48x10 switch. The only way I detected it was my Prometheus TCP fail retrans count. Looking back over the previous 4 weeks, I could see it increment in small bursts, but Ceph was able to handle it... and then it went crazy and a bunch of OSDs just dropped out.
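For anyone who wants to eyeball the same kind of signal on a single host without Prometheus, here is a rough sketch. It assumes a Linux box, and it uses the `RetransSegs` and `OutSegs` counters from `/proc/net/snmp` as a stand-in, which may not be the exact metric referred to above. The function names and the 1% threshold in the usage comment are made up for illustration.

```python
def parse_tcp_counters(snmp_text):
    """Parse the 'Tcp:' header/value line pair from /proc/net/snmp content."""
    tcp_lines = [ln.split() for ln in snmp_text.splitlines() if ln.startswith("Tcp:")]
    if len(tcp_lines) < 2:
        raise ValueError("no 'Tcp:' header/value pair found")
    # First 'Tcp:' line holds the field names, the second holds the values.
    header, values = tcp_lines[0][1:], tcp_lines[1][1:]
    return dict(zip(header, (int(v) for v in values)))


def retrans_ratio(before, after):
    """Fraction of segments sent between two samples that were retransmits."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0


def read_tcp_counters(path="/proc/net/snmp"):
    """Take one sample of the kernel TCP counters (Linux only)."""
    with open(path) as f:
        return parse_tcp_counters(f.read())


# Hypothetical usage: sample twice, a minute apart, and flag a burst.
#   a = read_tcp_counters(); time.sleep(60); b = read_tcp_counters()
#   if retrans_ratio(a, b) > 0.01:   # threshold is an arbitrary example
#       print("retransmit burst, check optics/cabling")
```

The counters are cumulative since boot, so only the difference between two samples means anything. Small recurring bursts like the ones described above would show up as occasional intervals with a ratio well above the host's usual baseline.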