We are seeing strange behavior on our Mimic 13.2.6 cluster. At any time, and without any load, some OSDs become unreachable from only some hosts. This lasts about 10 minutes and then the problem vanishes. It is not always the same OSDs or the same hosts. There is no network failure on any of the hosts (because only some OSDs become unreachable), nor any disk freeze, as far as we can see in our Grafana dashboard.

The log messages are:

First message:
2019-11-24 09:19:43.292 7fa9980fc700 -1 osd.596 146481 heartbeat_check: no reply from 192.168.6.112:6817 osd.394 since back 2019-11-24 09:19:22.761142 front 2019-11-24 09:19:39.769138 (cutoff 2019-11-24 09:19:23.293436)

Last message:
2019-11-24 09:30:33.735 7f632354f700 -1 osd.591 146481 heartbeat_check: no reply from 192.168.6.123:6828 osd.600 since back 2019-11-24 09:27:05.269330 front 2019-11-24 09:30:33.214874 (cutoff 2019-11-24 09:30:13.736517)

During this time, 3 hosts were involved: host-18, host-20 and host-30.

host-30 is the only one that can't see OSDs 346, 356 and 352 on host-18
host-30 is the only one that can't see OSDs 387 and 394 on host-20
host-18 is the only one that can't see OSDs 583, 585, 591 and 597 on host-30

We can't see any strange behavior on hosts 18, 20 and 30 in our node exporter data during this time.

Any ideas or advice?
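
For anyone who wants to build the same "who can't see whom" picture from their own OSD logs, here is a minimal Python sketch. It assumes the heartbeat_check line format quoted above; the regular expression and the function name tally_missed_heartbeats are hypothetical helpers written for illustration, not part of Ceph, and other Ceph releases may format the line differently.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the two log lines quoted in this post:
#   "... osd.<reporter> <epoch> heartbeat_check: no reply from <ip>:<port> osd.<target> since ..."
HEARTBEAT_RE = re.compile(
    r"osd\.(?P<reporter>\d+) \d+ heartbeat_check: no reply from "
    r"(?P<addr>[\d.]+):(?P<port>\d+) osd\.(?P<target>\d+)"
)


def tally_missed_heartbeats(lines):
    """Map each reporting OSD id to the set of (target OSD id, target IP) it got no reply from."""
    missed = defaultdict(set)
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            reporter = int(match["reporter"])
            missed[reporter].add((int(match["target"]), match["addr"]))
    return dict(missed)


# The two log lines from this post, used as sample input.
SAMPLE_LINES = [
    "2019-11-24 09:19:43.292 7fa9980fc700 -1 osd.596 146481 heartbeat_check: "
    "no reply from 192.168.6.112:6817 osd.394 since back 2019-11-24 09:19:22.761142 "
    "front 2019-11-24 09:19:39.769138 (cutoff 2019-11-24 09:19:23.293436)",
    "2019-11-24 09:30:33.735 7f632354f700 -1 osd.591 146481 heartbeat_check: "
    "no reply from 192.168.6.123:6828 osd.600 since back 2019-11-24 09:27:05.269330 "
    "front 2019-11-24 09:30:33.214874 (cutoff 2019-11-24 09:30:13.736517)",
]

if __name__ == "__main__":
    for reporter, targets in sorted(tally_missed_heartbeats(SAMPLE_LINES).items()):
        for target, addr in sorted(targets):
            print(f"osd.{reporter} got no reply from osd.{target} ({addr})")
```

Feeding it the concatenated OSD logs from the affected hosts should make it easier to check whether the failures really are limited to specific host pairs.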