Slow OSD heartbeats message

Hi all,

I have flaky transceivers that sometimes lead to these messages:

Slow OSD heartbeats on front from osd.412 [CON-161-A1,ContainerSquare,Risoe] to osd.706 [CON-161-A1,ContainerSquare,Risoe] 12913.173 msec

After the upgrade to Octopus, this message tries to be more helpful than before. Unfortunately, the one piece of information that would really help is missing: the host bucket. It's the one piece I need, because it's the hosts that have the NICs. Is there any way to configure these messages so that they show the host bucket name?

I have almost no way of finding these transceivers. They often don't show up as TX/RX errors or in the switch log, nor are the devices marked down by the OS. If I'm fast enough, I can bring NIC ports down one by one and see when the slow ping goes away. But for this, I really need to be able to identify the offending host quickly. It would be great if the message could be improved in this way.
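
Until the message itself carries the host bucket, one possible workaround is to resolve the two OSD ids through the CRUSH location reported by "ceph osd find". The following is only a minimal sketch, not a tested tool: the function names are hypothetical, and it assumes that the JSON output of "ceph osd find <id>" contains a "crush_location" object with a "host" key (falling back to a top-level "host" field), which should be checked on the cluster in question.

```python
import json
import re
import subprocess

# Extracts the network ("front"/"back") and the two OSD ids from a
# "Slow OSD heartbeats" line. The bracketed bucket list is skipped.
SLOW_RE = re.compile(
    r"Slow OSD heartbeats on (\w+) from osd\.(\d+)\b.*?\bto osd\.(\d+)\b"
)


def parse_slow_heartbeat(line):
    """Return (network, from_osd, to_osd), or None if the line does not match."""
    m = SLOW_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


def osd_host(osd_id, run=subprocess.check_output):
    """Look up the host bucket of an OSD via 'ceph osd find'.

    Assumption: the JSON has crush_location.host or a top-level host field.
    'run' is injectable so the lookup can be tried without a cluster.
    """
    out = run(["ceph", "osd", "find", str(osd_id), "--format", "json"])
    info = json.loads(out)
    return info.get("crush_location", {}).get("host") or info.get("host")


def hosts_for_line(line, run=subprocess.check_output):
    """Return (network, from_host, to_host) for one slow-heartbeat line."""
    parsed = parse_slow_heartbeat(line)
    if parsed is None:
        return None
    network, src, dst = parsed
    return network, osd_host(src, run), osd_host(dst, run)
```

Feeding each "Slow OSD heartbeats" line from "ceph health detail" through hosts_for_line() would then give the pair of hosts to inspect, which at least narrows down which NIC ports to test first.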

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
