Failing heartbeats when no backfill is running

Dear ceph-users,

I'm having trouble with heartbeats: my logs contain a lot of
"heartbeat_check: no reply from ..." messages even when no backfilling or
repair is running (yes, it fails while all PGs are active+clean).
Only a few OSDs are affected, even on hosts that run several OSDs,
so it doesn't look like a network issue to me.
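
To narrow down which OSDs are affected, the messages can be tallied per
peer. This is only a minimal sketch: it assumes each log line contains
"heartbeat_check: no reply from <address> osd.<id>", which may differ
between releases. The function name count_unanswered and the sample lines
are hypothetical and made up for illustration, not taken from my cluster.

```python
import re
from collections import Counter

# Assumed layout: "... heartbeat_check: no reply from <addr> osd.<id> ..."
# Only this fragment is matched; the rest of the line is ignored.
PATTERN = re.compile(r"heartbeat_check: no reply from \S+ (osd\.\d+)")


def count_unanswered(lines):
    """Return a Counter mapping peer OSD name to its number of 'no reply' messages."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines for illustration only.
SAMPLE = [
    "osd.1 100 heartbeat_check: no reply from 10.0.0.2:6801 osd.5 since back ...",
    "osd.1 100 heartbeat_check: no reply from 10.0.0.3:6802 osd.7 since back ...",
    "osd.2 100 heartbeat_check: no reply from 10.0.0.2:6801 osd.5 since back ...",
    "osd.2 100 some unrelated log message",
]

if __name__ == "__main__":
    # Print peers ordered by how often they failed to reply.
    for osd, n in count_unanswered(SAMPLE).most_common():
        print(osd, n)
```

In practice the lines would come from the OSD log files instead of SAMPLE.
If the same few peers dominate the tally regardless of which OSD reports
them, that would point at those OSDs rather than at the network.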

When I set the "nobackfill" and "norecover" flags, the heartbeat issues
disappear.

My cluster is fairly heterogeneous: a mix of ARMv7 and x86_64 hosts,
connected mostly via VPN. Some hosts run Debian Stretch, so I'm still
using Ceph Luminous (12.2.12).

Is anyone else seeing the same issue? What would be good next steps for
debugging? Any ideas?

Thanks for any help!

Lorenz