On 2018/03/28 1:39 pm, Subhachandra Chandra wrote:
We have seen similar behavior when there are network issues. AFAIK, the
OSD is being reported down by an OSD that cannot reach it. But either
another OSD that can reach it or the heartbeat between the OSD and the
monitor declares it up. The OSD "boot" message does not seem to
indicate an actual OSD restart.
Subhachandra
On Wed, Mar 28, 2018 at 10:30 AM, Andre Goree <andre@xxxxxxxxxx> wrote:
Hello,
I've recently had a minor issue come up where random individual OSDs
are marked failed due to a connection refused reported by another OSD.
I say minor because it's not a node-wide issue, and it appears to hit
random nodes. Besides that, the OSD comes back up in less than a
second, as if the OSD were sent a "restart" or something.
...
Great! Thank you! Yes, I found it funny that it "restarted" so quickly,
and from my readings I remember that it takes more than a single OSD
heartbeat failing to produce an _actual_ failure, so as to prevent
false positives. Thanks for the insight!
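
To illustrate the "more than a single heartbeat failure" idea, here is a
toy Python sketch. It is not Ceph's actual code: the FailureTracker class
and its method names are hypothetical, and the threshold of 2 is an
assumption that mirrors what I believe is the default of the
mon_osd_min_down_reporters option. The monitor-like tracker only treats
an OSD as down once enough distinct peers have reported it.

```python
from dataclasses import dataclass, field


@dataclass
class FailureTracker:
    """Toy model of a monitor counting failure reports per OSD (hypothetical)."""

    # Assumed threshold: distinct reporters needed before marking an OSD down.
    min_reporters: int = 2
    # Maps a target OSD name to the set of peers currently reporting it failed.
    reports: dict = field(default_factory=dict)

    def report_failure(self, reporter: str, target: str) -> bool:
        """Record a failure report; return True if the target now counts as down."""
        self.reports.setdefault(target, set()).add(reporter)
        return self.is_down(target)

    def cancel_report(self, reporter: str, target: str) -> None:
        """Withdraw a report, e.g. once the reporter can reach the target again."""
        self.reports.get(target, set()).discard(reporter)

    def is_down(self, target: str) -> bool:
        """An OSD is down only when enough distinct peers agree."""
        return len(self.reports.get(target, set())) >= self.min_reporters


tracker = FailureTracker()
first = tracker.report_failure("osd.1", "osd.7")   # one reporter only
repeat = tracker.report_failure("osd.1", "osd.7")  # same reporter again
second = tracker.report_failure("osd.2", "osd.7")  # a second, distinct reporter
print(first, repeat, second)
```

A single peer with a bad network path (osd.1 here) can repeat its report
as often as it likes without the target being marked down; only a second,
independent reporter tips it over.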
--
Andre Goree
-=-=-=-=-=-
Email - andre at drenet.net
Website - http://blog.drenet.net
PGP key - http://www.drenet.net/pubkey.html
-=-=-=-=-=-
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com