We exponentially back off when we encounter connection errors.  If several
errors accumulate, we will eventually wait ages before even trying to
reconnect.

Fix this by resetting the backoff counter after a successful
negotiation/connection with the remote node.  Fixes ceph issue #2802.

Signed-off-by: Sage Weil <sage@xxxxxxxxxxx>
---
 net/ceph/messenger.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 65964c2..28896eb 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1633,6 +1633,8 @@ static int process_connect(struct ceph_connection *con)
 		if (con->in_reply.flags & CEPH_MSG_CONNECT_LOSSY)
 			set_bit(LOSSYTX, &con->flags);
 
+		con->delay = 0;      /* reset backoff memory */
+
 		prepare_read_tag(con);
 		break;
 
-- 
1.7.9
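
For readers who want to see the behaviour in isolation, here is a minimal
user-space sketch of exponential backoff with a reset on success.  It is not
the messenger code: the struct, the function names and the two delay
constants are hypothetical and chosen only for illustration.  The one thing
it shares with the patch is the idea that a successful connect sets the
remembered delay back to 0.

```c
/* Hypothetical constants for illustration; not taken from the patch. */
#define SKETCH_BASE_DELAY_MS 500UL
#define SKETCH_MAX_DELAY_MS  (5UL * 60UL * 1000UL)

/* Hypothetical stand-in for the single connection field that matters here. */
struct sketch_conn {
	unsigned long delay_ms;	/* 0 means "no backoff remembered" */
};

/*
 * Called on a connection error.  The first error starts at the base delay;
 * each further error doubles the delay until it reaches the cap.
 * Returns the delay to wait before the next reconnect attempt.
 */
static unsigned long sketch_on_fault(struct sketch_conn *con)
{
	if (con->delay_ms == 0)
		con->delay_ms = SKETCH_BASE_DELAY_MS;
	else if (con->delay_ms < SKETCH_MAX_DELAY_MS)
		con->delay_ms *= 2;

	if (con->delay_ms > SKETCH_MAX_DELAY_MS)
		con->delay_ms = SKETCH_MAX_DELAY_MS;

	return con->delay_ms;
}

/*
 * Called after a successful connect.  Forgetting the accumulated delay
 * here is the step this sketch is meant to illustrate.
 */
static void sketch_on_connected(struct sketch_conn *con)
{
	con->delay_ms = 0;	/* reset backoff memory */
}
```

Without the call to sketch_on_connected(), a connection that failed a few
times in the past would keep its large delay forever, so a single later
error would immediately wait the long interval instead of starting again
from the base delay.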