Re: [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I agree that we should do BUG -> WARN on con->state everywhere.

But I don't think we should drop any of them, yet.  For Ugis's particular 
crash, it was fail_protocol()'s fault... see my other patch.  The rest of 
the time, a socket close should be caught at the top of con_work().

For any other cases where we see con->state changing when we don't expect 
it to, let's look at them on a case-by-case basis and address them in 
separate patches?

sage


On Thu, 27 Dec 2012, Alex Elder wrote:

> A number of assertions in the ceph messenger are implemented with
> BUG_ON(), killing the system if connection's state doesn't match
> what's expected.  At this point our state model is (evidently) not
> well understood enough for these assertions to trigger a BUG().
> Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
> so we learn about these issues without killing the machine.
> 
> We now recognize that a connection fault can occur due to a socket
> closure at any time, regardless of the state of the connection.  So
> there is really nothing we can assert about the state of the
> connection at that point so eliminate that assertion.
> 
> Reported-by: Ugis <ugis22@xxxxxxxxx>
> Tested-by: Ugis <ugis22@xxxxxxxxx>
> Signed-off-by: Alex Elder <elder@xxxxxxxxxxx>
> ---
>  net/ceph/messenger.c |   13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 4d111fd..075b9fd 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -561,7 +561,7 @@ void ceph_con_open(struct ceph_connection *con,
>  	mutex_lock(&con->mutex);
>  	dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));
> 
> -	BUG_ON(con->state != CON_STATE_CLOSED);
> +	WARN_ON(con->state != CON_STATE_CLOSED);
>  	con->state = CON_STATE_PREOPEN;
> 
>  	con->peer_name.type = (__u8) entity_type;
> @@ -1509,7 +1509,7 @@ static int process_banner(struct ceph_connection *con)
>  static void fail_protocol(struct ceph_connection *con)
>  {
>  	reset_connection(con);
> -	BUG_ON(con->state != CON_STATE_NEGOTIATING);
> +	WARN_ON(con->state != CON_STATE_NEGOTIATING);
>  	con->state = CON_STATE_CLOSED;
>  }
> 
> @@ -1635,7 +1635,7 @@ static int process_connect(struct ceph_connection
> *con)
>  			return -1;
>  		}
> 
> -		BUG_ON(con->state != CON_STATE_NEGOTIATING);
> +		WARN_ON(con->state != CON_STATE_NEGOTIATING);
>  		con->state = CON_STATE_OPEN;
> 
>  		con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq);
> @@ -2132,7 +2132,6 @@ more:
>  		if (ret < 0)
>  			goto out;
> 
> -		BUG_ON(con->state != CON_STATE_CONNECTING);
>  		con->state = CON_STATE_NEGOTIATING;
> 
>  		/*
> @@ -2160,7 +2159,7 @@ more:
>  		goto more;
>  	}
> 
> -	BUG_ON(con->state != CON_STATE_OPEN);
> +	WARN_ON(con->state != CON_STATE_OPEN);
> 
>  	if (con->in_base_pos < 0) {
>  		/*
> @@ -2382,10 +2381,6 @@ static void ceph_fault(struct ceph_connection *con)
>  	dout("fault %p state %lu to peer %s\n",
>  	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));
> 
> -	BUG_ON(con->state != CON_STATE_CONNECTING &&
> -	       con->state != CON_STATE_NEGOTIATING &&
> -	       con->state != CON_STATE_OPEN);
> -
>  	con_close_socket(con);
> 
>  	if (test_bit(CON_FLAG_LOSSYTX, &con->flags)) {
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux