Re: [PATCH] ceph: stop retrying the request when exceeding 256 times

On Wed, 2022-03-30 at 14:44 +0800, xiubli@xxxxxxxxxx wrote:
> From: Xiubo Li <xiubli@xxxxxxxxxx>
> 
> The type of 'r_attempts' in the kernel's 'ceph_mds_request' is 'int',
> while in 'ceph_mds_request_head' the type of 'num_retry' is '__u8'.
> So if the request is retried more than 256 times, the MDS will
> receive an incorrect retry seq.
> 
> In this case it's usually a bug in the MDS, and continuing to retry
> the request makes no sense. For now let's limit it to 256. In the
> future this could be fixed in the ceph code, so avoid hardcoding the
> limit here.
> 
> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> ---
>  fs/ceph/mds_client.c | 25 +++++++++++++++++++++++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index e11d31401f12..f476c65fb985 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -2679,7 +2679,28 @@ static int __prepare_send_request(struct ceph_mds_session *session,
>  	struct ceph_mds_client *mdsc = session->s_mdsc;
>  	struct ceph_mds_request_head_old *rhead;
>  	struct ceph_msg *msg;
> -	int flags = 0;
> +	int flags = 0, max_retry;
> +
> +	/*
> +	 * The type of 'r_attempts' in kernel 'ceph_mds_request'
> +	 * is 'int', while in 'ceph_mds_request_head' the type of
> +	 * 'num_retry' is '__u8'. So if the request is retried
> +	 * more than 256 times, the MDS will receive an incorrect
> +	 * retry seq.
> +	 *
> +	 * In this case it's usually a bug in the MDS, and continuing
> +	 * to retry the request makes no sense.
> +	 *
> +	 * In future this could be fixed in ceph code, so avoid
> +	 * using the hardcode here.
> +	 */
> +	max_retry = sizeof_field(struct ceph_mds_request_head, num_retry);
> +	max_retry = 1 << (max_retry * BITS_PER_BYTE);
> +	if (req->r_attempts >= max_retry) {
> +		pr_warn_ratelimited("%s request tid %llu seq overflow\n",
> +				    __func__, req->r_tid);
> +		return -EMULTIHOP;
> +	}
>  
>  	req->r_attempts++;
>  	if (req->r_inode) {
> @@ -2691,7 +2712,7 @@ static int __prepare_send_request(struct ceph_mds_session *session,
>  		else
>  			req->r_sent_on_mseq = -1;
>  	}
> -	dout("prepare_send_request %p tid %lld %s (attempt %d)\n", req,
> +	dout("%s %p tid %lld %s (attempt %d)\n", __func__, req,
>  	     req->r_tid, ceph_mds_op_name(req->r_op), req->r_attempts);
>  
>  	if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) {
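
For anyone following along, the arithmetic behind the new check can be sketched in userspace C. This is only an illustration, not the kernel code: 'struct demo_request_head' and the demo_*() helpers are hypothetical names, sizeof_field() is redefined locally as a stand-in for the kernel macro, and CHAR_BIT takes the place of BITS_PER_BYTE.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's sizeof_field() macro. */
#define sizeof_field(TYPE, MEMBER) sizeof(((TYPE *)0)->MEMBER)

/* Hypothetical, cut-down request head: only the field width matters here. */
struct demo_request_head {
	uint8_t num_retry;
};

/* Number of distinct values the num_retry field can hold (256 for a u8). */
static int demo_max_retry(void)
{
	size_t bytes = sizeof_field(struct demo_request_head, num_retry);

	/* Keep the shift below the width of int so it stays well defined. */
	assert(bytes * CHAR_BIT < sizeof(int) * CHAR_BIT - 1);
	return 1 << (bytes * CHAR_BIT);
}

/* What the wire field would carry if an int counter were stored unchecked. */
static uint8_t demo_wire_retry(int attempts)
{
	return (uint8_t)attempts;
}

/* Mirrors the patch's guard: nonzero once the counter has reached the limit. */
static int demo_should_stop(int attempts)
{
	return attempts >= demo_max_retry();
}
```

With a one-byte field, demo_max_retry() evaluates to 256, and an unchecked counter value of 256 would wrap to 0 in the field, which is the incorrect retry seq the commit message describes. Deriving the limit from the field size means it follows automatically if the field is ever widened.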

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>


