On Wed, 2022-03-30 at 14:44 +0800, xiubli@xxxxxxxxxx wrote:
> From: Xiubo Li <xiubli@xxxxxxxxxx>
>
> The type of 'r_attempts' in kernel 'ceph_mds_request' is 'int',
> while in 'ceph_mds_request_head' the type of 'num_retry' is '__u8'.
> So in case the request is retried more than 256 times, the MDS will
> receive an incorrect retry seq.
>
> In this case it's usually a bug in the MDS, and continuing to retry
> the request makes no sense. For now let's limit it to 256. In the
> future this could be fixed in the ceph code, so avoid hardcoding the
> limit here.
>
> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> ---
>  fs/ceph/mds_client.c | 25 +++++++++++++++++++++++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index e11d31401f12..f476c65fb985 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -2679,7 +2679,28 @@ static int __prepare_send_request(struct ceph_mds_session *session,
>  	struct ceph_mds_client *mdsc = session->s_mdsc;
>  	struct ceph_mds_request_head_old *rhead;
>  	struct ceph_msg *msg;
> -	int flags = 0;
> +	int flags = 0, max_retry;
> +
> +	/*
> +	 * The type of 'r_attempts' in kernel 'ceph_mds_request'
> +	 * is 'int', while in 'ceph_mds_request_head' the type of
> +	 * 'num_retry' is '__u8'. So in case the request is retried
> +	 * more than 256 times, the MDS will receive an incorrect
> +	 * retry seq.
> +	 *
> +	 * In this case it's usually a bug in the MDS and continuing
> +	 * to retry the request makes no sense.
> +	 *
> +	 * In the future this could be fixed in the ceph code, so
> +	 * avoid hardcoding the limit here.
> +	 */
> +	max_retry = sizeof_field(struct ceph_mds_request_head, num_retry);
> +	max_retry = 1 << (max_retry * BITS_PER_BYTE);
> +	if (req->r_attempts >= max_retry) {
> +		pr_warn_ratelimited("%s request tid %llu seq overflow\n",
> +				    __func__, req->r_tid);
> +		return -EMULTIHOP;
> +	}
>
>  	req->r_attempts++;
>  	if (req->r_inode) {
> @@ -2691,7 +2712,7 @@ static int __prepare_send_request(struct ceph_mds_session *session,
>  		else
>  			req->r_sent_on_mseq = -1;
>  	}
> -	dout("prepare_send_request %p tid %lld %s (attempt %d)\n", req,
> +	dout("%s %p tid %lld %s (attempt %d)\n", __func__, req,
>  	     req->r_tid, ceph_mds_op_name(req->r_op), req->r_attempts);
>
>  	if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) {

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
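
For anyone following along, here is a rough userspace sketch of the arithmetic behind the limit. It is not the kernel code: the struct and the demo_* helpers are hypothetical stand-ins, uint8_t stands in for '__u8', and CHAR_BIT stands in for BITS_PER_BYTE. It only illustrates how the bound is derived from the field width and what an unchecked 'int' counter would turn into when stored in a one-byte field.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the wire header; only the one-byte field matters. */
struct demo_request_head {
	uint8_t num_retry;
};

/* Number of distinct values the field can hold: 1 << (field size in bits). */
static int demo_max_retry(void)
{
	/* Same idea as sizeof_field(): sizeof does not evaluate its operand. */
	int bytes = (int)sizeof(((struct demo_request_head *)0)->num_retry);

	return 1 << (bytes * CHAR_BIT);
}

/* Value the receiver would see if an int counter were stored unchecked. */
static uint8_t demo_truncate(int attempts)
{
	return (uint8_t)attempts;
}

/* Mirrors the guard in the patch: 0 if another attempt is allowed, -1 if not. */
static int demo_may_retry(int attempts)
{
	return attempts >= demo_max_retry() ? -1 : 0;
}
```

With a one-byte field, demo_max_retry() comes out to 256, and a counter of 256 stored unchecked wraps to 0, which is the incorrect retry seq the commit message describes. Deriving the bound from the field size means it follows along if the field is ever widened.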