On Tue, 2022-03-29 at 19:12 +0800, Xiubo Li wrote: > On 3/29/22 5:53 PM, Luís Henriques wrote: > > xiubli@xxxxxxxxxx writes: > > > > > From: Xiubo Li <xiubli@xxxxxxxxxx> > > > > > > The type of 'num_fwd' in ceph 'MClientRequestForward' is 'int32_t', > > > while in 'ceph_mds_request_head' the type is '__u8'. So in case > > > the request bounces between MDSes exceeding 256 times, the client > > > will get stuck. > > > > > > In this case it's ususally a bug in MDS and continue bouncing the > > > request makes no sense. > > Ouch. Nice catch. This patch looks OK to me, just 2 minor comments > > bellow. > > > > > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx> > > > --- > > > fs/ceph/mds_client.c | 31 ++++++++++++++++++++++++++++--- > > > 1 file changed, 28 insertions(+), 3 deletions(-) > > > > > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c > > > index a89ee866ebbb..0bb6e7bc499c 100644 > > > --- a/fs/ceph/mds_client.c > > > +++ b/fs/ceph/mds_client.c > > > @@ -3293,6 +3293,7 @@ static void handle_forward(struct ceph_mds_client *mdsc, > > > int err = -EINVAL; > > > void *p = msg->front.iov_base; > > > void *end = p + msg->front.iov_len; > > > + bool aborted = false; > > > > > > ceph_decode_need(&p, end, 2*sizeof(u32), bad); > > > next_mds = ceph_decode_32(&p); > > > @@ -3309,8 +3310,28 @@ static void handle_forward(struct ceph_mds_client *mdsc, > > > dout("forward tid %llu aborted, unregistering\n", tid); > > > __unregister_request(mdsc, req); > > > } else if (fwd_seq <= req->r_num_fwd) { > > > - dout("forward tid %llu to mds%d - old seq %d <= %d\n", > > > - tid, next_mds, req->r_num_fwd, fwd_seq); > > > + /* > > > + * The type of 'num_fwd' in ceph 'MClientRequestForward' > > > + * is 'int32_t', while in 'ceph_mds_request_head' the > > > + * type is '__u8'. So in case the request bounces between > > > + * MDSes exceeding 256 times, the client will get stuck. > > > + * > > > + * In this case it's ususally a bug in MDS and continue > > > + * bouncing the request makes no sense. > > > + */ > > > + if (req->r_num_fwd == 256) { > > > + mutex_lock(&req->r_fill_mutex); > > > + req->r_err = -EIO; > > Not sure -EIO is the most appropriate. Maybe -E2BIG... although not quite > > it either. > > > Yeah, I also not very sure here. > > Jeff ? > Matching errors like this really comes down to a judgement call. E2BIG usually means that some buffer was sized too small, so you'll have users trying to figure out what they passed in wrong if you return that here. -EIO is the usual "default" when you don't know what else to use. There's also -EREMOTEIO which may be closer here since this is indicative of MDS problems. Given that, it may also be a good idea to log a pr_warn or pr_notice message at the same time explaining what happened. > > > > + set_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags); > > > + mutex_unlock(&req->r_fill_mutex); > > > + aborted = true; > > > + dout("forward tid %llu to mds%d - seq overflowed %d <= %d\n", > > > + tid, next_mds, req->r_num_fwd, fwd_seq); > > > + goto out; > > This 'goto' statement can be dropped, but one before (when the > > lookup_get_request() fails) needs to be adjusted, otherwise > > ceph_mdsc_put_request() may be called with a NULL pointer. > > Yeah, will fix it. > > Thanks. > > -- Xiubo > > > > Cheers, > -- Jeff Layton <jlayton@xxxxxxxxxx>