On Fri, Dec 8, 2023 at 5:08 PM <xiubli@xxxxxxxxxx> wrote: > > From: Xiubo Li <xiubli@xxxxxxxxxx> > > The messages from ceph maybe split into multiple socket packages > and we just need to wait for all the data to be availiable on the > sokcet. > > This will add a new _FINISH state for the sparse-read to mark the > current sparse-read succeeded. Else it will treat it as a new > sparse-read when the socket receives the last piece of the osd > request reply message, and the cancel_request() later will help > init the sparse-read context. > > URL: https://tracker.ceph.com/issues/63586 > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx> > --- > include/linux/ceph/osd_client.h | 1 + > net/ceph/osd_client.c | 6 +++--- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h > index 493de3496cd3..00d98e13100f 100644 > --- a/include/linux/ceph/osd_client.h > +++ b/include/linux/ceph/osd_client.h > @@ -47,6 +47,7 @@ enum ceph_sparse_read_state { > CEPH_SPARSE_READ_DATA_LEN, > CEPH_SPARSE_READ_DATA_PRE, > CEPH_SPARSE_READ_DATA, > + CEPH_SPARSE_READ_FINISH, > }; > > /* > diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c > index 848ef19055a0..f1705b4f19eb 100644 > --- a/net/ceph/osd_client.c > +++ b/net/ceph/osd_client.c > @@ -5802,8 +5802,6 @@ static int prep_next_sparse_read(struct ceph_connection *con, > advance_cursor(cursor, sr->sr_req_len - end, false); > } > > - ceph_init_sparse_read(sr); > - > /* find next op in this request (if any) */ > while (++o->o_sparse_op_idx < req->r_num_ops) { > op = &req->r_ops[o->o_sparse_op_idx]; > @@ -5919,7 +5917,7 @@ static int osd_sparse_read(struct ceph_connection *con, > return -EREMOTEIO; > } > > - sr->sr_state = CEPH_SPARSE_READ_HDR; > + sr->sr_state = CEPH_SPARSE_READ_FINISH; > goto next_op; Hi Xiubo, This code appears to be set up to handle multiple (sparse-read) ops in a single OSD request. Wouldn't this break that case? I think the bug is in read_sparse_msg_data(). It shouldn't be calling con->ops->sparse_read() after the message has progressed to the footer. osd_sparse_read() is most likely fine as is. Notice how read_partial_msg_data() and read_partial_msg_data_bounce() behave: if called at that point (i.e. after the data section has been processed, meaning that cursor->total_resid == 0), they do nothing. read_sparse_msg_data() should have a similar condition and basically no-op itself. While at it, let's rename it to read_partial_sparse_msg_data() to emphasize the "partial"/no-op semantics that are required there. Thanks, Ilya