On Mon, Mar 4, 2024 at 3:00 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 3/4/24 19:02, Ilya Dryomov wrote:
>
> On Mon, Mar 4, 2024 at 2:02 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
> On 3/2/24 00:16, Ilya Dryomov wrote:
>
> On Thu, Feb 29, 2024 at 5:22 AM <xiubli@xxxxxxxxxx> wrote:
>
> From: Xiubo Li <xiubli@xxxxxxxxxx>
>
> The OSD code has removed the cursor initializing code and this will
> make the sparse read state go into an infinite loop. We should
> initialize the cursor just before each sparse-read in messenger v2.
>
> Cc: stable@xxxxxxxxxxxxxxx
> URL: https://tracker.ceph.com/issues/64607
> Fixes: 8e46a2d068c9 ("libceph: just wait for more data to be available on the socket")
> Reported-by: Luis Henriques <lhenriques@xxxxxxx>
> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> ---
>  net/ceph/messenger_v2.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
> index a0ca5414b333..7ae0f80100f4 100644
> --- a/net/ceph/messenger_v2.c
> +++ b/net/ceph/messenger_v2.c
> @@ -2025,6 +2025,7 @@ static int prepare_sparse_read_cont(struct ceph_connection *con)
>  static int prepare_sparse_read_data(struct ceph_connection *con)
>  {
>         struct ceph_msg *msg = con->in_msg;
> +       u64 len = con->in_msg->sparse_read_total ? : data_len(con->in_msg);
>
> Hi Xiubo,
>
> Why is sparse_read_total being tested here? Isn't this function
> supposed to be called only for sparse reads, after the state is set to
> IN_S_PREPARE_SPARSE_DATA based on a similar test:
>
>         if (msg->sparse_read_total)
>                 con->v2.in_state = IN_S_PREPARE_SPARSE_DATA;
>         else
>                 con->v2.in_state = IN_S_PREPARE_READ_DATA;
>
> Then the patch could be simplified and just be:
>
> diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
> index a0ca5414b333..ab3ab130a911 100644
> --- a/net/ceph/messenger_v2.c
> +++ b/net/ceph/messenger_v2.c
> @@ -2034,6 +2034,9 @@ static int prepare_sparse_read_data(struct ceph_connection *con)
>         if (!con_secure(con))
>                 con->in_data_crc = -1;
>
> +       ceph_msg_data_cursor_init(&con->v2.in_cursor, con->in_msg,
> +                                 con->in_msg->sparse_read_total);
> +
>         reset_in_kvecs(con);
>         con->v2.in_state = IN_S_PREPARE_SPARSE_DATA_CONT;
>         con->v2.data_len_remain = data_len(msg);
>
> Else where should we do the test like this?
>
> Hi Xiubo,
>
> I suspect you copied this test from prepare_message_data() in msgr1,
> where that function is called unconditionally for all reads. In msgr2,
> prepare_sparse_read_data() is called conditionally, so the test just
> seems bogus.
>
> That said, CephFS is the only user of sparse read code, so you should
> know better at this point ;)
>
> As we know, 'sparse_read_total' is a dedicated member and will be set
> only in the sparse-read case.
>
> In msgr1, all the read cases call "prepare_message_data()", so I just
> did this check in "prepare_message_data()".

Right.

>
> For msgr2 it's a little different from msgr1: it won't call
> 'prepare_read_data()' for all the reads, and for sparse-read it has its
> own dedicated helper, which is "prepare_sparse_read_data()".

Right.

> So I just did this check in 'prepare_sparse_read_data()' instead.

This is where I'm getting lost. Why do an "is this a sparse read" check
in a helper that is dedicated to sparse reads and isn't called for
anything else?

> For "prepare_read_data()" it doesn't make any sense to check "sparse_read_total".

Right, so why doesn't the same go for prepare_sparse_read_data()?

Thanks,

                Ilya
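
For illustration only, the argument can be put as a small standalone C sketch. This is not the kernel code: the types and the names pick_data_state() and prepare_sparse() are hypothetical simplifications, and cursor_len merely stands in for the real data cursor. It assumes, as discussed in the thread, that the sparse/non-sparse decision is made once when the state is chosen, so the sparse-only helper can use sparse_read_total directly.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
enum in_state {
	IN_S_PREPARE_READ_DATA,
	IN_S_PREPARE_SPARSE_DATA,
	IN_S_PREPARE_SPARSE_DATA_CONT,
};

struct msg {
	uint64_t data_len;          /* on-wire data length */
	uint64_t sparse_read_total; /* nonzero only for sparse reads */
};

struct conn {
	struct msg *in_msg;
	enum in_state in_state;
	uint64_t cursor_len;        /* stands in for the data cursor */
};

/* The one place where "is this a sparse read?" is decided. */
static void pick_data_state(struct conn *con)
{
	if (con->in_msg->sparse_read_total)
		con->in_state = IN_S_PREPARE_SPARSE_DATA;
	else
		con->in_state = IN_S_PREPARE_READ_DATA;
}

/* Reached only from IN_S_PREPARE_SPARSE_DATA, so no re-test is needed. */
static void prepare_sparse(struct conn *con)
{
	/* The invariant is asserted here, not branched on. */
	assert(con->in_state == IN_S_PREPARE_SPARSE_DATA);
	assert(con->in_msg->sparse_read_total != 0);

	/* (Re)initialize the cursor before each sparse read. */
	con->cursor_len = con->in_msg->sparse_read_total;
	con->in_state = IN_S_PREPARE_SPARSE_DATA_CONT;
}
```

In this sketch, a "sparse_read_total ? : data_len" fallback inside prepare_sparse() would be dead code, because the second operand could never be selected.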