Re: [PATCH] libceph: init the cursor when preparing the sparse read

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 4, 2024 at 1:43 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 3/2/24 01:15, Ilya Dryomov wrote:
> > On Fri, Mar 1, 2024 at 2:53 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >>
> >> On 2/29/24 18:48, Ilya Dryomov wrote:
> >>> On Thu, Feb 29, 2024 at 5:22 AM <xiubli@xxxxxxxxxx> wrote:
> >>>> From: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>>
> >>>> The osd code has remove cursor initilizing code and this will make
> >>>> the sparse read state into a infinite loop. We should initialize
> >>>> the cursor just before each sparse-read in messnger v2.
> >>>>
> >>>> Cc: stable@xxxxxxxxxxxxxxx
> >>>> URL: https://tracker.ceph.com/issues/64607
> >>>> Fixes: 8e46a2d068c9 ("libceph: just wait for more data to be available on the socket")
> >>>> Reported-by: Luis Henriques <lhenriques@xxxxxxx>
> >>>> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>> ---
> >>>>    net/ceph/messenger_v2.c | 3 +++
> >>>>    1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
> >>>> index a0ca5414b333..7ae0f80100f4 100644
> >>>> --- a/net/ceph/messenger_v2.c
> >>>> +++ b/net/ceph/messenger_v2.c
> >>>> @@ -2025,6 +2025,7 @@ static int prepare_sparse_read_cont(struct ceph_connection *con)
> >>>>    static int prepare_sparse_read_data(struct ceph_connection *con)
> >>>>    {
> >>>>           struct ceph_msg *msg = con->in_msg;
> >>>> +       u64 len = con->in_msg->sparse_read_total ? : data_len(con->in_msg);
> >>>>
> >>>>           dout("%s: starting sparse read\n", __func__);
> >>>>
> >>>> @@ -2034,6 +2035,8 @@ static int prepare_sparse_read_data(struct ceph_connection *con)
> >>>>           if (!con_secure(con))
> >>>>                   con->in_data_crc = -1;
> >>>>
> >>>> +       ceph_msg_data_cursor_init(&con->v2.in_cursor, con->in_msg, len);
> >>>> +
> >>>>           reset_in_kvecs(con);
> >>>>           con->v2.in_state = IN_S_PREPARE_SPARSE_DATA_CONT;
> >>>>           con->v2.data_len_remain = data_len(msg);
> >>>> --
> >>>> 2.43.0
> >>>>
> >>> Hi Xiubo,
> >>>
> >>> How did this get missed?  Was generic/580 not paired with msgr2 in crc
> >>> mode or are we not running generic/580 at all?
> >>>
> >>> Multiple runs have happened since the patch was staged so if the matrix
> >>> is set up correctly ms_mode=crc should have been in effect for xfstests
> >>> at least a couple of times.
> >> I just found that my test script is incorrect and missed this case.
> >>
> >> The test locally is covered the msgr1 mostly and I think the qa test
> >> suite also doesn't cover it too. I will try to improve the qa tests later.
> > Could you please provide some details on the fixes needed to address
> > the coverage gap in the fs suite?
>
> Mainly to support the msgr v2 for fscrypt, before we only tested the
> fscrypt based on the msgr v1 for kclient. In ceph upstream we have
> support this while not backport it to reef yet.

I'm even more confused now...  If the fs suite in main covers msgr2 +
fscrypt (I'm taking your "in ceph upstream we have support" to mean
that), how did this bug get missed by runs on main?  At least a dozen
of them must have gone through in the form of Venky's integration
branches.

Thanks,

                Ilya





[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux