sparse read extent limits (was: libceph: remove the max extents check for sparse read)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[ Moving from ceph-devel@vger to dev@ceph ]

Do we need to change the protocol so you can provide a limit on the
number of extents when you send a sparse read op?

I rather suspect the workloads generating this are the worst case
scenario and that we may want server-side limits in any case

On Mon, Nov 6, 2023 at 5:02 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 11/6/23 20:32, Ilya Dryomov wrote:
> > On Mon, Nov 6, 2023 at 1:15 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >>
> >> On 11/6/23 19:38, Ilya Dryomov wrote:
> >>> On Mon, Nov 6, 2023 at 1:14 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >>>> On 11/3/23 18:07, Ilya Dryomov wrote:
> >>>>
> >>>> On Fri, Nov 3, 2023 at 4:41 AM <xiubli@xxxxxxxxxx> wrote:
> >>>>
> >>>> From: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>>
> >>>> There is no any limit for the extent array size and it's possible
> >>>> that when reading with a large size contents. Else the messager
> >>>> will fail by reseting the connection and keeps resending the inflight
> >>>> IOs.
> >>>>
> >>>> URL: https://tracker.ceph.com/issues/62081
> >>>> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>> ---
> >>>>    net/ceph/osd_client.c | 12 ------------
> >>>>    1 file changed, 12 deletions(-)
> >>>>
> >>>> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> >>>> index 7af35106acaf..177a1d92c517 100644
> >>>> --- a/net/ceph/osd_client.c
> >>>> +++ b/net/ceph/osd_client.c
> >>>> @@ -5850,8 +5850,6 @@ static inline void convert_extent_map(struct ceph_sparse_read *sr)
> >>>>    }
> >>>>    #endif
> >>>>
> >>>> -#define MAX_EXTENTS 4096
> >>>> -
> >>>>    static int osd_sparse_read(struct ceph_connection *con,
> >>>>                              struct ceph_msg_data_cursor *cursor,
> >>>>                              char **pbuf)
> >>>> @@ -5882,16 +5880,6 @@ static int osd_sparse_read(struct ceph_connection *con,
> >>>>
> >>>>                   if (count > 0) {
> >>>>                           if (!sr->sr_extent || count > sr->sr_ext_len) {
> >>>> -                               /*
> >>>> -                                * Apply a hard cap to the number of extents.
> >>>> -                                * If we have more, assume something is wrong.
> >>>> -                                */
> >>>> -                               if (count > MAX_EXTENTS) {
> >>>> -                                       dout("%s: OSD returned 0x%x extents in a single reply!\n",
> >>>> -                                            __func__, count);
> >>>> -                                       return -EREMOTEIO;
> >>>> -                               }
> >>>> -
> >>>>                                   /* no extent array provided, or too short */
> >>>>                                   kfree(sr->sr_extent);
> >>>>                                   sr->sr_extent = kmalloc_array(count,
> >>>> --
> >>>> 2.39.1
> >>>>
> >>>> Hi Xiubo,
> >>>>
> >>>> As noted in the tracker ticket, there are many "sanity" limits like
> >>>> that in the messenger and other parts of the kernel client.  First,
> >>>> let's change that dout to pr_warn_ratelimited so that it's immediately
> >>>> clear what is going on.
> >>>>
> >>>> Sounds good to me.
> >>>>
> >>>>    Then, if the limit actually gets hit, let's
> >>>> dig into why and see if it can be increased rather than just removed.
> >>>>
> >>>> The test result in https://tracker.ceph.com/issues/62081#note-16 is that I just changed the 'len' to 5000 in ceph PR https://github.com/ceph/ceph/pull/54301 to emulate a random writes to a file and then tries to read with a large size:
> >>>>
> >>>> [ RUN      ] LibRadosIoPP.SparseReadExtentArrayOpPP
> >>>> extents array size : 4297
> >>>>
> >>>> For the 'extents array size' it could reach up to a very large number, then what it should be ? Any idea ?
> >>> Hi Xiubo,
> >>>
> >>> I don't think it can be a very large number in practice.
> >>>
> >>> CephFS uses sparse reads only in the fscrypt case, right?
> >> Yeah, it is.
> >>
> >>
> >>>     With
> >>> fscrypt, sub-4K writes to the object store aren't possible, right?
> >> Yeah.
> >>
> >>
> >>> If the answer to both is yes, then the maximum number of extents
> >>> would be
> >>>
> >>>       64M (CEPH_MSG_MAX_DATA_LEN) / 4K = 16384
> >>>
> >>> even if the object store does tracking at byte granularity.
> >> So yeah, just for the fscrypt case if we set the max extent number to
> >> 16384 it should be fine. But the sparse read could also be enabled in
> >> non-fscrypt case.
> > Ah, I see that it's also exposed as a mount option.  If we expect
> > people to use it, then dropping MAX_EXTENTS altogether might be the
> > best approach after all -- it doesn't make sense to warn about
> > something that we don't really control.
> >
> > How about printing the number of extents only if the allocation fails?
> > Something like:
> >
> >      if (!sr->sr_extent) {
> >              pr_err("failed to allocate %u sparse read extents\n", count);
> >              return -ENOMEM;
> >      }
>
> Yeah, this also looks good to me.
>
> I will fix it.
>
> Thanks
>
> - Xiubo
>
>
> > Thanks,
> >
> >                  Ilya
> >
>
>
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx




[Index of Archives]     [CEPH Users]     [Ceph Devel]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux