On Mon, Nov 6, 2023 at 1:15 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote: > > > On 11/6/23 19:38, Ilya Dryomov wrote: > > On Mon, Nov 6, 2023 at 1:14 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote: > >> > >> On 11/3/23 18:07, Ilya Dryomov wrote: > >> > >> On Fri, Nov 3, 2023 at 4:41 AM <xiubli@xxxxxxxxxx> wrote: > >> > >> From: Xiubo Li <xiubli@xxxxxxxxxx> > >> > >> There is no any limit for the extent array size and it's possible > >> that when reading with a large size contents. Else the messager > >> will fail by reseting the connection and keeps resending the inflight > >> IOs. > >> > >> URL: https://tracker.ceph.com/issues/62081 > >> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx> > >> --- > >> net/ceph/osd_client.c | 12 ------------ > >> 1 file changed, 12 deletions(-) > >> > >> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c > >> index 7af35106acaf..177a1d92c517 100644 > >> --- a/net/ceph/osd_client.c > >> +++ b/net/ceph/osd_client.c > >> @@ -5850,8 +5850,6 @@ static inline void convert_extent_map(struct ceph_sparse_read *sr) > >> } > >> #endif > >> > >> -#define MAX_EXTENTS 4096 > >> - > >> static int osd_sparse_read(struct ceph_connection *con, > >> struct ceph_msg_data_cursor *cursor, > >> char **pbuf) > >> @@ -5882,16 +5880,6 @@ static int osd_sparse_read(struct ceph_connection *con, > >> > >> if (count > 0) { > >> if (!sr->sr_extent || count > sr->sr_ext_len) { > >> - /* > >> - * Apply a hard cap to the number of extents. > >> - * If we have more, assume something is wrong. > >> - */ > >> - if (count > MAX_EXTENTS) { > >> - dout("%s: OSD returned 0x%x extents in a single reply!\n", > >> - __func__, count); > >> - return -EREMOTEIO; > >> - } > >> - > >> /* no extent array provided, or too short */ > >> kfree(sr->sr_extent); > >> sr->sr_extent = kmalloc_array(count, > >> -- > >> 2.39.1 > >> > >> Hi Xiubo, > >> > >> As noted in the tracker ticket, there are many "sanity" limits like > >> that in the messenger and other parts of the kernel client. First, > >> let's change that dout to pr_warn_ratelimited so that it's immediately > >> clear what is going on. > >> > >> Sounds good to me. > >> > >> Then, if the limit actually gets hit, let's > >> dig into why and see if it can be increased rather than just removed. > >> > >> The test result in https://tracker.ceph.com/issues/62081#note-16 is that I just changed the 'len' to 5000 in ceph PR https://github.com/ceph/ceph/pull/54301 to emulate a random writes to a file and then tries to read with a large size: > >> > >> [ RUN ] LibRadosIoPP.SparseReadExtentArrayOpPP > >> extents array size : 4297 > >> > >> For the 'extents array size' it could reach up to a very large number, then what it should be ? Any idea ? > > Hi Xiubo, > > > > I don't think it can be a very large number in practice. > > > > CephFS uses sparse reads only in the fscrypt case, right? > > Yeah, it is. > > > > With > > fscrypt, sub-4K writes to the object store aren't possible, right? > > Yeah. > > > > If the answer to both is yes, then the maximum number of extents > > would be > > > > 64M (CEPH_MSG_MAX_DATA_LEN) / 4K = 16384 > > > > even if the object store does tracking at byte granularity. > So yeah, just for the fscrypt case if we set the max extent number to > 16384 it should be fine. But the sparse read could also be enabled in > non-fscrypt case. Ah, I see that it's also exposed as a mount option. If we expect people to use it, then dropping MAX_EXTENTS altogether might be the best approach after all -- it doesn't make sense to warn about something that we don't really control. How about printing the number of extents only if the allocation fails? Something like: if (!sr->sr_extent) { pr_err("failed to allocate %u sparse read extents\n", count); return -ENOMEM; } Thanks, Ilya