Re: [PATCH v4] ceph: fix blindly expanding the readahead windows

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 5/10/23 18:01, 胡玮文 wrote:
On Tue, May 09, 2023 at 08:57:03AM +0800, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

Blindly expanding the readahead windows will cause unneccessary
pagecache thrashing and also will introdue the network workload.
                                     ^^^^^^^^
                                     introduce

We should disable expanding the windows if the readahead is disabled
and also shouldn't expand the windows too much.

Expanding forward firstly instead of expanding backward for possible
sequential reads.

Bound `rreq->len` to the actual file size to restore the previous page
cache usage.

Cc: stable@xxxxxxxxxxxxxxx
Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead")
URL: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@xxxxxxxxxxxxxxxx
URL: https://www.spinics.net/lists/ceph-users/msg76183.html
Cc: Hu Weiwen <sehuww@xxxxxxxxxxxxxxxx>
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---

V4:
- two small cleanup from Ilya's comments. Thanks


  fs/ceph/addr.c | 28 +++++++++++++++++++++-------
  1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ca4dc6450887..683ba9fbd590 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -188,16 +188,30 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
  	struct inode *inode = rreq->inode;
  	struct ceph_inode_info *ci = ceph_inode(inode);
  	struct ceph_file_layout *lo = &ci->i_layout;
+	unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
I think it is better to use `ractl->ra->ra_pages' instead of
`inode->i_sb->s_bdi->ra_pages'.  So that we can consider per-request ra
size config, e.g., posix_fadvise(POSIX_FADV_SEQUENTIAL) will double the
ra_pages.

Yeah, good catch.

But `ractl' is not passed to this function.  Can we just add this
argument?  ceph seems to be the only implementation of expand_readahead,
so it should be easy.  Or since this patch will be backported, maybe we
should keep it simple, and write another patch for this?

I think we should fix it together with this. It should be easy.

We can just store the "file->f_ra->ra_pages" in ceph_init_request() instead, because each rreq will be related to a dedicated file.


+	unsigned long max_len = max_pages << PAGE_SHIFT;
+	loff_t end = rreq->start + rreq->len, new_end;
  	u32 blockoff;
-	u64 blockno;
- /* Expand the start downward */
-	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
-	rreq->start = blockno * lo->stripe_unit;
-	rreq->len += blockoff;
+	/* Readahead is disabled */
+	if (!max_pages)
+		return;
If we have access to ractl here, we can also skip expanding on
`ractl->file->f_mode & FMODE_RANDOM', which is set by
`posix_fadvise(POSIX_FADV_RANDOM)'.

Correct, because the "page_cache_sync_ra()' will do the right thing and we can skip expanding it here in ceph.

Thanks

- Xiubo


- /* Now, round up the length to the next block */
-	rreq->len = roundup(rreq->len, lo->stripe_unit);
+	/*
+	 * Try to expand the length forward by rounding  up it to the next
                                                        ^^ an extra space

+	 * block, but do not exceed the file size, unless the original
+	 * request already exceeds it.
+	 */
+	new_end = min(round_up(end, lo->stripe_unit), rreq->i_size);
+	if (new_end > end && new_end <= rreq->start + max_len)
+		rreq->len = new_end - rreq->start;
+
+	/* Try to expand the start downward */
+	div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
+	if (rreq->len + blockoff <= max_len) {
+		rreq->start -= blockoff;
+		rreq->len += blockoff;
+	}
  }
static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)
--
2.40.0





[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux