On Tue, Jul 09, 2024 at 12:10:27PM -0700, Wengang Wang wrote: > Reading ahead take less lock on file compared to "unshare" the file via ioctl. > Do readahead when defrag sleeps for better defrag performace and thus more > file IO time. > > Signed-off-by: Wengang Wang <wen.gang.wang@xxxxxxxxxx> > --- > spaceman/defrag.c | 21 ++++++++++++++++++++- > 1 file changed, 20 insertions(+), 1 deletion(-) > > diff --git a/spaceman/defrag.c b/spaceman/defrag.c > index 415fe9c2..ab8508bb 100644 > --- a/spaceman/defrag.c > +++ b/spaceman/defrag.c > @@ -331,6 +331,18 @@ defrag_fs_limit_hit(int fd) > } > > static bool g_enable_first_ext_share = true; > +static bool g_readahead = false; > + > +static void defrag_readahead(int defrag_fd, off64_t offset, size_t count) > +{ > + if (!g_readahead || g_idle_time <= 0) > + return; > + > + if (readahead(defrag_fd, offset, count) < 0) { > + fprintf(stderr, "readahead failed: %s, errno=%d\n", > + strerror(errno), errno); This doesn't do what you think it does. readahead() only queues the first readahead chunk of the range given (a few pages at most). It does not cause readahead on the entire range, wait for page cache population, nor report IO errors that might have occurred during readahead. There's almost no value to making this syscall, especially if the app is about to trigger a sequential read for the whole range. Readahead will occur naturally during that read operation (i.e. the UNSHARE copy), and the read will return IO errors unlike readahead(). If you want the page cache pre-populated before the unshare operation is done, then you need to use mmap() and madvise(MADV_POPULATE_READ). This will read the whole region into the page cache as if it was a sequential read, wait for it to complete and return any IO errors that might have occurred during the read. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx