Patch "readahead: avoid multiple marked readahead pages" has been added to the 6.1-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    readahead: avoid multiple marked readahead pages

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     readahead-avoid-multiple-marked-readahead-pages.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 751dd31cb25b1fda2357852e790cffcc04bb0544
Author: Jan Kara <jack@xxxxxxx>
Date:   Thu Jan 4 09:58:39 2024 +0100

    readahead: avoid multiple marked readahead pages
    
    [ Upstream commit ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ]
    
    ra_alloc_folio() marks a page that should trigger next round of async
    readahead.  However it rounds up computed index to the order of page being
    allocated.  This can however lead to multiple consecutive pages being
    marked with readahead flag.  Consider situation with index == 1, mark ==
    1, order == 0.  We insert order 0 page at index 1 and mark it.  Then we
    bump order to 1, index to 2, mark (still == 1) is rounded up to 2 so page
    at index 2 is marked as well.  Then we bump order to 2, index is
    incremented to 4, mark gets rounded to 4 so page at index 4 is marked as
    well.  The fact that multiple pages get marked within a single readahead
    window confuses the readahead logic and results in readahead window being
    trimmed back to 1.  This situation is triggered in particular when maximum
    readahead window size is not a power of two (in the observed case it was
    768 KB) and as a result sequential read throughput suffers.
    
    Fix the problem by rounding 'mark' down instead of up.  Because the index
    is naturally aligned to 'order', we are guaranteed 'rounded mark' == index
    iff 'mark' is within the page we are allocating at 'index' and thus
    exactly one page is marked with readahead flag as required by the
    readahead code and sequential read performance is restored.
    
    This effectively reverts part of commit b9ff43dd2743 ("mm/readahead: Fix
    readahead with large folios").  The commit changed the rounding with the
    rationale:
    
    "...  we were setting the readahead flag on the folio which contains the
    last byte read from the block.  This is wrong because we will trigger
    readahead at the end of the read without waiting to see if a subsequent
    read is going to use the pages we just read."
    
    Although this is true, the fact is this was always the case with read
    sizes not aligned to folio boundaries and large folios in the page cache
    just make the situation more obvious (and frequent).  Also for sequential
    read workloads it is better to trigger the readahead earlier rather than
    later.  It is true that the difference in the rounding and thus earlier
    triggering of the readahead can result in reading more for semi-random
    workloads.  However workloads really suffering from this seem to be rare.
    In particular I have verified that the workload described in commit
    b9ff43dd2743 ("mm/readahead: Fix readahead with large folios") of reading
    random 100k blocks from a file like:
    
    [reader]
    bs=100k
    rw=randread
    numjobs=1
    size=64g
    runtime=60s
    
    is not impacted by the rounding change and achieves ~70MB/s in both cases.
    
    [jack@xxxxxxx: fix one more place where mark rounding was done as well]
      Link: https://lkml.kernel.org/r/20240123153254.5206-1-jack@xxxxxxx
    Link: https://lkml.kernel.org/r/20240104085839.21029-1-jack@xxxxxxx
    Fixes: b9ff43dd2743 ("mm/readahead: Fix readahead with large folios")
    Signed-off-by: Jan Kara <jack@xxxxxxx>
    Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
    Cc: Guo Xuenan <guoxuenan@xxxxxxxxxx>
    Cc: <stable@xxxxxxxxxxxxxxx>
    Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/mm/readahead.c b/mm/readahead.c
index ba43428043a35..e4b772bb70e68 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -483,7 +483,7 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 
 	if (!folio)
 		return -ENOMEM;
-	mark = round_up(mark, 1UL << order);
+	mark = round_down(mark, 1UL << order);
 	if (index == mark)
 		folio_set_readahead(folio);
 	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
@@ -591,7 +591,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 * It's the expected callback index, assume sequential access.
 	 * Ramp up sizes, and push forward the readahead window.
 	 */
-	expected = round_up(ra->start + ra->size - ra->async_size,
+	expected = round_down(ra->start + ra->size - ra->async_size,
 			1UL << order);
 	if (index == expected || index == (ra->start + ra->size)) {
 		ra->start += ra->size;




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux