[patch added to the 3.12 stable tree] readahead: fix sequential read cache miss detection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Damien Ramonda <damien.ramonda@xxxxxxxxx>

This patch has been added to the 3.12 stable tree. If you have any
objections, please let us know.

===============

commit af248a0c67457e5c6d2bcf288f07b4b2ed064f1f upstream.

The kernel's readahead algorithm sometimes interprets random read
accesses as sequential and triggers unnecessary data prefecthing from
storage device (impacting random read average latency).

In order to identify sequential cache read misses, the readahead
algorithm intends to check whether offset - previous offset == 1
(trivial sequential reads) or offset - previous offset == 0 (sequential
reads not aligned on page boundary):

  if (offset - (ra->prev_pos >> PAGE_CACHE_SHIFT) <= 1UL)

The current offset is stored in the "offset" variable of type "pgoff_t"
(unsigned long), while previous offset is stored in "ra->prev_pos" of
type "loff_t" (long long).  Therefore, operands of the if statement are
implicitly converted to type long long.  Consequently, when previous
offset > current offset (which happens on random pattern), the if
condition is true and access is wrongly interpeted as sequential.  An
unnecessary data prefetching is triggered, impacting the average random
read latency.

Storing the previous offset value in a "pgoff_t" variable (unsigned
long) fixes the sequential read detection logic.

Signed-off-by: Damien Ramonda <damien.ramonda@xxxxxxxxx>
Reviewed-by: Fengguang Wu <fengguang.wu@xxxxxxxxx>
Acked-by: Pierre Tardy <pierre.tardy@xxxxxxxxx>
Acked-by: David Cohen <david.a.cohen@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
Signed-off-by: Jiri Slaby <jslaby@xxxxxxx>
---
 mm/readahead.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index e4ed04149785..5b637b593674 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -401,6 +401,7 @@ ondemand_readahead(struct address_space *mapping,
 		   unsigned long req_size)
 {
 	unsigned long max = max_sane_readahead(ra->ra_pages);
+	pgoff_t prev_offset;
 
 	/*
 	 * start of file
@@ -452,8 +453,11 @@ ondemand_readahead(struct address_space *mapping,
 
 	/*
 	 * sequential cache miss
+	 * trivial case: (offset - prev_offset) == 1
+	 * unaligned reads: (offset - prev_offset) == 0
 	 */
-	if (offset - (ra->prev_pos >> PAGE_CACHE_SHIFT) <= 1UL)
+	prev_offset = (unsigned long long)ra->prev_pos >> PAGE_CACHE_SHIFT;
+	if (offset - prev_offset <= 1UL)
 		goto initial_readahead;
 
 	/*
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]