Minchan Kim <minchan@xxxxxxxxxx> writes: > Hi Huang, > > On Tue, Oct 31, 2017 at 01:32:32PM +0800, Huang, Ying wrote: >> Hi, Minchan, >> >> Minchan Kim <minchan@xxxxxxxxxx> writes: >> >> > Hi Huang, >> > >> > On Fri, Oct 27, 2017 at 01:53:27PM +0800, Huang, Ying wrote: >> >> From: Huang Ying <huang.ying.caritas@xxxxxxxxx> >> >> >> >> When a page fault occurs for a swap entry, the physical swap readahead >> >> (not the VMA base swap readahead) may readahead several swap entries >> >> after the fault swap entry. The readahead algorithm calculates some >> >> of the swap entries to readahead via increasing the offset of the >> >> fault swap entry without checking whether they are beyond the end of >> >> the swap device and it relys on the __swp_swapcount() and >> >> swapcache_prepare() to check it. Although __swp_swapcount() checks >> >> for the swap entry passed in, it will complain with the error message >> >> as follow for the expected invalid swap entry. This may make the end >> >> users confused. >> >> >> >> swap_info_get: Bad swap offset entry 0200f8a7 >> >> >> >> To fix the false error message, the swap entry checking is added in >> >> swap readahead to avoid to pass the out-bound swap entries and the >> >> swap entry reserved for the swap header to __swp_swapcount() and >> >> swapcache_prepare(). >> >> >> >> Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> >> >> Cc: Minchan Kim <minchan@xxxxxxxxxx> >> >> Cc: Michal Hocko <mhocko@xxxxxxxx> >> >> Cc: <stable@xxxxxxxxxxxxxxx> # 4.11-4.13 >> >> Reported-by: Christian Kujau <lists@xxxxxxxxxxxxxxx> >> >> Fixes: e8c26ab60598 ("mm/swap: skip readahead for unreferenced swap slots") >> >> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx> >> >> --- >> >> include/linux/swap.h | 1 + >> >> mm/swap_state.c | 6 ++++-- >> >> mm/swapfile.c | 21 +++++++++++++++++++++ >> >> 3 files changed, 26 insertions(+), 2 deletions(-) >> >> >> >> diff --git a/include/linux/swap.h b/include/linux/swap.h >> >> index 84255b3da7c1..43b4b821c805 100644 >> >> --- a/include/linux/swap.h >> >> +++ b/include/linux/swap.h >> >> @@ -476,6 +476,7 @@ extern int page_swapcount(struct page *); >> >> extern int __swap_count(struct swap_info_struct *si, swp_entry_t entry); >> >> extern int __swp_swapcount(swp_entry_t entry); >> >> extern int swp_swapcount(swp_entry_t entry); >> >> +extern bool swap_entry_check(swp_entry_t entry); >> >> extern struct swap_info_struct *page_swap_info(struct page *); >> >> extern struct swap_info_struct *swp_swap_info(swp_entry_t entry); >> >> extern bool reuse_swap_page(struct page *, int *); >> >> diff --git a/mm/swap_state.c b/mm/swap_state.c >> >> index 6c017ced11e6..7dd70e77058d 100644 >> >> --- a/mm/swap_state.c >> >> +++ b/mm/swap_state.c >> >> @@ -569,11 +569,13 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, >> >> /* Read a page_cluster sized and aligned cluster around offset. */ >> >> start_offset = offset & ~mask; >> >> end_offset = offset | mask; >> >> - if (!start_offset) /* First page is swap header. */ >> >> - start_offset++; >> >> >> >> blk_start_plug(&plug); >> >> for (offset = start_offset; offset <= end_offset ; offset++) { >> >> + swp_entry_t ent = swp_entry(swp_type(entry), offset); >> >> + >> >> + if (!swap_entry_check(ent)) >> >> + continue; >> >> /* Ok, do the async read-ahead now */ >> >> page = __read_swap_cache_async( >> >> swp_entry(swp_type(entry), offset), >> >> diff --git a/mm/swapfile.c b/mm/swapfile.c >> >> index 3074b02eaa09..b04cec29c234 100644 >> >> --- a/mm/swapfile.c >> >> +++ b/mm/swapfile.c >> >> @@ -1107,6 +1107,27 @@ static struct swap_info_struct *swap_info_get_cont(swp_entry_t entry, >> >> return p; >> >> } >> >> >> >> +bool swap_entry_check(swp_entry_t entry) >> >> +{ >> >> + struct swap_info_struct *p; >> >> + unsigned long offset, type; >> >> + >> >> + type = swp_type(entry); >> >> + if (type >= nr_swapfiles) >> >> + goto bad_file; >> >> + p = swap_info[type]; >> >> + offset = swp_offset(entry); >> >> + if (unlikely(!offset || offset >= p->max)) >> >> + goto out; >> >> + >> >> + return true; >> >> + >> >> +bad_file: >> >> + pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val); >> >> +out: >> >> + return false; >> >> +} >> >> + >> >> static unsigned char __swap_entry_free(struct swap_info_struct *p, >> >> swp_entry_t entry, unsigned char usage) >> >> { >> >> -- >> >> 2.14.2 >> > >> > Although it's better than old, we can make it simple, still. >> > >> > diff --git a/include/linux/swapops.h b/include/linux/swapops.h >> > index 291c4b534658..f50d5a48f03a 100644 >> > --- a/include/linux/swapops.h >> > +++ b/include/linux/swapops.h >> > @@ -41,6 +41,13 @@ static inline unsigned swp_type(swp_entry_t entry) >> > return (entry.val >> SWP_TYPE_SHIFT(entry)); >> > } >> > >> > +extern struct swap_info_struct *swap_info[]; >> > + >> > +static inline struct swap_info_struct *swp_si(swp_entry_t entry) >> > +{ >> > + return swap_info[swp_type(entry)]; >> > +} >> > + >> >> Why not just use swp_swap_info()? > > Yeap, I forgot the one that even I added the function. :( > >> >> > /* >> > * Extract the `offset' field from a swp_entry_t. The swp_entry_t is in >> > * arch-independent format >> > diff --git a/mm/swap_state.c b/mm/swap_state.c >> > index 378262d3a197..a0fe2d54ad09 100644 >> > --- a/mm/swap_state.c >> > +++ b/mm/swap_state.c >> > @@ -554,6 +554,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, >> > struct vm_area_struct *vma, unsigned long addr) >> > { >> > struct page *page; >> > + struct swap_info_struct *si = swp_si(entry); >> > unsigned long entry_offset = swp_offset(entry); >> > unsigned long offset = entry_offset; >> > unsigned long start_offset, end_offset; >> > @@ -572,6 +573,9 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, >> > if (!start_offset) /* First page is swap header. */ >> > start_offset++; >> > >> > + if (end_offset >= si->max) >> > + end_offset = si->max - 1; >> > + >> >> I don't think two implementation has much difference. Both look OK for >> me. > > The point is *readbility*. We always try to make code simple/understandable. > Even, it's more important for stable candidate. > > We don't need to check the validation in loop so reviewer don't need to > take care of it once it passes boundary check in above logic. > > As you say it's okay, please resend it. OK. Best Regards, Huang, Ying