On Mon, Apr 13, 2020 at 09:00:34PM +0800, Huang, Ying wrote:
> Andrea Righi <andrea.righi@xxxxxxxxxxxxx> writes:
>
> [snip]
>
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index ebed37bbf7a3..c71abc8df304 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/vmalloc.h>
> >  #include <linux/swap_slots.h>
> > +#include <linux/oom.h>
> >  #include <linux/huge_mm.h>
> >
> >  #include <asm/pgtable.h>
> > @@ -507,6 +508,14 @@ static unsigned long swapin_nr_pages(unsigned long offset)
> >  	max_pages = 1 << READ_ONCE(page_cluster);
> >  	if (max_pages <= 1)
> >  		return 1;
> > +	/*
> > +	 * If the current task is using too much memory or swapoff is
> > +	 * running, simply use the max readahead size. Since we likely
> > +	 * want to load a lot of pages back into memory, using a
> > +	 * fixed-size max readahead can give better performance in this
> > +	 * case.
> > +	 */
> > +	if (oom_task_origin(current))
> > +		return max_pages;
> >
> >  	hits = atomic_xchg(&swapin_readahead_hits, 0);
> >  	pages = __swapin_nr_pages(prev_offset, offset, hits, max_pages,
>
> Think about this again. If my understanding is correct, the access
> pattern during swapoff is sequential, so why doesn't swap readahead
> work? If so, can you root-cause that first?

Theoretically, if the pattern is sequential, the current heuristic
should already select a big readahead size, but apparently it's not
doing that. I'll repeat my tests, tracing the readahead size during
swapoff, to see exactly what's going on here.

Thanks,
-Andrea