On Sat, Mar 28, 2020 at 5:18 AM Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote: > > On Fri, Mar 27, 2020 at 11:10:40AM -0700, Yang Shi wrote: > > On Fri, Mar 27, 2020 at 10:06 AM Kirill A. Shutemov > > <kirill@xxxxxxxxxxxxx> wrote: > > > > > > __collapse_huge_page_isolate() may fail due to extra pin in the LRU add > > > pagevec. It's petty common for swapin case: we swap in pages just to > > > fail due to the extra pin. > > > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> > > > --- > > > mm/khugepaged.c | 8 ++++++++ > > > 1 file changed, 8 insertions(+) > > > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > > index 14d7afc90786..39e0994abeb8 100644 > > > --- a/mm/khugepaged.c > > > +++ b/mm/khugepaged.c > > > @@ -585,11 +585,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, > > > * The page must only be referenced by the scanned process > > > * and page swap cache. > > > */ > > > + if (page_count(page) != 1 + PageSwapCache(page)) { > > > + /* > > > + * Drain pagevec and retry just in case we can get rid > > > + * of the extra pin, like in swapin case. > > > + */ > > > + lru_add_drain(); > > > > This is definitely correct. > > > > I'm wondering if we need one more lru_add_drain() before PageLRU check > > in khugepaged_scan_pmd() or not? The page might be in lru cache then > > get skipped. This would improve the success rate. > > Could you elaborate on the scenario, I don't follow. I mean the below change: --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1195,6 +1195,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, goto out_unmap; } khugepaged_node_load[node]++; + if (!PageLRU(page)) + /* Flush the page out of lru cache */ + lru_add_drain(); if (!PageLRU(page)) { result = SCAN_PAGE_LRU; goto out_unmap; If the page is not on LRU we even can't reach collapse_huge_page(), right? > > > -- > Kirill A. Shutemov