On Wed, May 10, 2017 at 08:13:12AM +0200, Michal Hocko wrote:
> On Wed 10-05-17 10:46:54, Minchan Kim wrote:
> > On Wed, May 03, 2017 at 08:00:44AM +0200, Michal Hocko wrote:
> [...]
> > > @@ -1486,6 +1486,12 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
> > >  			continue;
> > >  		}
> > >
> > > +		/*
> > > +		 * Do not count skipped pages because we do want to isolate
> > > +		 * some pages even when the LRU mostly contains ineligible
> > > +		 * pages
> > > +		 */
> >
> > How about adding comment about "why"?
> >
> > /*
> >  * Do not count skipped pages because it makes the function to return with
> >  * none isolated pages if the LRU mostly contains ineligible pages so that
> >  * VM cannot reclaim any pages and trigger premature OOM.
> >  */
>
> I am not sure this is necessarily any better. Mentioning a pre-mature
> OOM would require a much better explanation because a first immediate
> question would be "why don't we scan those pages at priority 0". Also
> decision about the OOM is at a different layer and it might change in
> future when this doesn't apply any more. But it is not like I would
> insist...
>
> > > +		scan++;
> > >  		switch (__isolate_lru_page(page, mode)) {
> > >  		case 0:
> > >  			nr_pages = hpage_nr_pages(page);
> >
> > Confirmed.
>
> Hmm. I can clearly see how we could skip over too many pages and hit
> small reclaim priorities too quickly but I am still scratching my head
> about how we could hit the OOM killer as a result. The amount of pages
> on the active anonymous list suggests that we are not able to rotate
> pages quickly enough. I have to keep thinking about that.

I explained it, but apparently not well enough. Let me try again.
The problem is that get_scan_count determines nr_to_scan from the
eligible zones only.

	size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
	size = size >> sc->priority;

Assume sc->priority is 0 and the LRU list is as follows.

	N-N-N-N-H-H-H-H-H-H-H-H-H-H-H-H-H-H-H-H

(I.e., a few eligible pages are at the head of the LRU and almost all of
the others are ineligible.)

In that case, size becomes 4, so the VM wants to scan 4 pages, but the
4 pages at the tail of the LRU are not eligible. If isolate_lru_pages
counts skipped pages, it uses up nr_to_scan on those 4 pages and returns
without ever reaching the remaining eligible pages, so nothing is
reclaimed.

If this helps to understand the problem, I will add it to the description.
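
For illustration only, the scenario above can be sketched as a small
user-space model. This is not kernel code: model_scan_count and
model_isolate are hypothetical names, the LRU is modelled as a string in
which 'N' is an eligible page and 'H' an ineligible one, and the scan is
assumed to walk from the tail (the end of the string).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical page markers for this model, not kernel definitions. */
#define PAGE_ELIGIBLE   'N'
#define PAGE_INELIGIBLE 'H'

/*
 * Rough model of the get_scan_count() calculation quoted above:
 * only eligible pages contribute to the size, which is then shifted
 * by the reclaim priority.
 */
unsigned long model_scan_count(const char *lru, int priority)
{
	unsigned long size = 0;
	const char *p;

	assert(lru != NULL);
	for (p = lru; *p; p++)
		if (*p == PAGE_ELIGIBLE)
			size++;
	return size >> priority;
}

/*
 * Rough model of the isolation loop: walk the list from the tail and
 * take eligible pages until the scan budget is used up. When
 * count_skipped is nonzero, ineligible pages also consume the budget
 * (the behaviour before the patch); otherwise they are skipped for free.
 * Returns the number of pages isolated.
 */
unsigned long model_isolate(const char *lru, unsigned long nr_to_scan,
			    int count_skipped)
{
	unsigned long scan = 0, nr_taken = 0;
	size_t i;

	assert(lru != NULL);
	i = strlen(lru);
	while (i > 0 && scan < nr_to_scan) {
		char page = lru[--i];

		if (page == PAGE_INELIGIBLE) {
			if (count_skipped)
				scan++;
			continue;
		}
		scan++;
		nr_taken++;
	}
	return nr_taken;
}
```

With the list "NNNNHHHHHHHHHHHHHHHH" and priority 0, model_scan_count
returns 4. model_isolate with count_skipped set then returns 0, because
the whole budget goes to the 4 ineligible pages at the tail; with
count_skipped cleared it returns 4.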