On Tue, Aug 06, 2019 at 12:51:49PM +0200, Michal Hocko wrote:
> On Tue 06-08-19 06:45:54, Joel Fernandes wrote:
> > On Tue, Aug 06, 2019 at 10:43:57AM +0200, Michal Hocko wrote:
> > > On Mon 05-08-19 13:04:50, Joel Fernandes (Google) wrote:
> > > > During idle tracking, we see that sometimes faulted anon pages are in
> > > > pagevec but are not drained to LRU. Idle tracking considers pages only
> > > > on LRU. Drain all CPU's LRU before starting idle tracking.
> > >
> > > Please expand on why does this matter enough to introduce a potentially
> > > expensive draining which has to schedule a work on each CPU and wait
> > > for them to finish.
> >
> > Sure, I can expand. I am able to find multiple issues involving this. One
> > issue looks like idle tracking is completely broken. It shows up in my
> > testing as if a page that is marked as idle is always "accessed" -- because
> > it was never marked as idle (due to not draining of pagevec).
> >
> > The other issue shows up as a failure in my "swap test", with the following
> > sequence:
> > 1. Allocate some pages
> > 2. Write to them
> > 3. Mark them as idle                                    <--- fails
> > 4. Introduce some memory pressure to induce swapping.
> > 5. Check the swap bit I introduced in this series.      <--- fails to set idle
> >                                                             bit in swap PTE.
> >
> > Draining the pagevec in advance fixes both of these issues.
>
> This belongs to the changelog.

Sure, will add.

> > This operation even if expensive is only done once during the access of the
> > page_idle file. Did you have a better fix in mind?
>
> Can we set the idle bit also for non-lru pages as long as they are
> reachable via pte?

Not at the moment with the current page idle tracking code. The PageLRU(page)
flag is checked in page_idle_get_page(). Even if we could set the idle bit for
non-LRU pages, the idle bit (a page flag) would not be cleared while the page
is off the LRU, because the page-reclaim code (page_referenced() I believe)
would not clear it. This whole mechanism depends on page-reclaim.

Or did I miss your point?

thanks,

 - Joel
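
P.S. To make the failure mode above concrete, here is a minimal userspace sketch of it. This is a toy model and not the kernel code: every toy_* name, the struct layouts and the batch size are assumptions made up for illustration. It only models the two facts stated in this thread, namely that a freshly faulted page can sit in a per-CPU pagevec instead of on the LRU, and that idle marking skips pages that are not on the LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGEVEC_SIZE 4	/* arbitrary batch size for the toy model */

struct toy_page {
	bool on_lru;	/* stands in for PageLRU(page) */
	bool idle;	/* stands in for the idle page flag */
};

struct toy_pagevec {
	size_t nr;
	struct toy_page *pages[TOY_PAGEVEC_SIZE];
};

/* Queue a freshly faulted page in the batch; it is not on the LRU yet. */
static bool toy_pagevec_add(struct toy_pagevec *pvec, struct toy_page *page)
{
	if (pvec->nr == TOY_PAGEVEC_SIZE)
		return false;
	page->on_lru = false;
	pvec->pages[pvec->nr++] = page;
	return true;
}

/* Drain the batch: every queued page moves to the LRU. */
static void toy_pagevec_drain(struct toy_pagevec *pvec)
{
	for (size_t i = 0; i < pvec->nr; i++)
		pvec->pages[i]->on_lru = true;
	pvec->nr = 0;
}

/* Idle marking only considers pages on the LRU, as described above. */
static bool toy_mark_idle(struct toy_page *page)
{
	if (!page->on_lru)
		return false;	/* silently skipped, the idle bit is never set */
	page->idle = true;
	return true;
}

/* Returns 0 if the toy model behaves as this thread describes. */
static int toy_demo(void)
{
	struct toy_pagevec pvec = { 0 };
	struct toy_page page = { false, false };

	toy_pagevec_add(&pvec, &page);
	if (toy_mark_idle(&page))	/* still in the pagevec: must be skipped */
		return 1;
	toy_pagevec_drain(&pvec);
	if (!toy_mark_idle(&page))	/* on the LRU now: must succeed */
		return 2;
	return page.idle ? 0 : 3;
}
```

Calling toy_demo() from a main() exercises both cases: marking before the drain is skipped, marking after the drain sets the idle bit. That ordering is the reason the patch drains before idle tracking starts.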