On Wed, Jan 29, 2020 at 02:39:11PM +0100, Michal Hocko wrote:
> On Wed 29-01-20 11:36:09, Chris Edwards wrote:
> > Following the idea that it's some interaction with the X server, I further noticed that switching out of X to a text virtual console makes the page-outs stop. Going back to the X VT, the page-outs resume.
> >
> > I've attached another set of vmstat logs for the following timeline:
> >
> > 1580290800 System is running Xorg, stress to limit memory, and dd to exercise the buffer cache, with constant page-outs
> > 1580290810 Switch to a text virtual console - page-outs stop
> > 1580290830 Switch back to X VT - page-outs resume
> >
> > I'm vaguely suspecting something to do with the way Xorg handles old-fashioned programs that do CPU-driven bitmap-based rendering, as my desktop does typically have a lot of these (urxvt instances, xosview, the Notion window manager itself) - maybe they cause some particular pattern of memory churn in the X server, and perhaps only with certain video drivers...could Xorg perhaps wrongly madvise() the kernel about certain memory? It seems notable that having glxgears should cause the page-outs to stop.
> >
> > However, even a minimal X session with a sakura or qterminal window seems to show some degree of needless page-outs with low memory and busy cache, though not as severe - however, it's difficult to avoid observer effects! There did seem to be a notable pattern of increasing swap utilisation when switching away from the X VT, and a drop in swap utilisation when switching back to X.
> >
> > Should I perhaps take it up with the Xorg people instead?
>
> It is quite possible that those GUI applications are over allocating and
> talking to Xorg people might give you some hints how to pursue debugging
> in that direction.
>
> From the MM kernel POV it is still very interesting to find out why the
> anonymous memory is evicted while there is a lot of clean page cache.
> I didn't get to look at your recent vmstat data though and will vanish
> on vacation during the weekend.

Chris mentioned in a previous email that he's seeing stalls in the i915
driver. It could well be their shrinker doing direct writepage calls on
the object pages, which would explain why changing the VM policy on
anon/file had no impact on what Chris is seeing.

The vmstat logs show pages moving around on the unevictable list, but
there are no mlocked pages. That would also match i915 driver activity:
they mark the shmem mapping unevictable to keep reclaim control over
those pages inside the shrinker rather than the VM.

Chris, could you trace the i915 shrinker?

Enable the shrinker trace point:

	# echo 1 >/sys/kernel/debug/tracing/events/i915/i915_gem_shrink/enable

Then watch for events while the swapping is occurring:

	# cat /sys/kernel/debug/tracing/trace
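
If it is easier to line the trace up with the vmstat timestamps, the two
commands above can be wrapped in a small helper. This is only a rough
sketch: the TRACEFS variable, the capture_shrink function name and its
duration argument are made up for illustration, and it assumes debugfs is
mounted at the usual place and that it is run as root. The tracing path
and event name are the ones given above.

```shell
# Sketch only: TRACEFS, capture_shrink and the duration argument are
# illustrative names, not something ftrace or i915 provides.
TRACEFS=${TRACEFS:-/sys/kernel/debug/tracing}
EVENT=events/i915/i915_gem_shrink/enable

capture_shrink() {
    secs=$1

    # Enable the shrinker trace point; give up if we cannot write to it.
    echo 1 > "$TRACEFS/$EVENT" || return 1

    # Record the wall-clock start so the output can be compared with
    # the epoch timestamps in the vmstat logs.
    echo "# capture started at $(date +%s)"

    # Let the workload run while the trace buffer fills.
    sleep "$secs"

    # Dump whatever was recorded, then switch the trace point off again.
    cat "$TRACEFS/trace"
    echo 0 > "$TRACEFS/$EVENT"
}
```

For example, "capture_shrink 30 > i915-shrink.txt" while the page-outs
are happening under X would leave about 30 seconds of events in the file.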