On Wed, Feb 26, 2020 at 07:39:42PM -0800, Andrew Morton wrote:
> On Thu, 20 Feb 2020 14:11:44 +0900 js1304@xxxxxxxxx wrote:
>
> > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >
> > Hello,
> >
> > This patchset implements workingset protection and detection on
> > the anonymous LRU list.
>
> The test robot measurement got my attention!
>
> http://lkml.kernel.org/r/20200227022905.GH6548@shao2-debian
>
> > * Changes on v2
> > - fix a critical bug that uses out of index lru list in
> > workingset_refault()
> > - fix a bug that reuses the rotate value for previous page
> >
> > * SUBJECT
> > workingset protection
> >
> > * PROBLEM
> > In current implementation, newly created or swap-in anonymous page is
> > started on the active list. Growing the active list results in rebalancing
> > active/inactive list so old pages on the active list are demoted to the
> > inactive list. Hence, hot page on the active list isn't protected at all.
> >
> > Following is an example of this situation.
> >
> > Assume that 50 hot pages on active list and system can contain total
> > 100 pages. Numbers denote the number of pages on active/inactive
> > list (active | inactive). (h) stands for hot pages and (uo) stands for
> > used-once pages.
> >
> > 1. 50 hot pages on active list
> > 50(h) | 0
> >
> > 2. workload: 50 newly created (used-once) pages
> > 50(uo) | 50(h)
> >
> > 3. workload: another 50 newly created (used-once) pages
> > 50(uo) | 50(uo), swap-out 50(h)
> >
> > As we can see, hot pages are swapped-out and it would cause swap-in later.
> >
> > * SOLUTION
> > Since this is what we want to avoid, this patchset implements workingset
> > protection. Like as the file LRU list, newly created or swap-in anonymous
> > page is started on the inactive list. Also, like as the file LRU list,
> > if enough reference happens, the page will be promoted. This simple
> > modification changes the above example as following.
>
> One wonders why on earth we weren't doing these things in the first
> place?
>
> > * SUBJECT
> > workingset detection
>
> It sounds like the above simple aging changes provide most of the
> improvement, and that the workingset changes are less beneficial and a
> bit more risky/speculative?
>
> If so, would it be best for us to concentrate on the aging changes
> first, let that settle in and spread out and then turn attention to the
> workingset changes?

Those two patches work well for some workloads (like the benchmark),
but not for others. The full patchset makes sure both types work well.

Specifically, the existing aging strategy for anon assumes that most
anon pages allocated are hot. That's why they all start active and we
then do second-chance with the small inactive LRU to filter out the
few cold ones to swap out. This is true for many common workloads.

The benchmark creates a larger-than-memory set of anon pages with a
flat access profile - to the VM a flood of one-off pages. Joonsoo's
first two patches allow the VM to usher those pages in and out of
memory very quickly, which explains the throughput boost. But it comes
at the cost of reducing space available to hot anon pages, which will
regress others.

Joonsoo's full patchset makes the VM support both types of workloads
well: by putting everything on the inactive list first, one-off pages
can move through the system without disturbing the hot pages. And by
supplementing the inactive list with non-resident information, he can
keep it tiny without the risk of one-off pages drowning out new hot
pages. He can retain today's level of active page protection and
detection, while allowing one-off pages to move through quickly.
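
For illustration only, the 100-page example from the cover letter can
be replayed with a toy two-list model. This is a rough sketch, not
kernel code: the ToyLRU class, its fixed 1:1 active/inactive split and
the "promote on the next reference" rule are assumptions made up for
the example, and the non-resident (refault) tracking from the second
half of the series is not modelled at all.

```python
from collections import deque


class ToyLRU:
    """Two-list LRU holding at most `capacity` pages (toy model, not kernel code)."""

    def __init__(self, capacity, new_pages_start_active):
        self.capacity = capacity
        self.active_max = capacity // 2  # assumed 1:1 active/inactive balance
        self.new_pages_start_active = new_pages_start_active
        self.active = deque()    # left end is the head (most recently added)
        self.inactive = deque()
        self.evicted = []        # pages that were "swapped out", in order

    def _rebalance(self):
        # Demote the oldest active pages once the active list exceeds its share.
        while len(self.active) > self.active_max:
            self.inactive.appendleft(self.active.pop())

    def _reclaim_one(self):
        # Reclaim from the inactive tail; refill it from active if it is empty.
        if not self.inactive:
            self.inactive.appendleft(self.active.pop())
        self.evicted.append(self.inactive.pop())

    def touch(self, page):
        if page in self.active:
            return
        if page in self.inactive:
            # A reference while on the inactive list promotes the page.
            self.inactive.remove(page)
            self.active.appendleft(page)
        else:
            # Fault: make room first, then place the page according to policy.
            if len(self.active) + len(self.inactive) >= self.capacity:
                self._reclaim_one()
            if self.new_pages_start_active:
                self.active.appendleft(page)
            else:
                self.inactive.appendleft(page)
        self._rebalance()


def replay(new_pages_start_active):
    """Return how many of the 50 hot pages survive 100 used-once pages."""
    lru = ToyLRU(capacity=100, new_pages_start_active=new_pages_start_active)
    hot = [("h", i) for i in range(50)]
    for _ in range(2):           # two references each, so hot pages end up active
        for page in hot:
            lru.touch(page)
    for i in range(100):         # two batches of 50 used-once pages
        lru.touch(("uo", i))
    resident = set(lru.active) | set(lru.inactive)
    return sum(page in resident for page in hot)


hot_left_old = replay(new_pages_start_active=True)    # current anon behaviour
hot_left_new = replay(new_pages_start_active=False)   # start on inactive list
print(hot_left_old, hot_left_new)
```

Under these assumptions the run prints "0 50": when new pages start
active, the used-once flood pushes all 50 hot pages out, and when new
pages start inactive, the hot pages stay resident while the used-once
pages cycle through the inactive list.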