On Fri, Dec 8, 2023 at 1:24 AM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> Yu Zhao <yuzhao@xxxxxxxxxx> wrote on Fri, Dec 8, 2023 at 14:14:
> >
> > Unmapped folios accessed through file descriptors can be
> > underprotected. Those folios are added to the oldest generation based
> > on:
> > 1. The fact that they are less costly to reclaim (no need to walk the
> >    rmap and flush the TLB) and have less impact on performance (don't
> >    cause major PFs and can be non-blocking if needed again).
> > 2. The observation that they are likely to be single-use. E.g., for
> >    client use cases like Android, its apps parse configuration files
> >    and store the data in heap (anon); for server use cases like MySQL,
> >    it reads from InnoDB files and holds the cached data for tables in
> >    buffer pools (anon).
> >
> > However, the oldest generation can be very short lived, and if so, it
> > doesn't provide the PID controller with enough time to respond to a
> > surge of refaults. (Note that the PID controller uses weighted
> > refaults and those from evicted generations only take a half of the
> > whole weight.) In other words, for a short lived generation, the
> > moving average smooths out the spike quickly.
> >
> > To fix the problem:
> > 1. For folios that are already on LRU, if they can be beyond the
> >    tracking range of tiers, i.e., five accesses through file
> >    descriptors, move them to the second oldest generation to give them
> >    more time to age. (Note that tiers are used by the PID controller
> >    to statistically determine whether folios accessed multiple times
> >    through file descriptors are worth protecting.)
> > 2. When adding unmapped folios to LRU, adjust the placement of them so
> >    that they are not too close to the tail. The effect of this is
> >    similar to the above.
> >
> > On Android, launching 55 apps sequentially:
> >                            Before     After      Change
> >   workingset_refault_anon  25641024   25598972   0%
> >   workingset_refault_file  115016834  106178438  -8%
>
> Hi Yu,
>
> Thank you for your amazing work on MGLRU.
>
> I believe this is similar to the issue I was trying to resolve previously:
> https://lwn.net/Articles/945266/
> The idea is to use refault distance to decide whether the page should be
> placed in the oldest generation or some other gen, which per my test
> worked very well, and we have been using refault distance for MGLRU in
> multiple workloads.
>
> There are a few issues left in my previous RFC series, like anon pages
> in MGLRU shouldn't be considered. I wanted to collect feedback or test
> cases, but unfortunately it didn't seem to get much attention
> upstream.
>
> I think both this patch and my previous series are for solving the
> file pages underprotected issue, and I did a quick test using this
> series. For the mongodb test, refault distance still seems to be a better
> solution (I'm not saying these two optimizations are mutually exclusive
> though, just that they do have some conflicts in implementation and solve
> a similar problem):
>
> Previous result:
> ==================================================================
> Execution Results after 905 seconds
> ------------------------------------------------------------------
>                   Executed        Time (µs)       Rate
>   STOCK_LEVEL     2542            27121571486.2   0.09 txn/s
> ------------------------------------------------------------------
>   TOTAL           2542            27121571486.2   0.09 txn/s
>
> This patch:
> ==================================================================
> Execution Results after 900 seconds
> ------------------------------------------------------------------
>                   Executed        Time (µs)       Rate
>   STOCK_LEVEL     1594            27061522574.4   0.06 txn/s
> ------------------------------------------------------------------
>   TOTAL           1594            27061522574.4   0.06 txn/s
>
> The unpatched version is always around 500.

Thanks for the test results!

> I think there are a few points here:
> - Refault distance makes use of page shadows so it can better
>   distinguish evicted pages with different access patterns (re-access
>   distance).
> - Throttled refault distance can help hold part of the workingset when
>   memory is too small to hold the whole workingset.
>
> So maybe part of this patch and bits of the previous series can be
> combined to work better on this issue. What do you think?

I'll try to find some time this week to look at your RFC. It'd be a
lot easier for me if you could share
1. your latest tree, preferably based on the mainline, and
2. your VM image containing the above test.
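
For reference, the refault-distance idea mentioned in the quoted text can be
sketched roughly as below. This is only a toy model in Python, not code from
this patch or from the RFC series: RefaultTracker, Shadow, workingset_size and
the "younger"/"oldest" return values are hypothetical names chosen for
illustration, and the clock here counts evictions only, which is a
simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Shadow:
    """Hypothetical stand-in for the shadow entry left behind on eviction."""
    eviction_clock: int


class RefaultTracker:
    """Toy model: decide placement of a refaulted page by refault distance."""

    def __init__(self, workingset_size):
        # Assumed threshold: how many evictions a page may "miss" and still
        # be treated as part of the workingset.
        self.workingset_size = workingset_size
        self.clock = 0      # advances once per eviction (simplification)
        self.shadows = {}   # page id -> Shadow

    def evict(self, page):
        # Record the clock value at eviction time in the shadow entry.
        self.clock += 1
        self.shadows[page] = Shadow(self.clock)

    def refault(self, page):
        # No shadow entry: nothing is known about the page, so treat it as
        # a first-time read and place it in the oldest generation.
        shadow = self.shadows.pop(page, None)
        if shadow is None:
            return "oldest"
        # Refault distance: evictions that happened while the page was out.
        distance = self.clock - shadow.eviction_clock
        return "younger" if distance <= self.workingset_size else "oldest"
```

With workingset_size=2 and pages 1 to 4 evicted in order, a refault of page 3
has distance 1 and is placed in a younger generation, while a refault of
page 1 has distance 3 and goes to the oldest generation.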