On Mon, Nov 27, 2023 at 8:05 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Yosry Ahmed <yosryahmed@xxxxxxxxxx> writes:
>
> > On Mon, Nov 27, 2023 at 7:21 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
> >>
> >> Yosry Ahmed <yosryahmed@xxxxxxxxxx> writes:
> >>
> >> > On Mon, Nov 27, 2023 at 1:32 PM Minchan Kim <minchan@xxxxxxxxxx> wrote:
> >> >>
> >> >> On Mon, Nov 27, 2023 at 12:22:59AM -0800, Chris Li wrote:
> >> >> > On Mon, Nov 27, 2023 at 12:14 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
> >> >> > > > I agree with Ying that anonymous pages typically have different page
> >> >> > > > access patterns than file pages, so we might want to treat them
> >> >> > > > differently to reclaim them effectively.
> >> >> > > > One random idea:
> >> >> > > > How about we put the anonymous page in a swap cache in a different LRU
> >> >> > > > than the rest of the anonymous pages. Then shrinking against those
> >> >> > > > pages in the swap cache would be more effective. Instead of having
> >> >> > > > [anon, file] LRU, now we have [anon not in swap cache, anon in swap
> >> >> > > > cache, file] LRU
> >> >> > >
> >> >> > > I don't think that it is necessary. The patch is only for a special use
> >> >> > > case. Where the swap device is used up while some pages are in swap
> >> >> > > cache. The patch will kill performance, but it is used to avoid OOM
> >> >> > > only, not to improve performance. Per my understanding, we will not use
> >> >> > > up swap device space in most cases. This may be true for ZRAM, but will
> >> >> > > we keep pages in swap cache for long when we use ZRAM?
> >> >> >
> >> >> > I ask the question regarding how many pages can be freed by this patch
> >> >> > in this email thread as well, but haven't got the answer from the
> >> >> > author yet. That is one important aspect to evaluate how valuable is
> >> >> > that patch.
> >> >>
> >> >> Exactly.
> >> >> Since swap cache has different life time with page cache, they
> >> >> would be usually dropped when pages are unmapped (unless they are shared
> >> >> with others but anon is usually exclusive private) so I wonder how much
> >> >> memory we can save.
> >> >
> >> > I think the point of this patch is not saving memory, but rather
> >> > avoiding an OOM condition that will happen if we have no swap space
> >> > left, but some pages left in the swap cache. Of course, the OOM
> >> > avoidance will come at the cost of extra work in reclaim to swap those
> >> > pages out.
> >> >
> >> > The only case where I think this might be harmful is if there's plenty
> >> > of pages to reclaim on the file LRU, and instead we opt to chase down
> >> > the few swap cache pages. So perhaps we can add a check to only set
> >> > sc->swapcache_only if the number of pages in the swap cache is more
> >> > than the number of pages on the file LRU or similar? Just make sure we
> >> > don't chase the swapcache pages down if there's plenty to scan on the
> >> > file LRU?
> >>
> >> The swap cache pages can be divided to 3 groups.
> >>
> >> - group 1: pages have been written out, at the tail of inactive LRU, but
> >>   not reclaimed yet.
> >>
> >> - group 2: pages have been written out, but were failed to be reclaimed
> >>   (e.g., were accessed before reclaiming)
> >>
> >> - group 3: pages have been swapped in, but were kept in swap cache. The
> >>   pages may be in active LRU.
> >>
> >> The main target of the original patch should be group 1. And the pages
> >> may be cheaper to reclaim than file pages.
> >>
> >> Group 2 are hard to be reclaimed if swap_count() isn't 0.
> >>
> >> Group 3 should be reclaimed in theory, but the overhead may be high.
> >> And we may need to reclaim the swap entries instead of pages if the pages
> >> are hot. But we can start to reclaim the swap entries before the swap
> >> space is run out.
> >>
> >> So, if we can count group 1, we may use that as indicator to scan anon
> >> pages. And we may add code to reclaim group 3 earlier.
> >>
> >
> > My point was not that reclaiming the pages in the swap cache is more
> > expensive than reclaiming the pages in the file LRU. In a lot of
> > cases, as you point out, the pages in the swap cache can just be
> > dropped, so they may be as cheap or cheaper to reclaim than the pages
> > in the file LRU.
> >
> > My point was that scanning the anon LRU when swap space is exhausted
> > to get to the pages in the swap cache may be much more expensive,
> > because there may be a lot of pages on the anon LRU that are not in
> > the swap cache, and hence are not reclaimable, unlike pages in the
> > file LRU, which should mostly be reclaimable.
> >
> > So what I am saying is that maybe we should not do the effort of
> > scanning the anon LRU in the swapcache_only case unless there aren't a
> > lot of pages to reclaim on the file LRU (relatively). For example, if
> > we have 100 pages in the swap cache out of 10000 pages in the anon
> > LRU, and there are 10000 pages in the file LRU, it's probably not
> > worth scanning the anon LRU.
>
> For group 1 pages, they are at the tail of the anon inactive LRU, so the
> scan overhead is low too. For example, if number of group 1 pages is
> 100, we just need to scan 100 pages to reclaim them. We can choose to
> stop scanning when the number of the non-group-1 pages reached some
> threshold.
>

We should still try to reclaim pages in groups 2 & 3 before OOMing
though. Maybe the motivation for this patch is group 1, but I don't see
why we should special-case them. Pages in groups 2 & 3 should be roughly
equally cheap to reclaim. They may have higher refault cost, but IIUC we
should still try to reclaim them before OOMing.
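
To make the two heuristics from this thread concrete, here is a rough
userspace sketch in C. It is not kernel code and not what the patch does:
struct reclaim_counts, want_swapcache_only() and scan_anon_tail() are
hypothetical names made up for illustration, and the counters are assumed
to be available somehow. Only sc->swapcache_only comes from the patch
under discussion. The first helper models the "don't chase swap cache
pages if the file LRU has plenty" check, the second models scanning from
the tail of the anon LRU and giving up after a threshold of pages that
are not in the swap cache.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the counters discussed in this thread. */
struct reclaim_counts {
	unsigned long nr_swapcache;	/* anon pages currently in the swap cache */
	unsigned long nr_file_lru;	/* pages on the file LRU */
};

/*
 * Assumed policy: only go after swap cache pages on the anon LRU when
 * there are more of them than there are pages on the file LRU.
 */
static bool want_swapcache_only(const struct reclaim_counts *c)
{
	return c->nr_swapcache > c->nr_file_lru;
}

/*
 * Walk a modelled anon LRU starting at the tail (index 0).
 * in_swapcache[i] says whether page i is in the swap cache.
 * Stop once max_misses pages that are not in the swap cache have
 * been seen. Returns the number of swap cache pages found.
 */
static unsigned long scan_anon_tail(const bool *in_swapcache, size_t nr_pages,
				    unsigned long max_misses)
{
	unsigned long found = 0, misses = 0;

	for (size_t i = 0; i < nr_pages; i++) {
		if (in_swapcache[i]) {
			found++;
			continue;
		}
		if (++misses >= max_misses)
			break;
	}
	return found;
}
```

With the numbers from the example above (100 swap cache pages, 10000
pages on the file LRU), want_swapcache_only() returns false. A miss
threshold like max_misses bounds the scan cost, but it also means pages
in groups 2 & 3 further up the LRU are not reached, which is the case I
am worried about before OOMing.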