On Tue 21-11-23 17:06:24, Liu Shixin wrote:
> When spaces of swap devices are exhausted, only file pages can be
> reclaimed. But there are still some swapcache pages in anon lru list.
> This can lead to a premature out-of-memory.
>
> The problem is found with such step:
>
> Firstly, set a 9MB disk swap space, then create a cgroup with 10MB
> memory limit, then runs an program to allocates about 15MB memory.
>
> The problem occurs occasionally, which may need about 100 times [1].
>
> Fix it by checking number of swapcache pages in can_reclaim_anon_pages().
> If the number is not zero, return true and set swapcache_only to 1.
> When scan anon lru list in swapcache_only mode, non-swapcache pages will
> be skipped to isolate in order to accelerate reclaim efficiency.
>
> However, in swapcache_only mode, the scan count still increased when scan
> non-swapcache pages because there are large number of non-swapcache pages
> and rare swapcache pages in swapcache_only mode, and if the non-swapcache
> is skipped and do not count, the scan of pages in isolate_lru_folios() can
> eventually lead to hung task, just as Sachin reported [2].

I find this paragraph really confusing! I guess what you meant to say is
that a real swapcache_only mode is problematic because it can end up not
making any progress, correct?

AFAIU you have addressed that problem by making swapcache_only specific
to the anon LRU, right? That would certainly be more robust, as you can
still reclaim from the file LRUs. I cannot say I like it, though, because
swapcache_only is a bit confusing and I do not think we want to grow more
special-purpose reclaim types. Would it be possible/reasonable to put
swapcache pages on the file LRU instead?

-- 
Michal Hocko
SUSE Labs