Hi Barry,

Sorry for the slow response.

On Fri, Oct 18, 2024 at 06:12:01PM +1300, Barry Song wrote:
> On Fri, Oct 18, 2024 at 6:58 AM Minchan Kim <minchan@xxxxxxxxxx> wrote:
> >
> > On Thu, Oct 17, 2024 at 06:59:09PM +1300, Barry Song wrote:
> > > On Thu, Oct 17, 2024 at 11:58 AM Andrew Morton
> > > <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, 16 Oct 2024 16:30:30 +1300 Barry Song <21cnbao@xxxxxxxxx> wrote:
> > > >
> > > > > To address this, this patch proposes maintaining a separate list
> > > > > for lazyfree anon folios while keeping them classified under the
> > > > > "file" LRU type to minimize code changes.
> > > >
> > > > Thanks. I'll await input from other MGLRU developers before adding
> > > > this for testing.
> > >
> > > Thanks!
> > >
> > > Hi Minchan, Yu,
> > >
> > > Any comments? I understand that Minchan may have a broader plan
> > > to "enable the system to maintain a quickly reclaimable memory
> > > pool and provide a knob for admins to control its size." While I
> > > have no objection to that plan, I believe improving MADV_FREE
> > > performance is a more urgent priority and a low-hanging fruit at this
> > > stage.
> >
> > Hi Barry,
> >
> > I have no idea why my email didn't send well before. I sent following
> > reply on Sep 24. Hope it works this time.
>
> Hi Minchan,
>
> I guess not. Your *this* email ended up in my spam folder of gmail, and
> my oppo.com account still hasn’t received it. Any idea why?

In the end, that's my problem, and I don't know when it can be fixed.
Anyway, I hope it works this time.

>
> >
> > ====== &< ======
> >
> > My proposal involves the following:
> >
> > 1. Introduce an "easily reclaimable" LRU list. This list would hold pages
> > that can be quickly freed without significant overhead.
>
> I assume you plan to keep both lazyfree anon pages and 'reclaimed'
> file folios (reclaimed in the normal LRU lists but still in the
> easily-reclaimable list) in this 'easily reclaimable' LRU list.
> However, I'm not sure this will work, as this patch aims to help reclaim
> lazyfree anon pages before file folios to reduce both file and anon
> refaults. If we place 'reclaimed' file folios and lazyfree anon folios in
> the same list, we may need to revisit how to reclaim lazyfree anon folios
> before reclaiming the 'reclaimed' file folios.

For those reclaimed folios the *decision was already made*; they just
couldn't be freed due to an *implementation issue*. So, as long as there
has been no access since then, they are stronger candidates for reclaim
than the other candidates.

> >
> > 2. Implement a parameter to control the size of this list. This allows for
> > system tuning based on available memory and performance requirements.
>
> If we include only 'reclaimed' file folios in this 'easily reclaimable'
> LRU list, the parameter makes sense. However, if we also add lazyfree
> folios to the list, the parameter becomes less meaningful since we can't
> predict how many lazyfree anon folios user space might have. I still feel
> lazyfree anon folios are different with "reclaimed" file folios (I mean
> reclaimed from normal lists but still in 'easily-reclaimable' list).

I thought the ez-reclaimable LRU doesn't need to be accurate, since we can
put other folios into it later (e.g., fadvise_dontneed folios that couldn't
be dropped at that time).

> >
> > 3. Modify kswapd behavior to utilize this list. When kswapd is awakened due
> > to memory pressure, it should attempt to drop those pages first to refill
> > free pages up to the high watermark by first reclaiming.
> >
> > 4. Before kswapd goes to sleep, it should scan the tail of the LRU list and
> > move cold pages to the easily reclaimable list, unmapping them from the
> > page table.
> >
> > 5. Whenever page cache hit, move the page into evictable LRU.
> >
> > This approach allows the system to maintain a pool of readily available
> > memory, mitigating the "aging" problem.
> > The trade-off is the potential for
> > minor page faults and LRU movement overheads if these pages in ez_reclaimable
> > LRU are accessed again.
>
> I believe you're aware of an implementation from Samsung that uses
> cleancache. Although it was dropped from the mainline kernel, it still
> exists in the Android kernel. Samsung's rbincache, based on cleancache,
> maintains a reserved memory region for holding reclaimed file folios.
> Instead of LRU movement, rbincache uses memcpy to transfer data between
> the pool and the page cache.
>
> >
> > Furthermore, we could put some asynchronous writeback pages (e.g., swap
> > out or writeback the fs pages) into the list, too.
> > Currently, what we are doing is rotate those pages back to head of LRU
> > and once writeback is done, move the page to the tail of LRU again.
> > We can simply put the page into ez_reclaimable LRU without rotating
> > back and forth.
>
> If this is about establishing a pool of easily reclaimable file folios, I
> fully support the idea and am eager to try it, especially for Android,
> where there are certainly strong use cases. However, I suspect it may
> be controversial and could take months to gain acceptance. Therefore,
> I’d prefer we first focus on landing a smaller change to address the
> madv_free performance issue and treat that idea as a separate
> incremental patch set.

I don't want to block the improvement, Barry.

The reason I suggested another LRU was actually to prevent divergence
between MGLRU and split-LRU, and to get the same behavior from both by
introducing the additional logic in a central place.

I don't think it is desirable for a userspace hint to get a different
priority depending on admin config. Personally, I believe it would be
better to introduce a knob that changes MADV_FREE's behavior for both LRU
algorithms at the same time instead of only one, even though we will see
the LRU inversion issue.
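
For illustration only, steps 1-5 of the quoted proposal can be modeled as a
small userspace toy. This is a rough sketch, not kernel code: the class, the
method names, and the `ez_target` knob are all hypothetical, pages are plain
strings, and "freeing" just appends to a list. It only shows the intended
ordering: kswapd drains the ez_reclaimable list before touching the normal
LRU, and a hit on an ez_reclaimable page moves it back.

```python
from collections import OrderedDict


class ToyReclaim:
    """Userspace toy model of the quoted proposal; every name is hypothetical."""

    def __init__(self, ez_target):
        self.ez_target = ez_target   # step 2: knob controlling the ez list size
        self.lru = OrderedDict()     # normal LRU; first key is the coldest page
        self.ez = OrderedDict()      # step 1: the "easily reclaimable" list
        self.freed = []              # pages that were dropped, in order

    def add(self, page):
        self.lru[page] = True        # new pages start at the hot end

    def access(self, page):
        if page in self.ez:          # step 5: a hit moves the page back
            del self.ez[page]
            self.lru[page] = True
        elif page in self.lru:
            self.lru.move_to_end(page)

    def kswapd_wake(self, need):
        """Step 3: drop ez pages first, then fall back to the LRU tail."""
        dropped = 0
        for pool in (self.ez, self.lru):
            while dropped < need and pool:
                page, _ = pool.popitem(last=False)
                self.freed.append(page)
                dropped += 1
        return dropped

    def kswapd_before_sleep(self):
        """Step 4: move cold pages from the LRU tail into the ez list."""
        while len(self.ez) < self.ez_target and self.lru:
            page, _ = self.lru.popitem(last=False)
            self.ez[page] = True


m = ToyReclaim(ez_target=2)
for p in "abcd":
    m.add(p)
m.kswapd_before_sleep()   # the two coldest pages, a and b, move to the ez list
m.access("a")             # a is hit again, so it returns to the normal LRU
m.kswapd_wake(2)          # frees b from the ez list, then c from the LRU tail
print(m.freed)            # ['b', 'c']
```

The toy ignores lazyfree anon folios entirely; how they would be ordered
against 'reclaimed' file folios in the same list is exactly the open question
in this thread.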
>
> My current patch specifically targets the issue of reclaiming lazyfree
> anon folios before reclaiming file folios. It appears your proposal is
> independent (though related) work, and I don't believe it should delay
> resolving the madv_free issue. Additionally, that pool doesn’t effectively
> address the reclamation priority between files and lazyfree anon folios.
>
> In conclusion:
>
> 1. I agree that the pool is valuable, and I’d like to develop it as an
> incremental patch set. However, this is a significant step that will
> require considerable time.
> 2. It could be quite tricky to include both lazyfree anon folios and
> reclaimed file folios (which are reclaimed in normal lists but not in
> the 'easily-reclaimable' list) in the same LRU list. I’d prefer to
> start by replacing Samsung's rbincache to reduce file folio I/O if we
> decide to implement the pool.
> 3. I believe we should first focus on landing this fix patch for the
> madv_free performance issue.
>
> What are your thoughts? I spoke with Yu, and he would like to hear
> your opinion.

Sure, I don't want to block any improvement, but please think about my
concern once more, and just go with your idea if nobody except me is
concerned about it.

Thank you.