On 2024/11/29 7:08, Barry Song wrote: > On Mon, Nov 25, 2024 at 2:19 PM chenridong <chenridong@xxxxxxxxxx> wrote: >> >> >> >> On 2024/11/18 12:21, Matthew Wilcox wrote: >>> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote: >>>> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: >>>>> >>>>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote: >>>>>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited >>>>>> and added to swap cache folio by folio. After adding to swap cache, >>>>>> it will submit io to writeback folio to swap, which is asynchronous. >>>>>> When shrink_page_list is finished, the isolated folios list will be >>>>>> moved back to the head of inactive lru. The inactive lru may just look >>>>>> like this, with 512 filioes have been move to the head of inactive lru. >>>>> >>>>> I was hoping that we'd be able to stop splitting the folio when adding >>>>> to the swap cache. Ideally. we'd add the whole 2MB and write it back >>>>> as a single unit. >>>> >>>> This is already the case: adding to the swapcache doesn’t require splitting >>>> THPs, but failing to allocate 2MB of contiguous swap slots will. >>> >>> Agreed we need to understand why this is happening. As I've said a few >>> times now, we need to stop requiring contiguity. Real filesystems don't >>> need the contiguity (they become less efficient, but they can scatter a >>> single 2MB folio to multiple places). >>> >>> Maybe Chris has a solution to this in the works? >>> >> >> Hi, Chris, do you have a better idea to solve this issue? > > Not Chris. As I read the code again, we have already the below code to fixup > the issue "missed folio_rotate_reclaimable()" in evict_folios(): > > /* retry folios that may have missed > folio_rotate_reclaimable() */ > list_move(&folio->lru, &clean); > > It doesn't work for you? > > commit 359a5e1416caaf9ce28396a65ed3e386cc5de663 > Author: Yu Zhao <yuzhao@xxxxxxxxxx> > Date: Tue Nov 15 18:38:07 2022 -0700 > mm: multi-gen LRU: retry folios written back while isolated > > The page reclaim isolates a batch of folios from the tail of one of the > LRU lists and works on those folios one by one. For a suitable > swap-backed folio, if the swap device is async, it queues that folio for > writeback. After the page reclaim finishes an entire batch, it puts back > the folios it queued for writeback to the head of the original LRU list. > > In the meantime, the page writeback flushes the queued folios also by > batches. Its batching logic is independent from that of the page reclaim. > For each of the folios it writes back, the page writeback calls > folio_rotate_reclaimable() which tries to rotate a folio to the tail. > > > folio_rotate_reclaimable() only works for a folio after the page reclaim > has put it back. If an async swap device is fast enough, the page > writeback can finish with that folio while the page reclaim is still > working on the rest of the batch containing it. In this case, that folio > will remain at the head and the page reclaim will not retry it before > reaching there. > > This patch adds a retry to evict_folios(). After evict_folios() has > finished an entire batch and before it puts back folios it cannot free > immediately, it retries those that may have missed the rotation. > Before this patch, ~60% of folios swapped to an Intel Optane missed > folio_rotate_reclaimable(). After this patch, ~99% of missed folios were > reclaimed upon retry. > > This problem affects relatively slow async swap devices like Samsung 980 > Pro much less and does not affect sync swap devices like zram or zswap at > all. > >> >> Best regards, >> Ridong > > Thanks > Barry Thank you for your reply, Barry. I found this issue with 5.10 version. I reproduced this issue with the next version, but the CONFIG_LRU_GEN_ENABLED kconfig is disabled. I tested again with CONFIG_LRU_GEN_ENABLED enabled, and this issue can be fixed. IIUC, the 359a5e1416caaf9ce28396a65ed3e386cc5de663 commit can only work when CONFIG_LRU_GEN_ENABLED is enabled, but this issue exists when CONFIG_LRU_GEN_ENABLED is disabled and it should be fixed. I read the code of commit 359a5e1416caaf9ce28396a65ed3e386cc5de663, it found folios that are missed to rotate in a more complicated way, but it makes it much clearer what is being done. Should I implement in Yu Zhao's way? Best regards, Ridong