> Hmm originally I was thinking of doing an (unconditional)
> lru_add_drain() outside of zswap_writeback_entry() - once in
> shrink_worker() and/or zswap_shrinker_scan(), before we write back any
> of the entries. Not sure if it would work/help here tho - haven't
> tested that idea yet.

The pages are allocated by __read_swap_cache_async() in
zswap_writeback_entry(), and each one must be newly allocated, not
already cached in swap. Please see the code below from
zswap_writeback_entry():

    page = __read_swap_cache_async(swpentry, GFP_KERNEL, mpol,
                    NO_INTERLEAVE_INDEX, &page_was_allocated);
    if (!page) {
        goto fail;
    }

    /* Found an existing page, we raced with load/swapin */
    if (!page_was_allocated) {
        put_page(page);
        ret = -EEXIST;
        goto fail;
    }

So by the time we reach SetPageReclaim(page), the page has only just
been allocated and is still in the percpu batch; it has not been added
to the LRU yet.

Therefore, an lru_add_drain() outside of zswap_writeback_entry() does
not work: it runs before the page is allocated, so it cannot move that
page onto the LRU.

> >
> > New test:
> > This patch will add the execution of folio_rotate_reclaimable (not executed
> > without this patch) and lru_add_drain, including percpu lock competition.
> > I bind a new task to allocate memory and use the same batch lock to compete
> > with the target process, on the same CPU.
> > context:
> > 1: stress --vm 1 --vm-bytes 1g (bind to cpu0)
> > 2: stress --vm 1 --vm-bytes 5g --vm-hang 0 (bind to cpu0)
> > 3: reclaim pages, and writeback 5G zswap_entry in cpu0 and node 0.
> >
> > Average time of five tests
> >
> > Base       patch      patch + compete
> > 4.947      5.0676     5.1336
> >            +2.4%      +3.7%
> > compete means: a new stress run in cpu0 to compete with the writeback process.
> > PID   USER  %CPU  %MEM  TIME+    COMMAND                   P
> > 1367  root  49.5  0.0   1:09.17  bash (writeback)          0
> > 1737  root  49.5  2.2   0:27.46  stress (use percpu lock)  0
> >
> > around 2.4% increase in real time, including the execution of
> > folio_rotate_reclaimable (not executed without this patch) and lru_add_drain, but
> > no lock contentions.
>
> Hmm looks like the regression is still there, no?

Yes, it cannot be eliminated, since the patch adds work that was not
executed before.

> >
> > around 1.3% additional increase in real time with lock contentions on the same
> > cpu.
> >
> > There is another option here, which is not to move the page to the
> > tail of the inactive list after end_writeback and delete the following
> > code in zswap_writeback_entry(), which did not work properly. But the
> > pages will not be released first.
> >
> > /* move it to the tail of the inactive list after end_writeback */
> > SetPageReclaim(page);
>
> Or only SetPageReclaim on pages on LRU?

No, all the pages are newly allocated and not on the LRU. This patch
should call lru_add_drain() directly and remove the if statement.

The purpose of writing back data is to release the page, so I think it
is necessary to fix it.

Thanks for your time, Nhat.
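
P.S. To make the ordering argument above concrete, here is a minimal userspace sketch in C. It is not kernel code: every name in it (model_page, model_batch, model_drain, model_writeback and so on) is hypothetical, and it assumes only what is described in this thread, namely that a newly allocated page sits in a percpu batch until a drain moves it to the LRU, and that the rotation at the end of writeback only takes effect for pages that are already on the LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BATCH_MAX 15 /* hypothetical batch capacity, for illustration only */

/* Hypothetical stand-in for a page; the fields model flags, not real kernel state. */
struct model_page {
    bool on_lru;   /* page has been moved from the batch to the LRU */
    bool reclaim;  /* models the flag set by SetPageReclaim() */
    bool rotated;  /* page was moved to the tail of the inactive list */
};

/* Hypothetical stand-in for the percpu batch of not-yet-LRU pages. */
struct model_batch {
    struct model_page *pages[MODEL_BATCH_MAX];
    size_t nr;
};

/* A newly allocated page is only queued in the batch; it is not on the LRU yet. */
static void model_batch_add(struct model_batch *b, struct model_page *p)
{
    assert(b->nr < MODEL_BATCH_MAX);
    b->pages[b->nr++] = p;
}

/* Models a drain: every page queued in the batch is moved to the LRU. */
static void model_drain(struct model_batch *b)
{
    for (size_t i = 0; i < b->nr; i++)
        b->pages[i]->on_lru = true;
    b->nr = 0;
}

/* Models end of writeback: rotation only happens for pages already on the LRU. */
static void model_end_writeback(struct model_page *p)
{
    if (p->reclaim) {
        p->reclaim = false;
        if (p->on_lru)
            p->rotated = true;
    }
}

/*
 * Models one writeback of one entry.
 * drain_outside: drain before the entry is processed (page not allocated yet).
 * drain_inside:  drain after allocation, right before the reclaim flag is set.
 * Returns true if the page ended up rotated to the tail of the inactive list.
 */
bool model_writeback(bool drain_outside, bool drain_inside)
{
    struct model_batch batch = { .nr = 0 };
    struct model_page page = { false, false, false };

    if (drain_outside)
        model_drain(&batch);        /* batch is still empty here */

    model_batch_add(&batch, &page); /* allocation happens inside writeback */

    if (drain_inside)
        model_drain(&batch);        /* page reaches the LRU */

    page.reclaim = true;            /* models SetPageReclaim(page) */
    model_end_writeback(&page);
    return page.rotated;
}
```

In this model, model_writeback(false, false) and model_writeback(true, false) both leave the page unrotated, and only model_writeback(false, true) rotates it, which is the same ordering constraint argued above for a drain placed outside zswap_writeback_entry().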