On Fri, Feb 9, 2024 at 6:00 AM <chengming.zhou@xxxxxxxxx> wrote:
>
> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> All LRU move interfaces have a problem that it has no effect if the
> folio is isolated from LRU (in cpu batch or isolated by shrinker).
> Since it can't move/change folio LRU status when it's isolated, mostly
> just clear the folio flag and do nothing in this case.
>
> In our case, a written back and reclaimable folio won't be rotated to
> the tail of inactive list, since it's still in cpu lru_add batch. It
> may cause the delayed reclaim of this folio and evict other folios.
>
> This patch changes to queue the reclaimable folio to cpu rotate batch
> even when !folio_test_lru(), hoping it will likely be handled after
> the lru_add batch which will put folio on the LRU list first, so
> will be rotated to the tail successfully when handle rotate batch.
>
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>

I don't think the analysis is correct. IIRC, writeback from non-reclaim
paths doesn't require isolation, and the reclaim path doesn't use struct
folio_batch lru_add.

Did you see any performance improvements with this patch? In general,
this kind of patch should come with performance numbers to show that it
really helps (not just in theory).

My guess is that you are hitting this problem [1].

[1] https://lore.kernel.org/linux-mm/20221116013808.3995280-1-yuzhao@xxxxxxxxxx/