On Mon, 13 Mar 2023 20:45:21 +0800 Yin Fengwei <fengwei.yin@xxxxxxxxx> wrote:

> This series is trying to bring batched rmap removal to
> try_to_unmap_one().  Batched rmap removal is expected to perform
> better than removing the rmap one page at a time.
>
> This series restructures try_to_unmap_one() from:
>
>   loop:
>       clear and update PTE
>       unmap one page
>       goto loop
>
> to:
>
>   loop:
>       clear and update PTE
>       goto loop
>   unmap the range of folio in one call
>
> This is one step toward always mapping/unmapping the entire folio in
> one call, which can simplify the folio mapcount handling by avoiding
> per-page map/unmap accounting.
>
> ...
>
> To demonstrate the performance gain, MADV_PAGEOUT was changed not to
> split large folios for the page cache, and a micro-benchmark was
> created, mainly as follows:

Please remind me why it's necessary to patch the kernel to actually
performance-test this?  And why is it proving so hard to demonstrate
benefits in real-world workloads?

(Yes, this was touched on in earlier discussion, but I do think these
considerations should be spelled out in the [0/N] changelog.)

Thanks.
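For readers following along, the restructuring quoted above might be sketched roughly as below.  This is a simplified illustration, not the actual patch: the batched helper name folio_remove_rmap_range() and all signatures here are assumptions based on the series' stated goal, and per-page bookkeeping is elided.

```c
/*
 * Hedged sketch of the try_to_unmap_one() restructuring -- NOT the
 * real kernel code; names/signatures are illustrative assumptions.
 */

/* Before: rmap is removed one page at a time inside the PTE loop. */
while (page_vma_mapped_walk(&pvmw)) {
	/* clear and update the PTE for this page */
	pteval = ptep_clear_flush(vma, address, pvmw.pte);
	/* ... per-page dirty/swap bookkeeping ... */
	page_remove_rmap(subpage, vma, false);	/* one page per call */
}

/* After: the PTE loop only clears PTEs; rmap removal is batched. */
while (page_vma_mapped_walk(&pvmw)) {
	pteval = ptep_clear_flush(vma, address, pvmw.pte);
	/* ... bookkeeping only; no per-page rmap removal here ... */
}
/* remove rmap for the whole mapped range of the folio in one call */
folio_remove_rmap_range(folio, start_page, nr_pages, vma);
```

The intended win is that mapcount updates and related accounting happen once per folio range rather than once per page.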