On Fri, Jun 14, 2024 at 5:06 AM Chengming Zhou <chengming.zhou@xxxxxxxxx> wrote:
>
> On 2024/6/14 18:07, Usama Arif wrote:
> > Approximately 10-20% of pages to be swapped out are zero pages [1].
> > Rather than reading/writing these pages to flash resulting
> > in increased I/O and flash wear, a bitmap can be used to mark these
> > pages as zero at write time, and the pages can be filled at
> > read time if the bit corresponding to the page is set.
> > With this patch, NVMe writes in Meta server fleet decreased
> > by almost 10% with conventional swap setup (zswap disabled).
> >
> > [1] https://lore.kernel.org/all/20171018104832epcms5p1b2232e2236258de3d03d1344dde9fce0@epcms5p1/
> >
> > Signed-off-by: Usama Arif <usamaarif642@xxxxxxxxx>
>
> Looks good to me, only some small nits below.
>
> Reviewed-by: Chengming Zhou <chengming.zhou@xxxxxxxxx>
>
> > ---
> >  include/linux/swap.h |   1 +
> >  mm/page_io.c         | 113 ++++++++++++++++++++++++++++++++++++++++++-
> >  mm/swapfile.c        |  15 ++++++
> >  3 files changed, 128 insertions(+), 1 deletion(-)
> >
> [...]
> > +
> > +static void swap_zeromap_folio_set(struct folio *folio)
> > +{
> > +     struct swap_info_struct *sis = swp_swap_info(folio->swap);
> > +     swp_entry_t entry;
> > +     unsigned int i;
> > +
> > +     for (i = 0; i < folio_nr_pages(folio); i++) {
> > +             entry = page_swap_entry(folio_page(folio, i));
>
> It seems simpler to use:
>
> swp_entry_t entry = folio->swap;
>
> for (i = 0; i < folio_nr_pages(folio); i++, entry.val++)

I was actually thinking we could introduce folio_swap_entry(folio, i)
after the series. Multiple callers of page_swap_entry() already have a
folio, so it would save some compound_head() calls.

Alternatively, for this patch we can introduce
zeromap_update_range(zeromap, offset, size, value). Then we can use it
in swap_zeromap_folio_set/clear() as well as swap_range_free(). It
would also be a good place to park the comment about using atomic
operations (set_bit() and clear_bit()).
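
To make the zeromap_update_range() idea concrete, here is a rough
userspace sketch, not kernel code and not what the patch does today.
The helper name and argument order follow the suggestion above;
everything else is an assumption for illustration: the C11 atomics
stand in for the kernel's set_bit()/clear_bit(), and
ZEROMAP_BITS_PER_WORD and zeromap_test() are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG (assumption). */
#define ZEROMAP_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: set (value == true) or clear (value == false)
 * `size` bits of the zeromap starting at bit `offset`.
 *
 * Each bit is updated with an atomic read-modify-write because
 * neighbouring swap slots share a word and may be updated
 * concurrently; a plain load/modify/store could lose one of the
 * updates. In the kernel this is the role of set_bit()/clear_bit().
 */
static void zeromap_update_range(_Atomic unsigned long *zeromap,
                                 unsigned long offset, unsigned long size,
                                 bool value)
{
    assert(zeromap != NULL);

    for (unsigned long i = offset; i < offset + size; i++) {
        _Atomic unsigned long *word = &zeromap[i / ZEROMAP_BITS_PER_WORD];
        unsigned long mask = 1UL << (i % ZEROMAP_BITS_PER_WORD);

        if (value)
            atomic_fetch_or(word, mask);
        else
            atomic_fetch_and(word, ~mask);
    }
}

/* Hypothetical read-side helper: is the bit for slot `offset` set? */
static bool zeromap_test(_Atomic unsigned long *zeromap, unsigned long offset)
{
    unsigned long word = atomic_load(&zeromap[offset / ZEROMAP_BITS_PER_WORD]);

    return (word >> (offset % ZEROMAP_BITS_PER_WORD)) & 1UL;
}
```

With something along these lines, swap_zeromap_folio_set/clear() would
each reduce to a single call with the folio's starting offset and
folio_nr_pages(folio), and swap_range_free() could clear its range the
same way.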