The swap subsystem frees swap slots lazily, expecting that the page
will be swapped out again, so that an unnecessary write can be avoided.

But the problem with in-memory swap (e.g., zram) is that it keeps
consuming memory space until the vm_swap_full() condition (i.e., half
of all swap space is used) is met. This can be bad if we use multiple
swap devices (a small in-memory swap and a big storage swap) or
in-memory swap alone.

This patch makes the swap subsystem free the swap slot as soon as the
swap-read is completed and makes the swapcache page dirty, so the page
must be written out to the swap device again before it can be
reclaimed. It means we never lose its contents.

I tested this patch with a kernel compile workload.

1. before

compile time : 9882.42
zram max wasted space by fragmentation: 13471881 byte
memory space consumed by zram: 174227456 byte
the number of slot free notify: 206684

2. after

compile time : 9653.90
zram max wasted space by fragmentation: 11805932 byte
memory space consumed by zram: 154001408 byte
the number of slot free notify: 426972

* changelog from v1
  * Add more comments

Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Seth Jennings <sjenning@xxxxxxxxxxxxxxxxxx>
Cc: Nitin Gupta <ngupta@xxxxxxxxxx>
Cc: Konrad Rzeszutek Wilk <konrad@xxxxxxxxxx>
Cc: Shaohua Li <shli@xxxxxxxxxx>
Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
 mm/page_io.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/mm/page_io.c b/mm/page_io.c
index 50bb6ef..b09d40a 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -21,6 +21,7 @@
 #include <linux/writeback.h>
 #include <linux/frontswap.h>
 #include <linux/aio.h>
+#include <linux/blkdev.h>
 #include <asm/pgtable.h>
 
 static struct bio *get_swap_bio(gfp_t gfp_flags,
@@ -82,8 +83,42 @@ void end_swap_bio_read(struct bio *bio, int err, struct batch_complete *batch)
 				iminor(bio->bi_bdev->bd_inode),
 				(unsigned long long)bio->bi_sector);
 	} else {
+		struct swap_info_struct *sis;
+
 		SetPageUptodate(page);
+		sis = page_swap_info(page);
+		if (sis->flags & SWP_BLKDEV) {
+			/*
+			 * Swap subsystem does lazy swap slot free with
+			 * expecting the page would be swapped out again
+			 * so we can avoid unnecessary write if the page
+			 * isn't redirty.
+			 * It's good for real swap storage because we can
+			 * reduce unnecessary I/O and enhance wear-leveling
+			 * if you use SSD as swap device.
+			 * But if you use in-memory swap device(ex, zram),
+			 * it causes duplicated copy between uncompressed
+			 * data in VM-owned memory and compressed data in
+			 * zram-owned memory. So let's free zram-owned memory
+			 * and make the VM-owned decompressed page *dirty*
+			 * so the page should be swap out somewhere again if
+			 * we want to reclaim it, again.
+			 */
+			struct gendisk *disk = sis->bdev->bd_disk;
+			if (disk->fops->swap_slot_free_notify) {
+				swp_entry_t entry;
+				unsigned long offset;
+
+				entry.val = page_private(page);
+				offset = swp_offset(entry);
+
+				SetPageDirty(page);
+				disk->fops->swap_slot_free_notify(sis->bdev,
+						offset);
+			}
+		}
 	}
+
 	unlock_page(page);
 	bio_put(bio);
 }
-- 
1.8.2.1
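
For readers who want to poke at the idea outside the kernel, the decision made at the end of the swap-in read can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: every toy_* name below is hypothetical and made up for illustration, the structs are not the kernel's struct page or struct block_device, and there is no locking or real I/O. It shows only the rule the patch adds: if the device provides a slot-free hook, mark the page dirty first and then release the slot; otherwise keep the slot and leave the page clean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SLOTS 8

/* Hypothetical stand-in for a swap block device (not a kernel type). */
struct toy_swap_dev {
	bool slot_used[TOY_NR_SLOTS];
	/* Optional hook; assumed to be set only by in-memory devices. */
	void (*slot_free_notify)(struct toy_swap_dev *dev, size_t offset);
};

/* Hypothetical stand-in for a swapcache page. */
struct toy_page {
	size_t swap_offset;	/* slot the page was read from */
	bool uptodate;
	bool dirty;
};

/* What a zram-like device would do when told a slot is no longer needed. */
void toy_zram_slot_free(struct toy_swap_dev *dev, size_t offset)
{
	assert(offset < TOY_NR_SLOTS);
	dev->slot_used[offset] = false;
}

/* Models the successful branch at the end of a swap-in read. */
void toy_end_swap_read(struct toy_swap_dev *dev, struct toy_page *page)
{
	page->uptodate = true;
	if (dev->slot_free_notify) {
		/*
		 * The device copy is about to go away, so the page in
		 * memory becomes the only copy and must be marked dirty
		 * before the slot is released.
		 */
		page->dirty = true;
		dev->slot_free_notify(dev, page->swap_offset);
	}
}

/* Counts slots still held on the device. */
size_t toy_slots_in_use(const struct toy_swap_dev *dev)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_NR_SLOTS; i++)
		if (dev->slot_used[i])
			n++;
	return n;
}
```

Calling toy_end_swap_read() on a device whose hook is set leaves the page dirty and the slot free, while a device with a NULL hook keeps its slot and the page stays clean, which corresponds to the lazy behaviour that the commit message wants to keep for real storage.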