Le 23/08/2024 à 21:04, Usama Arif a écrit :
Approximately 10-20% of pages to be swapped out are zero pages [1].
Rather than reading/writing these pages to flash resulting
in increased I/O and flash wear, a bitmap can be used to mark these
pages as zero at write time, and the pages can be filled at
read time if the bit corresponding to the page is set.
With this patch, NVMe writes in Meta server fleet decreased
by almost 10% with conventional swap setup (zswap disabled).
[1] https://lore.kernel.org/all/20171018104832epcms5p1b2232e2236258de3d03d1344dde9fce0@epcms5p1/
...
@@ -3428,6 +3444,17 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
goto bad_swap_unlock_inode;
}
+ /*
+ * Use kvmalloc_array instead of bitmap_zalloc as the allocation order might
+ * be above MAX_PAGE_ORDER incase of a large swap file.
+ */
+ zeromap = kvmalloc_array(BITS_TO_LONGS(maxpages), sizeof(long),
+ GFP_KERNEL | __GFP_ZERO);
Nitpick: kvcalloc() maybe, to be slightly less verbose?
+ if (!zeromap) {
+ error = -ENOMEM;
+ goto bad_swap_unlock_inode;
+ }
+
if (si->bdev && bdev_stable_writes(si->bdev))
si->flags |= SWP_STABLE_WRITES;
...
CJ