With the advent of fast random IO devices (SSDs, PMEM) and in-memory swap devices such as zswap, it's possible for swap to be much faster than filesystems, and for swapping to be preferable over thrashing filesystem caches. Allow setting swappiness - which defines the relative IO cost of cache misses between page cache and swap-backed pages - to reflect such situations by making the swap-preferred range configurable. Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> --- Documentation/sysctl/vm.txt | 16 +++++++++++----- kernel/sysctl.c | 3 ++- mm/vmscan.c | 2 +- 3 files changed, 14 insertions(+), 7 deletions(-) diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 720355cbdf45..54030750cd31 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -771,14 +771,20 @@ with no ill effects: errors and warnings on these stats are suppressed.) swappiness -This control is used to define how aggressive the kernel will swap -memory pages. Higher values will increase agressiveness, lower values -decrease the amount of swap. A value of 0 instructs the kernel not to -initiate swap until the amount of free and file-backed pages is less -than the high water mark in a zone. +This control is used to define the relative IO cost of cache misses +between the swap device and the filesystem as a value between 0 and +200. At 100, the VM assumes equal IO cost and will thus apply memory +pressure to the page cache and swap-backed pages equally. At 0, the +kernel will not initiate swap until the amount of free and file-backed +pages is less than the high watermark in a zone. The default value is 60. +On non-rotational swap devices, a value of 100 (or higher, depending +on what's backing the filesystem) is recommended. + +For in-memory swap, like zswap, values closer to 200 are recommended. + ============================================================== - user_reserve_kbytes diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 2effd84d83e3..56a9243eb171 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -126,6 +126,7 @@ static int __maybe_unused two = 2; static int __maybe_unused four = 4; static unsigned long one_ul = 1; static int one_hundred = 100; +static int two_hundred = 200; static int one_thousand = 1000; #ifdef CONFIG_PRINTK static int ten_thousand = 10000; @@ -1323,7 +1324,7 @@ static struct ctl_table vm_table[] = { .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = &zero, - .extra2 = &one_hundred, + .extra2 = &two_hundred, }, #ifdef CONFIG_HUGETLB_PAGE { diff --git a/mm/vmscan.c b/mm/vmscan.c index c4a2f4512fca..f79010bbcdd4 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -136,7 +136,7 @@ struct scan_control { #endif /* - * From 0 .. 100. Higher means more swappy. + * From 0 .. 200. Higher means more swappy. */ int vm_swappiness = 60; /* -- 2.8.3 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>