On Thurs, 12 Mar 2020, Michal Hocko wrote:
> On Wed 11-03-20 17:45:58, Ivan Teterevkov wrote:
> > This patch adds a couple of knobs:
> >
> > - The configuration option (CONFIG_VM_SWAPPINESS).
> > - The command line parameter (vm_swappiness).
> >
> > The default value is preserved, but now defined by CONFIG_VM_SWAPPINESS.
> >
> > Historically, the default swappiness is set to the well-known value
> > 60, and this works well for the majority of cases. The vm_swappiness
> > is also exposed as the kernel parameter that can be changed at runtime
> > too, e.g. with sysctl.
> >
> > This approach might not suit some configurations well, e.g.
> > systemd-based distros, where systemd is put in charge of the cgroup
> > controllers, including the memory one. In such cases, the default
> > swappiness 60 is copied across the cgroup subtrees early at startup,
> > when systemd is arranging the slices for its services, before the
> > sysctl.conf or tmpfiles.d/*.conf changes are applied.
> >
> > One could run a script to traverse the cgroup trees later and set the
> > desired memory.swappiness individually in each occurrence when the
> > runtime is set up, but this would require some amount of work to
> > implement properly. Instead, why not set the default swappiness as
> > early as possible?
>
> I have to say I am not a great fan of more tuning for swappiness, as this
> has been quite a poor tunable for many years already. It essentially does
> nothing in many cases because the reclaim process ignores the value in
> many cases (have a look at get_scan_count). I have seen quite some reports
> that setting a specific value for vmswappiness didn't make any change.
> The knob itself has terrible semantics to begin with, because there is no
> way to express "I really prefer to swap rather than reclaim page cache".
>
> This all makes me think that swappiness is a historical mistake that we
> should rather make obsolete than promote even further.

Absolutely agree, the semantics of vm_swappiness are perplexing. Moreover,
the same get_scan_count treats vm_swappiness and the cgroup
memory.swappiness differently; in particular, 0 disables the memcg swap.

Certainly, the patch adds some additional exposure to a parameter that is
not trivial to tackle, but it is already created with a magic number, which
is also confusing. Is there any harm done by the patch, considering the
already existing sysctl interface to that knob?

> --
> Michal Hocko
> SUSE Labs
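
For reference, the workaround mentioned in the patch description (a script
that traverses the cgroup trees and sets memory.swappiness in each
occurrence) could look roughly like the sketch below. It is only an
illustration: the function name set_memcg_swappiness is hypothetical, it
assumes a cgroup v1 memory hierarchy where every group exposes a
memory.swappiness file, and it has to run with enough privileges after
systemd has arranged its slices.

```python
import os


def set_memcg_swappiness(root, value, filename="memory.swappiness"):
    """Write `value` to every `filename` found under `root`.

    Returns the list of paths that were updated. `root` is assumed to be
    the mount point of a cgroup v1 memory hierarchy (hypothetical helper,
    not part of the patch under discussion).
    """
    updated = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename not in filenames:
            # Not every directory necessarily carries the knob; skip it.
            continue
        path = os.path.join(dirpath, filename)
        try:
            with open(path, "w") as f:
                f.write(f"{value}\n")
        except OSError:
            # The write may be rejected (permissions, invalid value);
            # leave that group untouched and carry on.
            continue
        updated.append(path)
    return updated
```

Calling set_memcg_swappiness("/sys/fs/cgroup/memory", 10) would be one way to
use it, assuming the memory controller is mounted at that conventional
location. Groups created after the script runs still inherit whatever their
parent has at that moment, so this stays a workaround rather than a
replacement for setting the default early.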