On Thu 12-03-20 12:54:19, Ivan Teterevkov wrote:
> On Thurs, 12 Mar 2020, Michal Hocko wrote:
>
> > On Wed 11-03-20 17:45:58, Ivan Teterevkov wrote:
> > > This patch adds a couple of knobs:
> > >
> > > - The configuration option (CONFIG_VM_SWAPPINESS).
> > > - The command line parameter (vm_swappiness).
> > >
> > > The default value is preserved, but now defined by CONFIG_VM_SWAPPINESS.
> > >
> > > Historically, the default swappiness is set to the well-known value
> > > 60, and this works well for the majority of cases. The vm_swappiness
> > > is also exposed as the kernel parameter that can be changed at runtime
> > > too, e.g. with sysctl.
> > >
> > > This approach might not suit some configurations well, e.g.
> > > systemd-based distros, where systemd is put in charge of the cgroup
> > > controllers, including the memory one. In such cases, the default
> > > swappiness 60 is copied across the cgroup subtrees early at startup,
> > > when systemd is arranging the slices for its services, before the
> > > sysctl.conf or tmpfiles.d/*.conf changes are applied.
> > >
> > > One could run a script to traverse the cgroup trees later and set the
> > > desired memory.swappiness individually in each occurrence when the
> > > runtime is set up, but this would require some amount of work to
> > > implement properly. Instead, why not set the default swappiness as
> > > early as possible?
> >
> > I have to say I am not a great fan of more tuning for swappiness as this
> > has been quite a poor tunable for many years already. It essentially does
> > nothing in many cases because the reclaim process ignores the value in
> > many cases (have a look at get_scan_count). I have seen quite some reports
> > that setting a specific value for vm_swappiness didn't make any change.
> > The knob itself has terrible semantics to begin with because there is no
> > way to express "I really prefer to swap rather than page cache reclaim".
> >
> > This all makes me think that swappiness is a historical mistake that we
> > should rather make obsolete than promote even further.
>
> Absolutely agree, the semantics of vm_swappiness are perplexing.
> Moreover, the same get_scan_count treats vm_swappiness and cgroups
> memory.swappiness differently, in particular, 0 disables the memcg swap.
>
> Certainly, the patch adds some additional exposure to a parameter that
> is not trivial to tackle but it's already getting created with a magic
> number which is also confusing. Is there any harm to be done by the patch
> considering the already existing sysctl interface to that knob?

Like any other config option/kernel parameter. It is adding to the
overall config space size problem and unless this is really needed I
would rather not make it worse.
-- 
Michal Hocko
SUSE Labs