On Thu, 20 Feb 2020 10:01:03 -0500 Joel Savitz <jsavitz@xxxxxxxxxx> wrote: > > Currently, the vm.min_free_kbytes sysctl value is capped at a hardcoded > 64M in init_per_zone_wmark_min (unless it is overridden by khugepaged > initialization). > > This value has not been modified since 2005, and enterprise-grade > systems now frequently have hundreds of GB of RAM and multiple 10, 40, > or even 100 GB NICs. We have seen page allocation failures on heavily > loaded systems related to NIC drivers. These issues were resolved by an > increase to vm.min_free_kbytes. > > This patch increases the hardcoded value by a factor of 4 as a temporary > solution. OK, better than nothing I guess. > Further work to make the calculation of vm.min_free_kbytes more > consistent throughout the kernel would be desirable. > > As an example of the inconsistency of the current method, this value is > recalculated by init_per_zone_wmark_min() in the case of memory hotplug > which will override the value set by set_recommended_min_free_kbytes() > called during khugepaged initialization even if khugepaged remains > enabled, however an on/off toggle of khugepaged will then recalculate > and set the value via set_recommended_min_free_kbytes(). > Yup, these hardcoded numbers are lame.