On 2/4/20 1:53 PM, Matthew Wilcox wrote: > On Tue, Feb 04, 2020 at 01:42:43PM -0800, Mike Kravetz wrote: >> On 2/4/20 12:33 PM, David Rientjes wrote: >>> On Tue, 4 Feb 2020, Mike Kravetz wrote: >>> >>> Hmm, if khugepaged_adjust_min_free_kbytes() increases min_free_kbytes for >>> thp, then the user has no ability to override this increase by using >>> vm.min_free_kbytes? >>> >>> IIUC, with this change, it looks like memory hotplug events properly >>> increase min_free_kbytes for thp optimization but also doesn't respect a >>> previous user-defined value? >> >> Good catch. >> >> We should only call khugepaged_adjust_min_free_kbytes from the 'true' >> block of this if statement in init_per_zone_wmark_min. >> >> if (new_min_free_kbytes > user_min_free_kbytes) { >> min_free_kbytes = new_min_free_kbytes; >> if (min_free_kbytes < 128) >> min_free_kbytes = 128; >> if (min_free_kbytes > 65536) >> min_free_kbytes = 65536; >> } else { >> pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n", >> new_min_free_kbytes, user_min_free_kbytes); >> } >> >> In the existing code, a hotplug event will cause min_free_kbytes to overwrite >> the user defined value if the new value is greater. However, you will get >> the warning message if the user defined value is greater. I am not sure if >> this is the 'desired/expected' behavior? We print a warning if the user value >> takes precedence over our calculated value. However, we do not print a message >> if we overwrite the user defined value. That doesn't seem right! >> >>> So it looks like this is fixing an obvious correctness issue but also now >>> requires users to rewrite the sysctl if they want to decrease the min >>> watermark. >> >> Moving the call to khugepaged_adjust_min_free_kbytes as described above >> would avoid the THP adjustment unless we were going to overwrite the >> user defined value. Now, I am not sure overwriting the user defined value >> as is done today is actually the correct thing to do. >> >> Thoughts? >> Perhaps we should never overwrite a user defined value? > > We should certainly warn if we would have adjusted it, had they not > changed it! Ok, the code above does that today. > I'm reluctant to suggest we do a more complex adjustment of the value > (eg figure out what the adjustment would have been, then apply some > fraction of that adjustment to keep the ratios in proportion) because > we don't really know why they adjusted it. Agree! > OTOH, we should adjust it if the user-set min_free_kbytes is now too > large for the amount of memory now in the machine. Today, we never overwrite a user defined value that is larger than that calculated by the code. However, we will owerwrite a user defined value if the code calculates a larger value. I'm starting to think the best option is to NEVER overwrite a user defined value. -- Mike Kravetz