On 10/2/20 4:25 AM, Michal Hocko wrote: > On Wed 30-09-20 15:03:11, Mike Kravetz wrote: >> On 9/30/20 1:47 PM, Vijay Balakrishna wrote: >>> On 9/30/2020 11:20 AM, Mike Kravetz wrote: >>>> On 9/29/20 9:49 AM, Vijay Balakrishna wrote: >>>> >>>> Sorry for jumping in so late. Should we use this as an opportunity to >>>> also fix up the messages logged when (re)calculating mfk? They are wrong >>>> and could be quite confusing. >>> >>> >>> Sure. Please share your thoughts regarding appropriate message. Here is what I'm thinking >>> >>> pr_warn("min_free_kbytes is not updated to %d because current value %d is preferred\n", new_min_free_kbytes, min_free_kbytes); >>> >>> If above message is reasonable I can post a new revision (v6). >> >> Just considering the below example, >> >>>> For example consider the following sequence >>>> of operations and corresponding log messages produced. >>>> >>>> Freshly booted VM with 2 nodes and 8GB memory: >>>> # cat /proc/sys/vm/min_free_kbytes >>>> 90112 >>>> # echo 90000 > /proc/sys/vm/min_free_kbytes >>>> # cat /proc/sys/vm/min_free_kbytes >>>> 90000 >>>> # echo 0 > /sys/devices/system/node/node1/memory56/online >>>> [ 135.099947] Offlined Pages 32768 >>>> [ 135.102362] min_free_kbytes is not updated to 11241 because user defined value 90000 is preferred >> >> I am not sure if there is any value in printing the above line. Especially >> in this context as it becomes obsolete with the printing of the next line. > > The original intention was to make it explicit that auto-tuning is > influenced by the user provided configuration. > >>>> [ 135.109070] khugepaged: raising min_free_kbytes from 90000 to 90112 to help t >>>> ransparent hugepage allocations >> >> IMO, the above line is the only one that should be output as a result of the >> recalculation. > > Well, but khugepaged could be disabled and then the above might not get > printed. Sure the code could get reorganized and all that but is this > really worth that? > >> I guess that brings up the question of 'should we continue to track the user >> defined value if we overwrite it?". If we quit tracking it may help with the >> next message. > > Auto tuning and user provided override is quite tricky to get sensible. > Especially in the case here. Admin has provided an override but has the > potential memory hotplug been considered? Or to make it even more > complicated, consider that the hotplug happens without admin involvement > - e.g. memory gets hotremoved due to HW problems. Is the admin provided > value still meaningful? To be honest I do not have a good answer and I > am not sure we should care all that much until we see practical > problems. I am not insisting that this be cleaned up. The change in this patch to ensure THP related calculations are performed during hotplug is the most important. I became aware of the logging issues when looking at a customer issue with an older kernel. The min_free_kbytes setting was integral to the issue we were investigating, and it was unclear whether or not the customer had changed the value. I knew the system log should contain evidence of manually setting min_free_kbytes. However, there was no evidence in the log. Turns out the customer did not change the value, but it did cause me to do a deep dive into the logging code. -- Mike Kravetz