On Tue 16-02-21 03:07:13, Eiichi Tsukata wrote:
> Hugepages can be preallocated to avoid unpredictable allocation latency.
> If we run into 4k page shortage, the kernel can trigger OOM even though
> there were free hugepages. When OOM is triggered by user address page
> fault handler, we can use oom notifier to free hugepages in user space
> but if it's triggered by memory allocation for kernel, there is no way
> to synchronously handle it in user space.

Can you expand some more on what kind of problem you see? Hugetlb
pages are, by definition, a preallocated, unreclaimable and admin
controlled pool of pages. Under those conditions it is expected and
required that the sizing is done very carefully. Why is that a
problem in your particular setup/scenario?

If the sizing is really done properly and then a random process can
trigger OOM, then this can lead to malfunctioning of those workloads
which do depend on the hugetlb pool, right? So isn't this kind of a DoS
scenario?

> This patch introduces a new sysctl vm.sacrifice_hugepage_on_oom. If
> enabled, it first tries to free a hugepage if available before invoking
> the oom-killer. The default value is disabled not to change the current
> behavior.

Why is this interface not hugepage size aware? It is quite different to
release a GB huge page or a 2MB one. Or is it expected to release the
smallest one? To the implementation...

[...]
> +static int sacrifice_hugepage(void)
> +{
> +	int ret;
> +
> +	spin_lock(&hugetlb_lock);
> +	ret = free_pool_huge_page(&default_hstate, &node_states[N_MEMORY], 0);

... no, it is going to release the default huge page. This will be 2MB in
most cases, but this is not a given.

Unless I am mistaken, this will also free up reserved hugetlb pages. This
would mean that a page fault would SIGBUS, which is very likely not
something we want to do, right? You also want to use the oom nodemask
rather than a full one.
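
To make the last two points concrete, here is a minimal userspace model,
not kernel code. The struct pool_model type, its fields and sacrifice_one()
are hypothetical names invented for illustration, and the nodemask is
reduced to a plain bitmask. It only sketches the two checks asked for
above: leave reserved pages alone, and only take pages from nodes in the
OOM context's nodemask.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical model of one hugetlb pool; not the kernel's struct hstate. */
struct pool_model {
	unsigned long free_pages[MAX_NODES];	/* free huge pages per NUMA node */
	unsigned long free_total;		/* sum of free_pages[] */
	unsigned long resv_pages;		/* pages promised to existing mappings */
};

/*
 * Try to give up one free huge page from a node in allowed_nodes
 * (bit n set means node n may be used).
 * Returns true if a page was released.
 */
static bool sacrifice_one(struct pool_model *p, unsigned int allowed_nodes)
{
	/* Never dip into reserved pages: a later fault on them would SIGBUS. */
	if (p->free_total <= p->resv_pages)
		return false;

	for (size_t n = 0; n < MAX_NODES; n++) {
		if (!(allowed_nodes & (1u << n)))
			continue;	/* node is outside the OOM nodemask */
		if (p->free_pages[n] == 0)
			continue;	/* nothing free on this node */
		p->free_pages[n]--;
		p->free_total--;
		return true;
	}
	return false;
}
```

In this model a pool with three free pages and two reservations can give
up exactly one page, and only from an allowed node. Which page size to
release is deliberately left out, since that is the open interface
question.
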

Overall, I am not really happy about this feature even when the above is
fixed, but let's hear more about the actual problem first.
-- 
Michal Hocko
SUSE Labs