If parallel faults occur, we can fail to allocate a hugepage, because
many threads dequeue a hugepage to handle a fault on the same address.
This leaves the reserved pool short for a brief moment, and as a result
a faulting thread that is guaranteed to have enough reserved hugepages
can get a SIGBUS signal.

To solve this problem, we already have a nice solution, that is, the
hugetlb_instantiation_mutex. It blocks other threads from entering the
fault handler. This solves the problem cleanly, but it introduces
performance degradation, because it serializes all fault handling.

Now, I try to remove the hugetlb_instantiation_mutex to get rid of the
performance degradation. A prerequisite is that other threads should
not get a SIGBUS if they are guaranteed to have enough reserved pages.

For this purpose, if we fail to allocate a new hugepage with
use_reserve set, we return just 0 instead of VM_FAULT_SIGBUS.
use_reserve indicates that this user is a legitimate one who is
guaranteed to have enough reserved pages. This prevents these threads
from getting a SIGBUS signal and makes them retry the fault handling.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a9ec69..909075b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2623,7 +2623,10 @@ retry_avoidcopy:
 				WARN_ON_ONCE(1);
 			}

-			ret = VM_FAULT_SIGBUS;
+			if (use_reserve)
+				ret = 0;
+			else
+				ret = VM_FAULT_SIGBUS;
 			goto out_lock;
 		}

@@ -2741,7 +2744,10 @@ retry:

 		page = alloc_huge_page(vma, address, use_reserve);
 		if (IS_ERR(page)) {
-			ret = VM_FAULT_SIGBUS;
+			if (use_reserve)
+				ret = 0;
+			else
+				ret = VM_FAULT_SIGBUS;
 			goto out;
 		}
 		clear_huge_page(page, address, pages_per_huge_page(h));
-- 
1.7.9.5
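
For illustration only, here is a minimal user-space sketch of the
retry behaviour described in the changelog. It is not kernel code and
not part of the patch. The names toy_pool, toy_fault, toy_access and
MAX_RETRIES are hypothetical, and VM_FAULT_SIGBUS is redefined locally
as a stand-in value. The sketch assumes that a racing thread holds the
last hugepage for a few rounds and then gives it back, and it relies on
the fact that a fault handler returning 0 without mapping the page
causes the access to fault again.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the kernel's fault code (value is illustrative). */
#define VM_FAULT_SIGBUS 0x0002
/* Hypothetical bound so that the model always terminates. */
#define MAX_RETRIES 8

/* Toy model of the hugepage pool seen by one faulting thread. */
struct toy_pool {
	int free_pages;   /* hugepages that can be dequeued right now */
	int busy_rounds;  /* failed attempts left before a racing thread
			   * returns the page it dequeued */
};

/*
 * Model of one pass through the fault handler.
 * Returns 0 on success or when the fault should simply be retried,
 * and VM_FAULT_SIGBUS when the caller has no reservation.
 */
static int toy_fault(struct toy_pool *pool, bool use_reserve, bool *mapped)
{
	if (pool->free_pages == 0) {
		/* Allocation failed; the racing thread makes progress. */
		if (pool->busy_rounds > 0 && --pool->busy_rounds == 0)
			pool->free_pages++;
		/* The change in this patch: reserved users retry. */
		return use_reserve ? 0 : VM_FAULT_SIGBUS;
	}
	pool->free_pages--;
	*mapped = true;
	return 0;
}

/* Model of the access being re-faulted until the page is mapped. */
static int toy_access(struct toy_pool *pool, bool use_reserve)
{
	bool mapped = false;

	assert(pool);
	for (int i = 0; i < MAX_RETRIES && !mapped; i++) {
		int ret = toy_fault(pool, use_reserve, &mapped);

		if (ret)
			return ret;	/* would deliver SIGBUS */
	}
	return mapped ? 0 : -1;	/* -1: gave up, model only */
}
```

In this model a thread with use_reserve set survives the temporary
shortage and maps the page once it is returned, while a thread without
a reservation still fails on the first unsuccessful allocation.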