Mike Kravetz <mike.kravetz@xxxxxxxxxx> writes: > Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly > allocated huge page. If a reservation was associated with the huge > page, alloc_huge_page() consumed the reservation while allocating. > When the newly allocated page is freed in free_huge_page(), it will > increment the global reservation count. However, the reservation entry > in the reserve map will remain. This is not an issue for shared > mappings as the entry in the reserve map indicates a reservation exists. > But, an entry in a private mapping reserve map indicates the reservation > was consumed and no longer exists. This results in an inconsistency > between the reserve map and the global reservation count. This 'leaks' > a reserved huge page. > > Create a new routine restore_reserve_on_error() to restore the reserve > entry in these specific error paths. This routine makes use of a new > function vma_add_reservation() which will add a reserve entry for a > specific address/page. > > In general, these error paths were rarely (if ever) taken on most > architectures. However, powerpc contained arch specific code that > that resulted in an extra fault and execution of these error paths > on all private mappings. > > Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private mappings) > > Cc: stable@xxxxxxxxxxxxxxx > Reported-by: Jan Stancek <jstancek@xxxxxxxxxx> > Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> > --- > mm/hugetlb.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 66 insertions(+) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index ec49d9e..418bf01 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1826,11 +1826,17 @@ static void return_unused_surplus_pages(struct hstate *h, > * is not the case is if a reserve map was changed between calls. It > * is the responsibility of the caller to notice the difference and > * take appropriate action. > + * > + * vma_add_reservation is used in error paths where a reservation must > + * be restored when a newly allocated huge page must be freed. It is > + * to be called after calling vma_needs_reservation to determine if a > + * reservation exists. > */ > enum vma_resv_mode { > VMA_NEEDS_RESV, > VMA_COMMIT_RESV, > VMA_END_RESV, > + VMA_ADD_RESV, > }; > static long __vma_reservation_common(struct hstate *h, > struct vm_area_struct *vma, unsigned long addr, > @@ -1856,6 +1862,14 @@ static long __vma_reservation_common(struct hstate *h, > region_abort(resv, idx, idx + 1); > ret = 0; > break; > + case VMA_ADD_RESV: > + if (vma->vm_flags & VM_MAYSHARE) > + ret = region_add(resv, idx, idx + 1); > + else { > + region_abort(resv, idx, idx + 1); > + ret = region_del(resv, idx, idx + 1); > + } It is confusing to find ADD_RESV doing region_del, but I don't have suggestion for a better name. -aneesh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>