On Fri, Oct 20, 2017 at 10:49:46AM -0700, Mike Kravetz wrote:
> On 10/19/2017 07:30 PM, Naoya Horiguchi wrote:
> > On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
> >
> > Thank you for addressing this. The patch itself looks good to me, but
> > the reported issue (negative reserve count) doesn't reproduce in my trial
> > with v4.14-rc5, so could you share the exact procedure for this issue?
>
> Sure, but first one question on your test scenario below.
>
> > When the error handler runs over a huge page, the reserve count is
> > incremented, so I'm not sure why the reserve count goes negative.
>
> I'm not sure I follow. What specific code is incrementing the reserve
> count?

The call path is like below:

  hugetlbfs_error_remove_page
    hugetlb_fix_reserve_counts
      hugepage_subpool_get_pages(spool, 1)
        hugetlb_acct_memory(h, 1)
          gather_surplus_pages
            h->resv_huge_pages += delta;

> > My operation is like below:
> >
> >   $ sysctl vm.nr_hugepages=10
> >   $ grep HugePages_ /proc/meminfo
> >   HugePages_Total:      10
> >   HugePages_Free:       10
> >   HugePages_Rsvd:        0
> >   HugePages_Surp:        0
> >   $ ./test_alloc_generic -B hugetlb_file -N1 -L "mmap access memory_error_injection:error_type=madv_hard"
> >       // allocate a 2MB file on hugetlbfs, then madvise(MADV_HWPOISON) on it
> >   $ grep HugePages_ /proc/meminfo
> >   HugePages_Total:      10
> >   HugePages_Free:        9
> >   HugePages_Rsvd:        1    // reserve count is incremented
> >   HugePages_Surp:        0
>
> This is confusing to me. I cannot create a test where there is a reserve
> count after poisoning a page.
>
> I tried to recreate your test. Running unmodified 4.14.0-rc5.
>
> Before test
> -----------
> HugePages_Total:       1
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> After open(creat) and mmap of 2MB hugetlbfs file
> ------------------------------------------------
> HugePages_Total:       1
> HugePages_Free:        1
> HugePages_Rsvd:        1
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> Reserve count is 1 as expected/normal.
>
> After madvise(MADV_HWPOISON) of the single huge page in mapping/file
> --------------------------------------------------------------------
> HugePages_Total:       1
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> In this case, the reserve (and free) counts were decremented. Note that
> before the poison operation the page was not associated with the mapping/
> file. I did not look closely at the code, but assume the madvise may
> cause the page to be 'faulted in'.
>
> The counts remain the same when the program exits
> -------------------------------------------------
> HugePages_Total:       1
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> Remove the file (rm /var/opt/oracle/hugepool/foo)
> -------------------------------------------------
> HugePages_Total:       1
> HugePages_Free:        0
> HugePages_Rsvd:    18446744073709551615
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> I am still confused about how your test maintains a reserve count after
> poisoning. It may be a good idea for you to test my patch with your
> test scenario, as I cannot recreate it here.

Interestingly, I found that this reproduces if all hugetlb pages are
reserved when poisoning. Your testing meets the condition, and mine
doesn't.

In gather_surplus_pages() we determine whether to extend the hugetlb pool
with surplus pages like below:

        needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
        if (needed <= 0) {
                h->resv_huge_pages += delta;
                return 0;
        }
        ...
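
To make the arithmetic concrete, here is a minimal userspace sketch (an
illustration only, not kernel code; the variable names mirror the struct
hstate fields and the values are made up to match your failing state, not
taken from a real trace):

        #include <stdio.h>

        int main(void)
        {
                /* state when every free huge page is already reserved */
                unsigned long resv_huge_pages = 1;  /* HugePages_Rsvd */
                unsigned long free_huge_pages = 1;  /* HugePages_Free */
                long delta = 1;                     /* hugetlb_acct_memory(h, 1) */

                long needed = (long)(resv_huge_pages + delta)
                                - (long)free_huge_pages;
                /*
                 * needed == 1 > 0, so the early return is skipped and
                 * resv_huge_pages is never incremented.
                 */
                printf("needed = %ld\n", needed);

                /*
                 * A later teardown path then decrements a reserve that was
                 * never taken; on an unsigned counter that wraps around,
                 * giving the huge HugePages_Rsvd value seen above.
                 */
                resv_huge_pages = 0;
                resv_huge_pages -= 1;
                printf("HugePages_Rsvd: %lu\n", resv_huge_pages);
                /* prints 18446744073709551615 == (unsigned long)-1 */
                return 0;
        }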
With delta == 1, needed evaluates to 1 when h->resv_huge_pages ==
h->free_huge_pages, so the early return that would simply bump the
reserve count is skipped, and the reserve count then gets inconsistent.

I confirmed that your patch fixes the issue, so I'm OK with it.

Acked-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>

Thanks,
Naoya Horiguchi