On Wed, Jan 11, 2023 at 5:58 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> James,
>
> On Thu, Jan 05, 2023 at 10:18:19AM +0000, James Houghton wrote:
> > @@ -751,9 +761,9 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
> >                 int mapcount = page_mapcount(page);
> >
> >                 if (mapcount >= 2)
> > -                       mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
> > +                       mss->shared_hugetlb += hugetlb_pte_size(hpte);
> >                 else
> > -                       mss->private_hugetlb += huge_page_size(hstate_vma(vma));
> > +                       mss->private_hugetlb += hugetlb_pte_size(hpte);
> >         }
> >         return 0;
>
> One thing interesting I found with hgm right now is mostly everything will
> be counted as "shared" here, I think it's because mapcount is accounted
> always to the huge page even if mapped in smaller sizes, so page_mapcount()
> to a small page should be huge too because the head page mapcount should be
> huge.  I'm curious the reasons behind the mapcount decision.
>
> For example, would that risk overflow with head_compound_mapcount?  One 1G
> page mapping all 4K takes 0.25M counts, while the limit should be 2G for
> atomic_t.  Looks like it's possible.

The original mapcount approach was "if the hstate-level PTE is present,
increment the compound_mapcount by 1" (basically, "if any piece of the
hugepage is mapped, increment the compound_mapcount by 1"), but that was
painful to implement, so I changed it to what it is now: each new PT
mapping increments the compound_mapcount by 1.

I think you're right -- there is absolutely an overflow risk. :( I'm not
sure what the best solution is; I could just go back to the old approach.

Regarding when things are accounted in private_hugetlb vs. shared_hugetlb,
HGM shouldn't change that, because it only applies to shared mappings
(right now, anyway). That said, "private_hugetlb" can include cases where
the page is shared but has only one mapping, and in those cases HGM will
move the accounting from "private" to "shared".

> Btw, are the small page* pointers still needed in the latest HGM design?
> Is there code taking care of disabling of hugetlb vmemmap optimization for
> HGM?  Or maybe it's not needed anymore for the current design?

The hugetlb vmemmap optimization can still be used with HGM, so there is
no code to disable it. We don't need the small page* pointers either,
except when we're mapping subpages, like in hugetlb_no_page(). Everything
else can treat the hugetlb page as a folio.

I hope we can keep compatibility with the vmemmap optimization while
solving the mapcount overflow risk.

Thanks, Peter.

- James
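P.S. To make the overflow math above concrete, here is a quick userspace
back-of-the-envelope check (plain C, not kernel code; the constants are
just the 1G/4K sizes from Peter's example) of how many full 4K mappings of
a single 1G page would exhaust a 32-bit atomic_t under the per-PT-mapping
scheme:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        const uint64_t huge_sz = 1ULL << 30;    /* 1G hugepage */
        const uint64_t base_sz = 1ULL << 12;    /* 4K base pages */

        /* compound_mapcount taken by one full 4K mapping of the page */
        uint64_t counts_per_map = huge_sz / base_sz;    /* 262144 */

        /* how many such mappings fit before atomic_t (32-bit) overflows */
        uint64_t maps_to_overflow = (uint64_t)INT32_MAX / counts_per_map + 1;

        printf("counts per full 4K mapping: %llu\n",
               (unsigned long long)counts_per_map);
        printf("full mappings to overflow:  %llu\n",
               (unsigned long long)maps_to_overflow);   /* 8192 */
        return 0;
}

So on the order of 8192 mappings (e.g., that many processes or VMAs fully
mapping the same 1G page at 4K granularity) would overflow the counter --
large, but not implausible for a shared hugetlb region.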