On Tue, Mar 7, 2023 at 1:54 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 03/06/23 23:00, James Houghton wrote:
> > For compound mappings (compound=true), _nr_pages_mapped will now be
> > incremented by COMPOUND_MAPPED when the first compound mapping is
> > created.
>
> This sentence makes it sound like incrementing by COMPOUND_MAPPED for
> compound pages is introduced by this patch. Rather, it is just for
> hugetlb (now always) compound mappings. Perhaps change that to read:
> For hugetlb mappings ...

Yes this is kind of confusing. I'll fix it like you suggest.

>
> > For small mappings, _nr_pages_mapped is incremented by 1 when the
> > particular small page is mapped for the first time. This is incompatible
> > with HPageVmemmapOptimize()ed folios, as most of the tail page structs
> > will be mapped read-only.
> >
> > Currently HugeTLB always passes compound=true, but in the future,
> > HugeTLB pages may be mapped with small mappings.
> >
> > To implement this change:
> > 1. Replace most of HugeTLB's calls to page_dup_file_rmap() with
> >    page_add_file_rmap(). The call in copy_hugetlb_page_range() is kept.
> > 2. Update page_add_file_rmap() and page_remove_rmap() to support
> >    HugeTLB folios.
> > 3. Update hugepage_add_anon_rmap() and hugepage_add_new_anon_rmap() to
> >    also increment _nr_pages_mapped properly.
> >
> > With these changes, folio_large_is_mapped() no longer needs to check
> > _entire_mapcount.
> >
> > HugeTLB doesn't use LRU or mlock, so page_add_file_rmap() and
> > page_remove_rmap() excludes those pieces. It is also important that
> > the folio_test_pmd_mappable() check is removed (or changed), as it's
> > possible to have a HugeTLB page whose order is not >= HPAGE_PMD_ORDER,
> > like arm64's CONT_PTE_SIZE HugeTLB pages.
> >
> > This patch limits HugeTLB pages to 16G in size. That limit can be
> > increased if COMPOUND_MAPPED is raised.
> >
> > Signed-off-by: James Houghton <jthoughton@xxxxxxxxxx>
> >
>
> Thanks!
>
> This is a step in the direction of having hugetlb use the same mapcount
> scheme as elsewhere. As you mention, with this in place future mapcount
> changes should mostly 'just work' for hugetlb.
>
> Because of this,
> Acked-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

Thanks!

>
> I have a few nits below, and I'm sure others will chime in later.
>
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index ba901c416785..4a975429b91a 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1316,19 +1316,21 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> >         int nr = 0, nr_pmdmapped = 0;
> >         bool first;
> >
> > -       VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
> > +       VM_BUG_ON_PAGE(compound && !PageTransHuge(page)
> > +                      && !folio_test_hugetlb(folio), page);
> >
> >         /* Is page being mapped by PTE? Is this its first map to be added? */
> >         if (likely(!compound)) {
> > +               if (unlikely(folio_test_hugetlb(folio)))
> > +                       VM_BUG_ON_PAGE(HPageVmemmapOptimized(&folio->page),
> > +                                      page);
> >                 first = atomic_inc_and_test(&page->_mapcount);
> >                 nr = first;
> >                 if (first && folio_test_large(folio)) {
> >                         nr = atomic_inc_return_relaxed(mapped);
> >                         nr = (nr < COMPOUND_MAPPED);
> >                 }
> > -       } else if (folio_test_pmd_mappable(folio)) {
> > -               /* That test is redundant: it's for safety or to optimize out */
>
> I 'think' removing this check is OK. It would seem that the caller
> knows if the folio is mappable. If we want a similar test, we might be
> able to use something like:
>
>         arch_hugetlb_valid_size(folio_size(folio))
>

Ack.
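(For context, the test being removed here is just an order comparison --
roughly, as defined in huge_mm.h:

        static inline bool folio_test_pmd_mappable(struct folio *folio)
        {
                return folio_order(folio) >= HPAGE_PMD_ORDER;
        }

so an arm64 CONT_PTE-sized HugeTLB folio would never pass it even though
it is legitimately mapped compound.)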
I think leaving the check(s) removed is fine.

> > -
> > +       } else {
> >                 first = atomic_inc_and_test(&folio->_entire_mapcount);
> >                 if (first) {
> >                         nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
> > @@ -1345,6 +1347,9 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> >                 }
> >         }
> >
> > +       if (folio_test_hugetlb(folio))
> > +               return;
>
> IMO, a comment saying hugetlb is special and does not participate in lru
> would be appropriate here.

Will do.

>
> > +
> >         if (nr_pmdmapped)
> >                 __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
> >                         NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
> > @@ -1373,24 +1378,18 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> >
> >         VM_BUG_ON_PAGE(compound && !PageHead(page), page);
> >
> > -       /* Hugetlb pages are not counted in NR_*MAPPED */
> > -       if (unlikely(folio_test_hugetlb(folio))) {
> > -               /* hugetlb pages are always mapped with pmds */
> > -               atomic_dec(&folio->_entire_mapcount);
> > -               return;
> > -       }
> > -
> >         /* Is page being unmapped by PTE? Is this its last map to be removed? */
> >         if (likely(!compound)) {
> > +               if (unlikely(folio_test_hugetlb(folio)))
> > +                       VM_BUG_ON_PAGE(HPageVmemmapOptimized(&folio->page),
> > +                                      page);
> >                 last = atomic_add_negative(-1, &page->_mapcount);
> >                 nr = last;
> >                 if (last && folio_test_large(folio)) {
> >                         nr = atomic_dec_return_relaxed(mapped);
> >                         nr = (nr < COMPOUND_MAPPED);
> >                 }
> > -       } else if (folio_test_pmd_mappable(folio)) {
> > -               /* That test is redundant: it's for safety or to optimize out */
> > -
> > +       } else {
> >                 last = atomic_add_negative(-1, &folio->_entire_mapcount);
> >                 if (last) {
> >                         nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
> > @@ -1407,6 +1406,9 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> >                 }
> >         }
> >
> > +       if (folio_test_hugetlb(folio))
> > +               return;
>
> Same as above in page_add_file_rmap.
>
> > +
> >         if (nr_pmdmapped) {
> >                 if (folio_test_anon(folio))
> >                         idx = NR_ANON_THPS;
> > @@ -2541,9 +2543,11 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
> >         first = atomic_inc_and_test(&folio->_entire_mapcount);
> >         VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page);
> >         VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page);
> > -       if (first)
> > +       if (first) {
> > +               atomic_add(COMPOUND_MAPPED, &folio->_nr_pages_mapped);
> >                 __page_set_anon_rmap(folio, page, vma, address,
> >                                      !!(flags & RMAP_EXCLUSIVE));
> > +       }
> >  }
> >
> >  void hugepage_add_new_anon_rmap(struct folio *folio,
> > @@ -2552,6 +2556,7 @@ void hugepage_add_new_anon_rmap(struct folio *folio,
> >         BUG_ON(address < vma->vm_start || address >= vma->vm_end);
> >         /* increment count (starts at -1) */
> >         atomic_set(&folio->_entire_mapcount, 0);
> > +       atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
> >         folio_clear_hugetlb_restore_reserve(folio);
> >         __page_set_anon_rmap(folio, &folio->page, vma, address, 1);
> >  }
>
> Should we look at perhaps modifying page_add_anon_rmap and
> folio_add_new_anon_rmap as well?

I think I can merge hugepage_add_anon_rmap with page_add_anon_rmap and
hugepage_add_new_anon_rmap with folio_add_new_anon_rmap. With them
merged, it's pretty easy to see what HugeTLB does differently from
generic mm, which is nice. :)
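Roughly the shape I'm picturing for the new-anon case -- completely
untested, just lifting the hugetlb bits from this patch into
folio_add_new_anon_rmap() so the generic path stays untouched:

        void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
                                     unsigned long address)
        {
                if (unlikely(folio_test_hugetlb(folio))) {
                        BUG_ON(address < vma->vm_start || address >= vma->vm_end);
                        /* increment count (starts at -1) */
                        atomic_set(&folio->_entire_mapcount, 0);
                        atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
                        folio_clear_hugetlb_restore_reserve(folio);
                        __page_set_anon_rmap(folio, &folio->page, vma, address, 1);
                        return;
                }
                /* existing small/THP anon path, unchanged */
                ...
        }

with a similar early folio_test_hugetlb() branch in page_add_anon_rmap()
doing what hugepage_add_anon_rmap() does today.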