Ackerley Tng <ackerleytng@xxxxxxxxxx> writes:

> Peter Xu <peterx@xxxxxxxxxx> writes:
>
>> On Tue, Sep 10, 2024 at 11:43:45PM +0000, Ackerley Tng wrote:
>>> +/**
>>> + * Removes folios in range [@lstart, @lend) from page cache of inode, updates
>>> + * inode metadata and hugetlb reservations.
>>> + */
>>> +static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode,
>>> +						   loff_t lstart, loff_t lend)
>>> +{
>>> +	struct kvm_gmem_hugetlb *hgmem;
>>> +	struct hstate *h;
>>> +	int gbl_reserve;
>>> +	int num_freed;
>>> +
>>> +	hgmem = kvm_gmem_hgmem(inode);
>>> +	h = hgmem->h;
>>> +
>>> +	num_freed = kvm_gmem_hugetlb_filemap_remove_folios(inode->i_mapping,
>>> +							   h, lstart, lend);
>>> +
>>> +	gbl_reserve = hugepage_subpool_put_pages(hgmem->spool, num_freed);
>>> +	hugetlb_acct_memory(h, -gbl_reserve);
>>
>> I wonder whether this is needed, and whether hugetlb_acct_memory() needs to
>> be exported in the other patch.
>>
>> IIUC subpools manages the global reservation on its own when min_pages is
>> set (which should be gmem's case, where both max/min set to gmem size).
>> That's in hugepage_put_subpool() -> unlock_or_release_subpool().
>>
>
> Thank you for pointing this out! You are right and I will remove
> hugetlb_acct_memory() from here.
>

I looked further at the folio cleanup process in free_huge_folio() and I
realized I should be returning the pages to the subpool via
free_huge_folio(). There should be no call to
hugepage_subpool_put_pages() directly from this truncate function.

To use free_huge_folio() to return the pages to the subpool, I will
clear the restore_reserve flag once guest_memfd allocates a folio. All
the guest_memfd hugetlb folios will always have the restore_reserve flag
cleared.

With the restore_reserve flag cleared, free_huge_folio() will do
hugepage_subpool_put_pages(), and then restore the reservation in hstate
as well.

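The accounting described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel code: every `model_*` name is hypothetical, locking and cgroup handling are omitted, and the logic is a simplified reading of how free_huge_folio() and hugepage_subpool_put_pages() interact when the subpool has min == max (the guest_memfd case discussed in this thread).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_hstate {
	long resv_huge_pages;	/* global reserved page count */
	long free_huge_pages;	/* global free page count */
};

struct model_subpool {
	long max_hpages;	/* -1 means no maximum */
	long used_hpages;
	long min_hpages;	/* -1 means no minimum */
	long rsv_hpages;	/* pages currently reserved for the minimum */
};

struct model_folio {
	struct model_subpool *spool;
	bool restore_reserve;
};

/*
 * Return pages to the subpool. The return value is the number of pages
 * whose global reservation can be dropped; 0 means the subpool is still
 * under its minimum, so the global reservation has to stay in place.
 */
static long model_subpool_put_pages(struct model_subpool *spool, long delta)
{
	long ret = delta;

	if (!spool)
		return delta;

	if (spool->max_hpages != -1)
		spool->used_hpages -= delta;

	if (spool->min_hpages != -1 && spool->used_hpages < spool->min_hpages) {
		if (spool->rsv_hpages + delta <= spool->min_hpages)
			ret = 0;
		else
			ret = spool->rsv_hpages + delta - spool->min_hpages;

		spool->rsv_hpages += delta;
		if (spool->rsv_hpages > spool->min_hpages)
			spool->rsv_hpages = spool->min_hpages;
	}

	return ret;
}

/*
 * Model of the free path: with restore_reserve cleared on the folio, the
 * page goes back to the subpool first, and the hstate reservation is
 * restored only if the subpool reports that it is below its minimum.
 */
static void model_free_huge_folio(struct model_hstate *h,
				  struct model_folio *folio)
{
	struct model_subpool *spool = folio->spool;
	bool restore_reserve = folio->restore_reserve;

	folio->spool = NULL;
	folio->restore_reserve = false;

	if (!restore_reserve) {
		if (model_subpool_put_pages(spool, 1) == 0)
			restore_reserve = true;
	}

	if (restore_reserve)
		h->resv_huge_pages++;

	h->free_huge_pages++;
}
```

In this model, freeing one folio from a subpool with min == max == 4 and one page in use brings used_hpages back to 0 and rsv_hpages back to 4, and bumps the hstate reservation by one, without the truncate path touching the subpool at all.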
Returning the folio to the subpool on freeing is important and correct,
since if/when the folio_put() callback is used, the filemap may not hold
the last refcount on the folio, so truncation may not be the point at
which the folio should be returned to the subpool.

>>> +
>>> +	spin_lock(&inode->i_lock);
>>> +	inode->i_blocks -= blocks_per_huge_page(h) * num_freed;
>>> +	spin_unlock(&inode->i_lock);
>>> +}