On Wed, Sep 04, 2024 at 09:02:59PM +0100, Matthew Wilcox wrote: > On Wed, Sep 04, 2024 at 12:41:08PM -0700, Steve Sistare wrote: > > The folio_try_get in memfd_alloc_folio is not necessary. Delete it, and > > delete the matching folio_put in memfd_pin_folios. This also avoids > > leaking a ref if the memfd_alloc_folio call to hugetlb_add_to_page_cache > > fails, which would otherwise need an additional folio_put. This is a > > continuation of the fix > > "mm/hugetlb: fix memfd_pin_folios free_huge_pages leak" > > I think you're right, but don't we also need to get rid of the > folio_put() call in the 'if (err)' case after calling > hugetlb_add_to_page_cache()? After scratching my head about this a bit more, I was trying to preserve the existing semantics of the code, but I think the code was always buggy. The correct code would be: folio = alloc_hugetlb_folio_nodemask(...); folio_put(folio); The code as in tree today would trip an assertion: VM_BUG_ON_FOLIO(folio_ref_count(folio), folio); as alloc_hugetlb_folio_nodemask() returns a folio with a refcount 1, folio_try_get() would increment it to 2, folio_put() would decrement it to 1, and so we'd call free_huge_folio() with a refcount of 1. But after your patch, the code _is_ still wrong because we'll start with a refcount of 1, fail to add to the page cache, call folio_put() which will decrement the refcount to 0 _and call free_huge_folio() itself_. Then we'll call free_huge_folio() on an already freed and possibly reallocated folio. So every version suggested so far (current, yours, mine) is wrong, and the right code looks like: folio = alloc_hugetlb_folio_nodemask(...); if (folio) { err = hugetlb_add_to_page_cache(...); if (err) { folio_put(folio); return ERR_PTR(err); } ... Or have I got something wrong in that analysis? > > Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios") > > > > Suggested-by: Vivek Kasireddy <vivek.kasireddy@xxxxxxxxx> > > Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx> > > --- > > mm/gup.c | 4 +--- > > mm/memfd.c | 2 +- > > 2 files changed, 2 insertions(+), 4 deletions(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index bccabaa..947881ff 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -3618,7 +3618,7 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end, > > pgoff_t start_idx, end_idx, next_idx; > > struct folio *folio = NULL; > > struct folio_batch fbatch; > > - struct hstate *h = NULL; > > + struct hstate *h; > > long ret = -EINVAL; > > > > if (start < 0 || start > end || !max_folios) > > @@ -3662,8 +3662,6 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end, > > &fbatch); > > if (folio) { > > folio_put(folio); > > - if (h) > > - folio_put(folio); > > folio = NULL; > > } > > > > diff --git a/mm/memfd.c b/mm/memfd.c > > index bcb131d..f715301 100644 > > --- a/mm/memfd.c > > +++ b/mm/memfd.c > > @@ -89,7 +89,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) > > numa_node_id(), > > NULL, > > gfp_mask); > > - if (folio && folio_try_get(folio)) { > > + if (folio) { > > err = hugetlb_add_to_page_cache(folio, > > memfd->f_mapping, > > idx); > > -- > > 1.8.3.1 > > >