Re: [PATCH RFC] mm: fix refcount check in mapping_evict_folio

CCing Miaohe Lin

On Tue, Aug 13, 2024 at 11:25:56AM GMT, Boris Burkov wrote:
> The commit e41c81d0d30e ("mm/truncate: Replace page_mapped() call in
> invalidate_inode_page()") replaced the page_mapped(page) check with a
> refcount check. However, this refcount check does not work as expected
> with drop_caches, at least for btrfs's metadata pages.
> 
> Btrfs has a per-sb metadata inode with cached pages, and when not in
> active use by btrfs, they have a refcount of 3. One from the initial
> call to alloc_pages, one (nr_pages == 1) from filemap_add_folio, and one
> from folio_attach_private. We would expect such pages to get dropped by
> drop_caches. However, drop_caches calls into mapping_evict_folio via
> mapping_try_invalidate which gets a reference on the folio with
> find_lock_entries(). As a result, these pages have a refcount of 4, and
> fail this check.
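To make the arithmetic concrete, here is a small standalone userspace
sketch (not kernel code; struct fake_folio and current_check_rejects()
are purely illustrative stand-ins, assuming the refcount breakdown
described above) of why such a folio trips the current check in
mapping_evict_folio():

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for the state of an idle btrfs metadata folio. */
struct fake_folio {
        unsigned int refcount;
        unsigned int nr_pages;
        bool has_private;
};

/* Mirrors the refcount test currently in mapping_evict_folio(). */
static bool current_check_rejects(const struct fake_folio *f)
{
        return f->refcount > f->nr_pages + (f->has_private ? 1 : 0) + 1;
}

int main(void)
{
        struct fake_folio f = {
                /*
                 * 1 from the initial allocation
                 * 1 per page in the page cache (nr_pages == 1 here)
                 * 1 from folio_attach_private
                 * 1 from the caller (find_lock_entries() in the
                 *   drop_caches path)
                 */
                .refcount = 4,
                .nr_pages = 1,
                .has_private = true,
        };

        /* 4 > 1 + 1 + 1, so eviction is skipped. */
        assert(current_check_rejects(&f));
        printf("current check rejects the folio: refcount %u > %u\n",
               f.refcount, f.nr_pages + 1 + 1);
        return 0;
}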
> 
> For what it's worth, such pages do get reclaimed under memory pressure,
> and if I change this refcount check to `if folio_mapped(folio)`,
> drop_caches evicts them as expected.
> 
> The following script produces such pages and uses drgn to further
> analyze the state of the folios:
> https://github.com/boryas/scripts/blob/main/sh/strand-meta/run.sh
> It should at least outline the basic idea for producing some btrfs
> metadata via creating inlined-extent files.
> 
> My proposed fix for the issue is to add one more hardcoded refcount to
> this check to account for the caller having a refcount on the page.
> However, I am less familiar with the other caller into
> mapping_evict_folio in the page fault path, so I am concerned this fix
> will not work properly there and would appreciate extra scrutiny on
> that path.
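To spell out the effect of the proposed extra term with the same numbers
(again just an illustrative sketch, not kernel code):

#include <assert.h>
#include <stdio.h>

int main(void)
{
        /* Same folio state as in the sketch above. */
        unsigned int refcount = 4, nr_pages = 1, has_private = 1;

        /* Current check: 4 > 1 + 1 + 1 is true, eviction skipped. */
        assert(refcount > nr_pages + has_private + 1);

        /* Proposed check: 4 > 1 + 1 + 1 + 1 is false, so we fall through. */
        assert(!(refcount > nr_pages + has_private + 1 + 1));

        printf("proposed check lets drop_caches keep going\n");
        return 0;
}

With the extra term the function falls through to the
filemap_release_folio() call in the hunk below instead of bailing out
early.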
> 
> Fixes: e41c81d0d30e ("mm/truncate: Replace page_mapped() call in invalidate_inode_page()")
> Signed-off-by: Boris Burkov <boris@xxxxxx>
> ---
>  mm/truncate.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 4d61fbdd4b2f..c710c84710b4 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -267,9 +267,17 @@ long mapping_evict_folio(struct address_space *mapping, struct folio *folio)
>  		return 0;
>  	if (folio_test_dirty(folio) || folio_test_writeback(folio))
>  		return 0;
> -	/* The refcount will be elevated if any page in the folio is mapped */
> +	/*
> +	 * The refcount will be elevated if any page in the folio is mapped.
> +	 *
> +	 * The refcounts break down as follows:
> +	 * 1 per mapped page
> +	 * 1 from folio_attach_private, if private is set
> +	 * 1 from allocating the page in the first place
> +	 * 1 from the caller
> +	 */

I think the above explanation is correct, at least from my code
inspection. Most of the callers are related to memory failure. I would
reword "1 per mapped page" to "1 per page in the page cache" or
something similar, since "mapped" here could be read as mapped in page
tables.

>  	if (folio_ref_count(folio) >
> -			folio_nr_pages(folio) + folio_has_private(folio) + 1)
> +			folio_nr_pages(folio) + folio_has_private(folio) + 1 + 1)
>  		return 0;
>  	if (!filemap_release_folio(folio, 0))
>  		return 0;
> -- 
> 2.46.0
> 
> 
