Re: [PATCH v3] mm: fix race between MADV_FREE reclaim and blkdev direct IO read

On Mon, Jan 31, 2022 at 08:02:55PM -0300, Mauricio Faria de Oliveira wrote:
> Problem:
> =======

Thanks for the update. A couple of quick questions:

> Userspace might read the zero-page instead of actual data from a
> direct IO read on a block device if the buffers have been called
> madvise(MADV_FREE) on earlier (this is discussed below) due to a
> race between page reclaim on MADV_FREE and blkdev direct IO read.

1) Would page migration be affected by the same race as well?

> @@ -1599,7 +1599,30 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  
>  			/* MADV_FREE page check */
>  			if (!PageSwapBacked(page)) {
> -				if (!PageDirty(page)) {
> +				int ref_count, map_count;
> +
> +				/*
> +				 * Synchronize with gup_pte_range():
> +				 * - clear PTE; barrier; read refcount
> +				 * - inc refcount; barrier; read PTE
> +				 */
> +				smp_mb();
> +
> +				ref_count = page_count(page);
> +				map_count = page_mapcount(page);
> +
> +				/*
> +				 * Order reads for page refcount and dirty flag;
> +				 * see __remove_mapping().
> +				 */
> +				smp_rmb();

2) Why does this need to be ordered against __remove_mapping()? It seems
   to me that here (called from the reclaim path) it can't race with
   __remove_mapping(), because both paths hold the page lock.

> +				/*
> +				 * The only page refs must be from the isolation
> +				 * plus one or more rmap's (dropped by discard:).
> +				 */
> +				if ((ref_count == 1 + map_count) &&
> +				    !PageDirty(page)) {
>  					/* Invalidate as we cleared the pte */
>  					mmu_notifier_invalidate_range(mm,
>  						address, address + PAGE_SIZE);
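
For readers following the thread, the pairing described in the quoted
comment (clear PTE; barrier; read refcount on one side, inc refcount;
barrier; read PTE on the other) can be modelled in userspace with C11
atomics. This is only a rough sketch, not kernel code: struct model_page,
model_reclaim_may_discard(), model_gup_fast() and model_demo() are
hypothetical names, a seq_cst fence stands in for smp_mb(), and putting
the PTE back on failure is an assumption of the model. The dirty-flag
check is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_page {
	atomic_int  refcount;     /* stands in for page_count() */
	atomic_bool pte_present;  /* stands in for the PTE that maps the page */
};

/* Reclaim side: clear PTE; full barrier; read refcount. */
static bool model_reclaim_may_discard(struct model_page *p, int expected_refs)
{
	atomic_store_explicit(&p->pte_present, false, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */

	if (atomic_load_explicit(&p->refcount, memory_order_relaxed) != expected_refs) {
		/* Unexpected reference: put the PTE back (model assumption). */
		atomic_store_explicit(&p->pte_present, true, memory_order_relaxed);
		return false;
	}
	return true;
}

/* GUP-fast side: inc refcount; full barrier; read PTE. */
static bool model_gup_fast(struct model_page *p)
{
	atomic_fetch_add_explicit(&p->refcount, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */

	if (!atomic_load_explicit(&p->pte_present, memory_order_relaxed)) {
		/* PTE already cleared: drop the reference and back off. */
		atomic_fetch_sub_explicit(&p->refcount, 1, memory_order_relaxed);
		return false;
	}
	return true;
}

/* Single-threaded walk through both orderings. */
static void model_demo(void)
{
	struct model_page p;
	const int expected = 1 + 1;	/* isolation ref + one mapping */

	/* GUP wins: reclaim sees the extra reference and keeps the page. */
	atomic_init(&p.refcount, expected);
	atomic_init(&p.pte_present, true);
	assert(model_gup_fast(&p));
	assert(!model_reclaim_may_discard(&p, expected));

	/* Reclaim wins: GUP sees the cleared PTE and backs off. */
	atomic_store(&p.refcount, expected);
	atomic_store(&p.pte_present, true);
	assert(model_reclaim_may_discard(&p, expected));
	assert(!model_gup_fast(&p));
}
```

With a full fence on both sides, at least one side observes the other's
store, so the model never has reclaim discarding the page while GUP also
succeeds. model_demo() only walks the two sequential orderings; it does
not exercise real concurrency.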



