On Wed, Feb 16, 2022 at 02:48:19PM +0800, Huang, Ying wrote:
> Yu Zhao <yuzhao@xxxxxxxxxx> writes:
>
> > On Wed, Feb 02, 2022 at 06:27:47PM -0300, Mauricio Faria de Oliveira wrote:
> >> On Wed, Feb 2, 2022 at 4:56 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >> >
> >> > On Mon, Jan 31, 2022 at 08:02:55PM -0300, Mauricio Faria de Oliveira wrote:
> >> > > Problem:
> >> > > =======
> >> >
> >> > Thanks for the update. A couple of quick questions:
> >> >
> >> > > Userspace might read the zero-page instead of actual data from a
> >> > > direct IO read on a block device if the buffers have been called
> >> > > madvise(MADV_FREE) on earlier (this is discussed below) due to a
> >> > > race between page reclaim on MADV_FREE and blkdev direct IO read.
> >> >
> >> > 1) would page migration be affected as well?
> >>
> >> Could you please elaborate on the potential problem you considered?
> >>
> >> I checked migrate_pages() -> try_to_migrate() holds the page lock,
> >> thus shouldn't race with shrink_page_list() -> with try_to_unmap()
> >> (where the issue with MADV_FREE is), but maybe I didn't get you
> >> correctly.
> >
> > Could the race exist between DIO and migration? While DIO is writing
> > to a page, could migration unmap it and copy the data from this page
> > to a new page?
>
> Check the migrate_pages() code,
>
>   migrate_pages
>     unmap_and_move
>       __unmap_and_move
>         try_to_migrate // set PTE to swap entry with PTL
>         move_to_new_page
>           migrate_page
>             folio_migrate_mapping
>               folio_ref_count(folio) != expected_count // check page ref count
>             folio_migrate_copy
>
> The page ref count is checked after unmapping and before copying. This
> is good, but it appears that we need a memory barrier between checking
> page ref count and copying page.

I didn't look into this but, off the top of my head, this should be
similar if not identical to the DIO case. Therefore, it requires two
barriers -- before and after the refcnt check (which may or may not
exist).
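
To make the "two barriers around the refcnt check" idea concrete, here is a
rough userspace sketch in C11. It is an analogy only, not kernel code:
struct fake_page, FAKE_PAGE_SIZE and copy_if_unpinned() are hypothetical names
made up for illustration, and atomic_thread_fence(memory_order_seq_cst) merely
stands in for what smp_mb() would do in the kernel. Whether the real
migration path needs these barriers, or already has them, is exactly what
this thread leaves open.

```c
/* Userspace analogy only: C11 atomics stand in for kernel primitives. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define FAKE_PAGE_SIZE 64   /* hypothetical size, chosen for the example */

/* Hypothetical stand-in for a page/folio; not a kernel structure. */
struct fake_page {
    atomic_int refcount;                 /* extra references act like DIO pins */
    unsigned char data[FAKE_PAGE_SIZE];
};

/*
 * Copy src into dst only if nobody else holds a reference.
 * Returns true when the copy happened, false when it was refused.
 */
static bool copy_if_unpinned(struct fake_page *dst, struct fake_page *src,
                             int expected_count)
{
    assert(dst != NULL && src != NULL);

    /* Barrier 1: order the earlier "unmap" stores before the refcount load. */
    atomic_thread_fence(memory_order_seq_cst);

    int count = atomic_load_explicit(&src->refcount, memory_order_relaxed);

    /* Barrier 2: order the refcount load before the data reads below. */
    atomic_thread_fence(memory_order_seq_cst);

    if (count != expected_count)
        return false;                    /* someone (e.g. DIO) holds a pin */

    memcpy(dst->data, src->data, FAKE_PAGE_SIZE);
    return true;
}
```

With a refcount of 1 and expected_count of 1 the copy goes ahead; after an
extra atomic_fetch_add() on src->refcount (modelling a DIO pin) the function
refuses to copy and leaves dst untouched.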