On Fri, 24 Aug 2018 17:45:42 +0200 Jan Kara <jack@xxxxxxx> wrote:

> In DAX mode a write pagefault can race with write(2) in the following
> way:
>
> CPU0                            CPU1
>                                 write fault for mapped zero page (hole)
> dax_iomap_rw()
>   iomap_apply()
>     xfs_file_iomap_begin()
>       - allocates blocks
>     dax_iomap_actor()
>       invalidate_inode_pages2_range()
>         - invalidates radix tree entries in given range
>                                 dax_iomap_pte_fault()
>                                   grab_mapping_entry()
>                                     - no entry found, creates empty
>                                   ...
>                                   xfs_file_iomap_begin()
>                                     - finds already allocated block
>                                   ...
>                                   vmf_insert_mixed_mkwrite()
>                                     - WARNs and does nothing because there
>                                       is still zero page mapped in PTE
>         unmap_mapping_pages()
>
> This race results in WARN_ON from insert_pfn() and is occasionally
> triggered by fstest generic/344. Note that the race is otherwise
> harmless as before write(2) on CPU0 is finished, we will invalidate page
> tables properly and thus user of mmap will see modified data from
> write(2) from that point on. So just restrict the warning only to the
> case when the PFN in PTE is not zero page.
>
> ...
>
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1787,10 +1787,15 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  			 * in may not match the PFN we have mapped if the
>  			 * mapped PFN is a writeable COW page.  In the mkwrite
>  			 * case we are creating a writable PTE for a shared
> -			 * mapping and we expect the PFNs to match.
> +			 * mapping and we expect the PFNs to match. If they
> +			 * don't match, we are likely racing with block
> +			 * allocation and mapping invalidation so just skip the
> +			 * update.
>  			 */
> -			if (WARN_ON_ONCE(pte_pfn(*pte) != pfn_t_to_pfn(pfn)))
> +			if (pte_pfn(*pte) != pfn_t_to_pfn(pfn)) {
> +				WARN_ON_ONCE(!is_zero_pfn(pte_pfn(*pte)));
>  				goto out_unlock;
> +			}
>  			entry = *pte;

Shouldn't we just remove the warning? We know it happens and we know
why it happens and we know it's harmless. What's the point in scaring
people?
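
To make the trade-off concrete, here is a minimal userspace sketch of the
check as the patch leaves it. It is only a model, not kernel code: the names
model_mkwrite_check(), MODEL_ZERO_PFN and the enum are hypothetical, and
treating PFN 0 as "the zero page" is an assumption made purely for
illustration. Under those assumptions, a mismatch against the zero page is
skipped quietly (the known race), while any other mismatch is skipped and
still flagged.

```c
#include <assert.h>

/* Hypothetical stand-in: in this model PFN 0 plays the role of the zero page. */
#define MODEL_ZERO_PFN 0UL

/* Possible outcomes of the mkwrite-path check in this model. */
enum mkwrite_result {
	MKWRITE_UPDATE,		/* PFNs match: go on and make the PTE writable */
	MKWRITE_SKIP,		/* zero page still mapped: known race, skip quietly */
	MKWRITE_SKIP_WARN	/* unexpected mismatch: skip and warn */
};

/*
 * Model of the patched check: compare the PFN currently mapped in the PTE
 * with the PFN the fault handler wants to install.
 */
static enum mkwrite_result model_mkwrite_check(unsigned long mapped_pfn,
					       unsigned long new_pfn)
{
	if (mapped_pfn == new_pfn)
		return MKWRITE_UPDATE;
	if (mapped_pfn == MODEL_ZERO_PFN)
		return MKWRITE_SKIP;
	return MKWRITE_SKIP_WARN;
}

/* Usage example: the three cases discussed in this thread. */
static void model_self_check(void)
{
	/* Same PFN already mapped: the normal mkwrite case. */
	assert(model_mkwrite_check(42UL, 42UL) == MKWRITE_UPDATE);
	/* Zero page still mapped while write(2) allocated a block. */
	assert(model_mkwrite_check(MODEL_ZERO_PFN, 42UL) == MKWRITE_SKIP);
	/* Some other page mapped: still considered worth a warning. */
	assert(model_mkwrite_check(7UL, 42UL) == MKWRITE_SKIP_WARN);
}
```

Removing the warning entirely, as asked above, would in this model amount to
folding MKWRITE_SKIP_WARN into MKWRITE_SKIP, so every mismatch would be
skipped silently.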