Hi Aneesh,

On Thu 21-02-19 12:52:39, Aneesh Kumar K.V wrote:
> We found this while testing dax with XFS, but i guess this is true for
> other file systems too. The stack trace looks as
>
> [c00000000007610c] set_pte_at+0x3c/0x190
> LR [c000000000378628] insert_pfn+0x208/0x280
> Call Trace:
> [c0000002125df980] [8000000000000104] 0x8000000000000104 (unreliable)
> [c0000002125df9c0] [c000000000378488] insert_pfn+0x68/0x280
> [c0000002125dfa30] [c0000000004a5494] dax_iomap_pte_fault.isra.7+0x734/0xa40
> [c0000002125dfb50] [c000000000627250] __xfs_filemap_fault+0x280/0x2d0
> [c0000002125dfbb0] [c000000000373abc] do_wp_page+0x48c/0xa40
> [c0000002125dfc00] [c000000000379170] __handle_mm_fault+0x8d0/0x1fd0
> [c0000002125dfd00] [c00000000037a9b0] handle_mm_fault+0x140/0x250
> [c0000002125dfd40] [c000000000074bb0] __do_page_fault+0x300/0xd60
> [c0000002125dfe20] [c00000000000acf4] handle_page_fault+0x18
>
>
> Now that is WARN_ON in set_pte_at which is
>
> VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
>
> Multiple architecture optimize set_pte_at based on the assumption that
> we will never use set_pte_at to update a valid pte entry. This helps in
> avoid flushing tlb etc. We should be using ptep_set_access_flags for
> this.

Hum, I didn't know about this assumption and neither did a lot of other
people reviewing DAX patches. Is this documented somewhere?

> I guess iomap code doesn't handle this correctly? Or am I missing
> some other ways we can end up flushing tlb?

So for the RW->RO transition we use ptep_clear_flush() in
dax_entry_mkclean(), so that one is certainly safe. Similarly for
unmapping. The RO->RW transition does not seem to have any TLB flush, so
there the TLB could still carry stale information, but it's the same as
with normal page faults on invalid PTEs or with protection faults for
normal pages (see e.g. finish_mkwrite_fault()).
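
To make the distinction concrete, here is a toy user-space model of the
two patterns discussed above. It is only a sketch, not kernel code: the
toy_* names and struct toy_pte are hypothetical, the check is a simplified
stand-in for the quoted VM_WARN_ON (the pte_protnone() exception is
ignored), and no real TLB is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page table entry; NOT the kernel's pte_t. */
struct toy_pte {
	bool valid;
	bool writable;
	unsigned long pfn;
};

/* Counts how often the modelled "overwriting a valid PTE" check fired. */
static int toy_warnings;

/* Models set_pte_at() plus the quoted check (protnone case ignored). */
static void toy_set_pte_at(struct toy_pte *ptep, struct toy_pte pte)
{
	if (ptep->valid)
		toy_warnings++;
	*ptep = pte;
}

/* Models ptep_clear_flush(): return the old entry, leave the slot invalid. */
static struct toy_pte toy_ptep_clear_flush(struct toy_pte *ptep)
{
	struct toy_pte old = *ptep;

	*ptep = (struct toy_pte){ 0 };
	return old;
}

/* RW->RO done the safe way: clear first, then install the RO entry. */
static void toy_mkclean(struct toy_pte *ptep)
{
	struct toy_pte pte = toy_ptep_clear_flush(ptep);

	pte.writable = false;
	toy_set_pte_at(ptep, pte);
}

/* The problematic pattern: install a new entry over a still-valid one. */
static void toy_overwrite(struct toy_pte *ptep, unsigned long pfn)
{
	struct toy_pte pte = { .valid = true, .writable = true, .pfn = pfn };

	toy_set_pte_at(ptep, pte);
}

/* Runs both patterns on one slot and returns the number of warnings. */
int toy_demo(void)
{
	struct toy_pte slot = { .valid = true, .writable = true, .pfn = 42 };

	toy_warnings = 0;

	toy_mkclean(&slot);		/* clear-then-set: check stays quiet */
	assert(toy_warnings == 0);
	assert(slot.valid && !slot.writable && slot.pfn == 42);

	toy_overwrite(&slot, 43);	/* slot is still valid: check fires */
	return toy_warnings;
}
```

In this model only the in-place overwrite trips the check; the
clear-then-set sequence never presents a valid entry to toy_set_pte_at().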

The only thing that's remaining is the situation when we replace a PTE
pointing to the zero page with a PTE pointing to real storage (block
allocation on protection fault). However, in this case we do
unmap_mapping_pages() in dax_insert_entry(), so the PTE actually gets
cleared before we install the new, correct block mapping. So this case is
safe as well. Am I missing something?

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR