On Tue, Jun 21, 2016 at 05:45:12PM +0200, Jan Kara wrote:
> Hello,
>
> currently we never clear dirty bits in the radix tree of a DAX inode. Thus
> fsync(2) or even periodical writeback flush all the dirty pfns again and
> again. This patches implement clearing of the dirty tag in the radix tree
> so that we issue flush only when needed.
>
> The difficulty with clearing the dirty tag is that we have to protect against
> a concurrent page fault setting the dirty tag and writing new data into the
> page. So we need a lock serializing page fault and clearing of the dirty tag
> and write-protecting PTEs (so that we get another pagefault when pfn is written
> to again and we have to set the dirty tag again).
>
> The effect of the patch set is easily visible:
>
> Writing 1 GB of data via mmap, then fsync twice.
>
> Before this patch set both fsyncs take ~205 ms on my test machine, after the
> patch set the first fsync takes ~283 ms (the additional cost of walking PTEs,
> clearing dirty bits etc. is very noticeable), the second fsync takes below
> 1 us.
>
> As a bonus, these patches make filesystem freezing for DAX filesystems
> reliable because mappings are now properly writeprotected while freezing the
> fs.
>
> Patches have passed xfstests for both xfs and ext4.
>
> So far the patches don't work with PMD pages - that's next on my todo list.

Regarding the PMD work, I had a go at this a while ago. You may (or may not)
find these patches useful:

mm: add follow_pte_pmd()
https://patchwork.kernel.org/patch/7616241/

mm: add pmd_mkclean()
https://patchwork.kernel.org/patch/7616261/

mm: add pgoff_mkclean()
https://patchwork.kernel.org/patch/7616221/
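
For anyone who wants to reproduce the "write via mmap, then fsync twice"
measurement quoted above, here is a rough userspace sketch. It is not the
program used for the quoted numbers: the helper name time_two_fsyncs() and
its interface are made up for illustration, and the timings only mean
something when the file lives on a DAX-mounted filesystem.

```c
/* Sketch only: dirty a file through mmap, then time two fsync(2) calls. */
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in seconds. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper (not part of the patch set): dirty `len` bytes of
 * `path` through a shared mapping, then time two back-to-back fsync(2)
 * calls into secs[0] and secs[1]. Returns 0 on success, -1 on error.
 */
static int time_two_fsyncs(const char *path, size_t len, double secs[2])
{
	int fd, i, ret = -1;
	char *map;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Store to every byte so each page takes a write fault. */
	memset(map, 0xab, len);

	for (i = 0; i < 2; i++) {
		double start = now_sec();

		if (fsync(fd) < 0)
			goto out_unmap;
		secs[i] = now_sec() - start;
	}
	ret = 0;
out_unmap:
	munmap(map, len);
out_close:
	close(fd);
	return ret;
}
```

Calling time_two_fsyncs("/mnt/dax/testfile", 1UL << 30, secs) would match
the 1 GB case, where /mnt/dax is an assumed mount point. If the dirty tags
are cleared as described, the second value should be far smaller than the
first; without the patches the two should be roughly equal.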