On Thu 21-01-16 10:46:02, Ross Zwisler wrote:
> Several of the subtleties and assumptions of the DAX fsync/msync
> implementation are not immediately obvious, so document them with comments.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Reported-by: Jan Kara <jack@xxxxxxx>

Thanks, the comments really help! Just two nits below, otherwise feel free
to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

> ---
>  fs/dax.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index d589113..55ae394 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -350,6 +350,13 @@ static int dax_radix_entry(struct address_space *mapping, pgoff_t index,
>
>  		if (!pmd_entry || type == RADIX_DAX_PMD)
>  			goto dirty;
> +
> +		/*
> +		 * We only insert dirty PMD entries into the radix tree.  This
> +		 * means we don't need to worry about removing a dirty PTE
> +		 * entry and inserting a clean PMD entry, thus reducing the
> +		 * range we would flush with a follow-up fsync/msync call.
> +		 */

Maybe accompany this with:

	WARN_ON(pmd_entry && !dirty);

somewhere in dax_radix_entry()?

>  		radix_tree_delete(&mapping->page_tree, index);
>  		mapping->nrexceptional--;
>  	}
> @@ -912,6 +919,21 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>  		}
>  		dax_unmap_atomic(bdev, &dax);
>
> +		/*
> +		 * For PTE faults we insert a radix tree entry for reads, and
> +		 * leave it clean.  Then on the first write we dirty the radix
> +		 * tree entry via the dax_pnf_mkwrite() path.  This sequence
		                          ^^^ pfn

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
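
For readers following the first nit, here is a rough userspace sketch of the invariant the new comment describes and that the suggested WARN_ON() would enforce. It is not kernel code: toy_slot, toy_type and toy_dax_entry() are made-up names, a single struct stands in for the radix tree slot, and a -1 return stands in for the warning. It only assumes the control flow visible in the quoted hunk (reuse a suitable existing entry, otherwise delete and reinsert).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in fs/dax.c. */
enum toy_type { TOY_NONE, TOY_PTE, TOY_PMD };

struct toy_slot {
	enum toy_type type;	/* TOY_NONE means the slot is empty */
	bool dirty;		/* stands in for the dirty tag on the entry */
};

/*
 * Returns 0 on success, or -1 if the caller asks for a clean PMD entry,
 * which is the case WARN_ON(pmd_entry && !dirty) is meant to catch.
 */
static int toy_dax_entry(struct toy_slot *slot, bool pmd_entry, bool dirty)
{
	assert(slot != NULL);

	/* Invariant: PMD entries are only ever inserted dirty. */
	if (pmd_entry && !dirty)
		return -1;

	if (slot->type != TOY_NONE) {
		/* The existing entry is suitable: only update the dirty state. */
		if (!pmd_entry || slot->type == TOY_PMD)
			goto mark_dirty;

		/*
		 * A PTE entry is being replaced by a PMD entry. Because of the
		 * check above the new entry is dirty, so a dirty PTE entry
		 * can never be traded for a clean PMD entry.
		 */
		slot->type = TOY_NONE;
		slot->dirty = false;
	}

	slot->type = pmd_entry ? TOY_PMD : TOY_PTE;

mark_dirty:
	if (dirty)
		slot->dirty = true;
	return 0;
}
```

In this model a read fault leaves a clean PTE entry, the first write dirties it in place, and an attempt to replace it with a clean PMD entry is rejected without touching the slot.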