On Thu, May 12, 2016 at 06:29:18PM +0200, Jan Kara wrote:
> Currently DAX page fault locking is racy.
>
> CPU0 (write fault)              CPU1 (read fault)
>
> __dax_fault()                   __dax_fault()
>   get_block(inode, block, &bh, 0) -> not mapped
>                                   get_block(inode, block, &bh, 0)
>                                     -> not mapped
>   if (!buffer_mapped(&bh))
>     if (vmf->flags & FAULT_FLAG_WRITE)
>       get_block(inode, block, &bh, 1) -> allocates blocks
>   if (page) -> no
>                                   if (!buffer_mapped(&bh))
>                                     if (vmf->flags & FAULT_FLAG_WRITE) {
>                                     } else {
>                                       dax_load_hole();
>                                     }
>   dax_insert_mapping()
>
> And we are in a situation where we fail in dax_radix_entry() with -EIO.
>
> Another problem with the current DAX page fault locking is that there is
> no race-free way to clear dirty tag in the radix tree. We can always
> end up with clean radix tree and dirty data in CPU cache.
>
> We fix the first problem by introducing locking of exceptional radix
> tree entries in DAX mappings acting very similarly to page lock and thus
> synchronizing properly faults against the same mapping index. The same
> lock can later be used to avoid races when clearing radix tree dirty
> tag.
>
> Reviewed-by: NeilBrown <neilb@xxxxxxxx>
> Reviewed-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
<>
> @@ -897,13 +1166,10 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>  		 * the write to insert a dirty entry.
>  		 */
>  		if (write) {
> -			error = dax_radix_entry(mapping, pgoff, dax.sector,
> -					true, true);
> -			if (error) {
> -				dax_pmd_dbg(&bh, address,
> -						"PMD radix insertion failed");
> -				goto fallback;
> -			}
> +			/*
> +			 * We should insert radix-tree entry and dirty it here.
> +			 * For now this is broken...
> +			 */

With this change the 'error' variable in __dax_pmd_fault() is now unused,
resulting in a compiler warning.

fs/dax.c: In function ‘__dax_pmd_fault’:
fs/dax.c:1019:6: warning: unused variable ‘error’ [-Wunused-variable]
  int error, result = 0;
      ^
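
For anyone skimming the quoted changelog: the entry lock it describes works
much like a page lock, with the lock state kept in the radix tree slot itself.
A minimal user-space sketch of that idea follows. It is not the code from the
patch: every name below is hypothetical, using bit 0 as the lock bit is an
assumption, and C11 atomics stand in for whatever locking and waiting the
kernel code actually does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical slot layout (an assumption, not taken from the patch):
 * bit 0 is the lock bit, the remaining bits carry the payload,
 * for example a sector number.
 */
#define SKETCH_ENTRY_LOCK 1UL

typedef _Atomic unsigned long sketch_slot_t;

/* Build an unlocked entry value from a payload. */
static inline unsigned long sketch_make_entry(unsigned long payload)
{
	return payload << 1;
}

/* Read the payload back, ignoring the lock bit. */
static inline unsigned long sketch_payload(sketch_slot_t *slot)
{
	return atomic_load(slot) >> 1;
}

/*
 * Try to take the entry lock. Returns true if this caller set the
 * lock bit, false if the entry was already locked by someone else.
 */
static inline bool sketch_entry_trylock(sketch_slot_t *slot)
{
	unsigned long old = atomic_fetch_or(slot, SKETCH_ENTRY_LOCK);

	return !(old & SKETCH_ENTRY_LOCK);
}

/* Drop the entry lock; the payload bits are left untouched. */
static inline void sketch_entry_unlock(sketch_slot_t *slot)
{
	atomic_fetch_and(slot, ~SKETCH_ENTRY_LOCK);
}
```

A fault handler built on this would take the lock on the slot for its index
before calling get_block() and drop it after inserting the mapping, so the two
faults in the diagram above could not interleave. A real implementation would
presumably sleep on contention rather than just fail the trylock.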