On Tue, Apr 19 2016, Jan Kara wrote: > Currently DAX page fault locking is racy. > > CPU0 (write fault) CPU1 (read fault) > > __dax_fault() __dax_fault() > get_block(inode, block, &bh, 0) -> not mapped > get_block(inode, block, &bh, 0) > -> not mapped > if (!buffer_mapped(&bh)) > if (vmf->flags & FAULT_FLAG_WRITE) > get_block(inode, block, &bh, 1) -> allocates blocks > if (page) -> no > if (!buffer_mapped(&bh)) > if (vmf->flags & FAULT_FLAG_WRITE) { > } else { > dax_load_hole(); > } > dax_insert_mapping() > > And we are in a situation where we fail in dax_radix_entry() with -EIO. > > Another problem with the current DAX page fault locking is that there is > no race-free way to clear dirty tag in the radix tree. We can always > end up with clean radix tree and dirty data in CPU cache. > > We fix the first problem by introducing locking of exceptional radix > tree entries in DAX mappings acting very similarly to page lock and thus > synchronizing properly faults against the same mapping index. The same > lock can later be used to avoid races when clearing radix tree dirty > tag. > > Signed-off-by: Jan Kara <jack@xxxxxxx> Reviewed-by: NeilBrown <neilb@xxxxxxxx> (for the exception locking bits) I really like how you have structured the code - makes it fairly obviously correct! Thanks, NeilBrown
Attachment:
signature.asc
Description: PGP signature