On Mon, Mar 21, 2016 at 02:22:50PM +0100, Jan Kara wrote: > Currently we forbid page_cache_tree_insert() to replace exceptional radix > tree entries for DAX inodes. However to make DAX faults race free we will > lock radix tree entries and when hole is created, we need to replace > such locked radix tree entry with a hole page. So modify > page_cache_tree_insert() to allow that. Perhaps this is addressed later in the series, but at first glance this seems unsafe to me - what happens of we had tasks waiting on the locked entry? Are they woken up when we replace the locked entry with a page cache entry? If they are woken up, will they be alright with the fact that they were sleeping on a lock for an exceptional entry, and now that lock no longer exists? The radix tree entry will now be filled with a zero page hole, and so they can't acquire the lock they were sleeping for - instead if they wanted to lock that page they'd need to lock the zero page page lock, right? > Signed-off-by: Jan Kara <jack@xxxxxxx> > --- > include/linux/dax.h | 6 ++++++ > mm/filemap.c | 18 +++++++++++------- > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/include/linux/dax.h b/include/linux/dax.h > index 7c45ac7ea1d1..4b63923e1f8d 100644 > --- a/include/linux/dax.h > +++ b/include/linux/dax.h > @@ -3,8 +3,14 @@ > > #include <linux/fs.h> > #include <linux/mm.h> > +#include <linux/radix-tree.h> > #include <asm/pgtable.h> > > +/* > + * Since exceptional entries do not use indirect bit, we reuse it as a lock bit > + */ > +#define DAX_ENTRY_LOCK RADIX_TREE_INDIRECT_PTR > + > ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t, > get_block_t, dio_iodone_t, int flags); > int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size); > diff --git a/mm/filemap.c b/mm/filemap.c > index 7c00f105845e..fbebedaf719e 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -597,14 +597,18 @@ static int page_cache_tree_insert(struct address_space *mapping, > if (!radix_tree_exceptional_entry(p)) > return -EEXIST; > > - if (WARN_ON(dax_mapping(mapping))) > - return -EINVAL; > - > - if (shadowp) > - *shadowp = p; > mapping->nrexceptional--; > - if (node) > - workingset_node_shadows_dec(node); > + if (!dax_mapping(mapping)) { > + if (shadowp) > + *shadowp = p; > + if (node) > + workingset_node_shadows_dec(node); > + } else { > + /* DAX can replace empty locked entry with a hole */ > + WARN_ON_ONCE(p != > + (void *)(RADIX_TREE_EXCEPTIONAL_ENTRY | > + DAX_ENTRY_LOCK)); > + } > } > radix_tree_replace_slot(slot, page); > mapping->nrpages++; > -- > 2.6.2 > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html