On Fri, Nov 11, 2016 at 11:17:51AM +0100, Jan Kara wrote:
> On Thu 10-11-16 14:54:31, Ross Zwisler wrote:
> > On Tue, Nov 08, 2016 at 12:08:09PM +0100, Jan Kara wrote:
> > > Implement basic iomap_begin function that handles reading and use it for
> > > DAX reads.
> > >
> > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > > ---
> > >  fs/ext4/ext4.h  |  2 ++
> > >  fs/ext4/file.c  | 38 +++++++++++++++++++++++++++++++++++++-
> > >  fs/ext4/inode.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  3 files changed, 93 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > > index 282a51b07c57..098b39910001 100644
> > > --- a/fs/ext4/ext4.h
> > > +++ b/fs/ext4/ext4.h
> > > @@ -3271,6 +3271,8 @@ static inline bool ext4_aligned_io(struct inode *inode, loff_t off, loff_t len)
> > >  	return IS_ALIGNED(off, blksize) && IS_ALIGNED(len, blksize);
> > >  }
> > >
> > > +extern struct iomap_ops ext4_iomap_ops;
> > > +
> > >  #endif	/* __KERNEL__ */
> > >
> > >  #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
> > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> > > index 9facb4dc5c70..1f25c644cb12 100644
> > > --- a/fs/ext4/file.c
> > > +++ b/fs/ext4/file.c
> > > @@ -31,6 +31,42 @@
> > >  #include "xattr.h"
> > >  #include "acl.h"
> > >
> > > +#ifdef CONFIG_FS_DAX
> > > +static ssize_t ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to)
> > > +{
> > > +	struct inode *inode = file_inode(iocb->ki_filp);
> > > +	ssize_t ret;
> > > +
> > > +	inode_lock_shared(inode);
> > > +	/*
> > > +	 * Recheck under inode lock - at this point we are sure it cannot
> > > +	 * change anymore
> > > +	 */
> > > +	if (!IS_DAX(inode)) {
> > > +		inode_unlock_shared(inode);
> > > +		/* Fallback to buffered IO in case we cannot support DAX */
> > > +		return generic_file_read_iter(iocb, to);
> >
> > Is this not also racy, since we've just dropped the inode lock?  What's to
> > prevent this sequence?
> >
> > Thread 0                                Thread 1
> > --------                                --------
> > ext4_file_read_iter()
> >   IS_DAX() returns true
> >                                         changes S_DAX to false
> > ext4_dax_read_iter()
> >   inode_lock_shared()
> >   IS_DAX() returns false
> >   inode_unlock_shared()
> >                                         changes S_DAX to true
> >   generic_file_read_iter() on a DAX inode
> >
> >
> > Or are we okay in this scenario?
>
> Yup, I'm aware of this. The real problem is that there's no way to
> serialize with buffered reads for ext4 (they take only page locks) so
> currently you can have buffered reads in flight when inode gets switched to
> DAX mode. I agree there is a potential for breakage and it needs to be
> resolved eventually but the problem is not new and these patches don't make
> it really any worse so I just somewhat fixed it up by patch 2/11 and left
> full solution to a separate patch set.

Fair enough.  You can add:

Reviewed-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
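
For anyone following along, here is a rough userspace model of the interleaving
discussed above. It is only a sketch, not kernel code: every name in it
(model_inode, model_set_dax, model_read, model_race_hit) is made up for
illustration, the "lock" is a plain counter, and the two threads are replayed
in a fixed order rather than actually run concurrently. It also assumes, going
by the comment in the patch, that the flag cannot change while the inode lock
is held.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct model_inode {
	bool s_dax;        /* stands in for the S_DAX inode flag */
	int  shared_locks; /* stands in for inode_lock_shared() holders */
};

enum read_path { PATH_DAX, PATH_BUFFERED };

struct read_result {
	enum read_path path;  /* path the reader ended up on */
	bool dax_at_io_time;  /* flag value when the I/O actually ran */
};

/* Thread 1 in the diagram: flips the flag while nobody holds the lock. */
static void model_set_dax(struct model_inode *inode, bool on)
{
	/* Assumption of this model: flag changes are excluded by the lock. */
	assert(inode->shared_locks == 0);
	inode->s_dax = on;
}

/*
 * Thread 0 in the diagram, replayed step by step. flip_after_unlock
 * selects whether Thread 1 runs in the window after the unlock.
 */
static struct read_result model_read(struct model_inode *inode,
				     bool flip_after_unlock)
{
	struct read_result res;

	inode->shared_locks++;          /* inode_lock_shared() */
	if (inode->s_dax) {             /* recheck under the lock */
		res.path = PATH_DAX;
		res.dax_at_io_time = inode->s_dax;
		inode->shared_locks--;
		return res;
	}
	inode->shared_locks--;          /* inode_unlock_shared() */

	/* Window: the result of the recheck is stale from here on. */
	if (flip_after_unlock)
		model_set_dax(inode, true);

	res.path = PATH_BUFFERED;       /* generic_file_read_iter() */
	res.dax_at_io_time = inode->s_dax;
	return res;
}

/* Returns true when a buffered read ran against an inode marked DAX. */
static bool model_race_hit(bool flip_after_unlock)
{
	struct model_inode inode = { .s_dax = true, .shared_locks = 0 };
	struct read_result res;

	/* ext4_file_read_iter() saw IS_DAX() true, then Thread 1 cleared it. */
	model_set_dax(&inode, false);

	res = model_read(&inode, flip_after_unlock);
	return res.path == PATH_BUFFERED && res.dax_at_io_time;
}
```

Calling model_race_hit(true) replays the sequence from the diagram and reports
the buffered read landing on a DAX inode; model_race_hit(false) is the same
path without the second flip. The point of the model is only that the recheck
protects the DAX path, while the fallback path runs after the lock is dropped.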