On Wed 05-06-19 11:25:51, Dave Chinner wrote: > On Mon, Jun 03, 2019 at 03:21:55PM +0200, Jan Kara wrote: > > Hole puching currently evicts pages from page cache and then goes on to > > remove blocks from the inode. This happens under both i_mmap_sem and > > i_rwsem held exclusively which provides appropriate serialization with > > racing page faults. However there is currently nothing that prevents > > ordinary read(2) from racing with the hole punch and instantiating page > > cache page after hole punching has evicted page cache but before it has > > removed blocks from the inode. This page cache page will be mapping soon > > to be freed block and that can lead to returning stale data to userspace > > or even filesystem corruption. > > > > Fix the problem by protecting reads as well as readahead requests with > > i_mmap_sem. > > > > CC: stable@xxxxxxxxxxxxxxx > > Reported-by: Amir Goldstein <amir73il@xxxxxxxxx> > > Signed-off-by: Jan Kara <jack@xxxxxxx> > > --- > > fs/ext4/file.c | 35 +++++++++++++++++++++++++++++++---- > > 1 file changed, 31 insertions(+), 4 deletions(-) > > > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c > > index 2c5baa5e8291..a21fa9f8fb5d 100644 > > --- a/fs/ext4/file.c > > +++ b/fs/ext4/file.c > > @@ -34,6 +34,17 @@ > > #include "xattr.h" > > #include "acl.h" > > > > +static ssize_t ext4_file_buffered_read(struct kiocb *iocb, struct iov_iter *to) > > +{ > > + ssize_t ret; > > + struct inode *inode = file_inode(iocb->ki_filp); > > + > > + down_read(&EXT4_I(inode)->i_mmap_sem); > > + ret = generic_file_read_iter(iocb, to); > > + up_read(&EXT4_I(inode)->i_mmap_sem); > > + return ret; > > Isn't i_mmap_sem taken in the page fault path? What makes it safe > to take here both outside and inside the mmap_sem at the same time? > I mean, the whole reason for i_mmap_sem existing is that the inode > i_rwsem can't be taken both outside and inside the i_mmap_sem at the > same time, so what makes the i_mmap_sem different? Drat, you're right that read path may take page fault which will cause lock inversion with mmap_sem. Just my xfstests run apparently didn't trigger this as I didn't get any lockdep splat. Thanks for catching this! Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR