On Mon, Jun 1, 2020 at 4:16 PM Goldwyn Rodrigues <rgoldwyn@xxxxxxx> wrote: > > On 17:23 28/05, Darrick J. Wong wrote: > > On Thu, May 28, 2020 at 02:21:03PM -0500, Goldwyn Rodrigues wrote: > > > > > > Filesystems such as btrfs are unable to guarantee page invalidation > > > because pages could be locked as a part of the extent. Return zero > > > > Locked for what? filemap_write_and_wait_range should have just cleaned > > them off. > > > > > in case a page cache invalidation is unsuccessful so filesystems can > > > fallback to buffered I/O. This is similar to > > > generic_file_direct_write(). > > > > > > This takes care of the following invalidation warning during btrfs > > > mixed buffered and direct I/O using iomap_dio_rw(): > > > > > > Page cache invalidation failure on direct I/O. Possible data > > > corruption due to collision with buffered I/O! > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@xxxxxxxx> > > > > > > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c > > > index e4addfc58107..215315be6233 100644 > > > --- a/fs/iomap/direct-io.c > > > +++ b/fs/iomap/direct-io.c > > > @@ -483,9 +483,15 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, > > > */ > > > ret = invalidate_inode_pages2_range(mapping, > > > pos >> PAGE_SHIFT, end >> PAGE_SHIFT); > > > - if (ret) > > > - dio_warn_stale_pagecache(iocb->ki_filp); > > > - ret = 0; > > > + /* > > > + * If a page can not be invalidated, return 0 to fall back > > > + * to buffered write. > > > + */ > > > + if (ret) { > > > + if (ret == -EBUSY) > > > + ret = 0; > > > + goto out_free_dio; > > > > XFS doesn't fall back to buffered io when directio fails, which means > > this will cause a regression there. > > > > Granted mixing write types is bogus... > > > > I have not seen page invalidation failure errors on XFS, but what should > happen hypothetically if they do occur? Carry on with the direct I/O? > Would an error return like -ENOTBLK be better? It doesn't make much to me to emit the warning and then proceed to the direct IO write path anyway, as if nothing happened. If we are concerned about possible corruption, we should either return an error or fallback to buffered IO just like generic_file_direct_write() did, and not allow the possibility for corruptions. Btw, this is causing a regression in Btrfs now. The problem is that dio_warn_stale_pagecache() sets an EIO error in the inode's mapping: errseq_set(&inode->i_mapping->wb_err, -EIO); So the next fsync on the file will return that error, despite the fsync having completed successfully with any errors. Since patchset to make btrfs direct IO use iomap is already in Linus' tree, we need to fix this somehow. This makes generic/547 fail often for example - buffered write against file + direct IO write + fsync - the later returns -EIO. Thanks. > > -- > Goldwyn -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”