Dave Chinner <david@xxxxxxxxxxxxx> writes:

> On Mon, Oct 28, 2024 at 06:39:36AM +0530, Ritesh Harjani wrote:
>>
>> Hi Dave,
>>
>> Dave Chinner <david@xxxxxxxxxxxxx> writes:
>>
>> > On Fri, Oct 25, 2024 at 09:15:53AM +0530, Ritesh Harjani (IBM) wrote:
>> >> iomap will not return -ENOTBLK in case of dio atomic writes. But let's
>> >> also add a WARN_ON_ONCE and return -EIO as a safety net.
>> >>
>> >> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>
>> >> ---
>> >>  fs/ext4/file.c | 10 +++++++++-
>> >>  1 file changed, 9 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
>> >> index f9516121a036..af6ebd0ac0d6 100644
>> >> --- a/fs/ext4/file.c
>> >> +++ b/fs/ext4/file.c
>> >> @@ -576,8 +576,16 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>> >>  		iomap_ops = &ext4_iomap_overwrite_ops;
>> >>  	ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
>> >>  			   dio_flags, NULL, 0);
>> >> -	if (ret == -ENOTBLK)
>> >> +	if (ret == -ENOTBLK) {
>> >>  		ret = 0;
>> >> +		/*
>> >> +		 * iomap will never return -ENOTBLK if write fails for atomic
>> >> +		 * write. But let's just add a safety net.
>> >> +		 */
>> >> +		if (WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC))
>> >> +			ret = -EIO;
>> >> +	}
>> >
>> > Why can't the iomap code return EIO in this case for IOCB_ATOMIC?
>> > That way we don't have to put this logic into every filesystem.
>>
>> This was originally intended as a safety net, hence the WARN_ON_ONCE.
>> Later Darrick pointed out that we might still have an unconverted
>> condition in iomap which can return ENOTBLK for DIO atomic writes (page
>> cache invalidation).
>
> Yes. That's my point - iomap knows that it's an atomic write, it
> knows that invalidation failed, and it knows that there is no such
> thing as buffered atomic writes. So there is no possible fallback
> here, and it should be returning EIO in the page cache invalidation
> failure case and not ENOTBLK.
>

Sorry, my bad.
I think I might have looked at a different version of the code earlier.
The current patch from John already takes care of the case where the
page cache invalidation fails, so we don't return -ENOTBLK there [1].

[1]: https://lore.kernel.org/linux-xfs/Zxnp8bma2KrMDg5m@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m3664bbe00287d98caa690bb04f51d0ef164f52b3

>> You rightly pointed out that it should be fixed in iomap. However, do
>> you think filesystems can still keep this as a safety net (maybe with
>> no need for the WARN_ON_ONCE)?
>
> I don't see any point in adding "impossible to hit" checks into
> filesystems just in case some core infrastructure has a bug
> introduced....
>

So even though we have taken care of that case in the page cache
invalidation code, -ENOTBLK can still leak out if iomap_iter() ever
returns it, e.g. in iomap_dio_rw():

	blk_start_plug(&plug);
	while ((ret = iomap_iter(&iomi, ops)) > 0) {
		iomi.processed = iomap_dio_iter(&iomi, dio);

		/*
		 * We can only poll for single bio I/Os.
		 */
		iocb->ki_flags &= ~IOCB_HIPRI;
	}
	blk_finish_plug(&plug);

	/*
	 * We only report that we've read data up to i_size.
	 * Revert iter to a state corresponding to that as some callers (such
	 * as the splice code) rely on it.
	 */
	if (iov_iter_rw(iter) == READ && iomi.pos >= dio->i_size)
		iov_iter_revert(iter, iomi.pos - dio->i_size);

	if (ret == -EFAULT && dio->size && (dio_flags & IOMAP_DIO_PARTIAL)) {
		if (!(iocb->ki_flags & IOCB_NOWAIT))
			wait_for_completion = true;
		ret = 0;
	}

	/* magic error code to fall back to buffered I/O */
	if (ret == -ENOTBLK) {
		wait_for_completion = true;
		ret = 0;
	}

Reviewing these code paths, there is a lot of ping-ponging between core
iomap and the filesystem, so it's not just core iomap that we are
talking about here. I am therefore still inclined towards keeping that
check in place as a safety net. However, let me take some time to
review these code paths, please.
I mainly wanted to send this email to point out that the page cache
invalidation case is already taken care of in iomap for atomic writes,
so there is no bug there. I will get back on the rest of the cases after
I have looked at them more closely.

> -Dave.
>
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

Thanks for the review!
-ritesh