On Wed, Nov 18, 2020 at 08:26:50AM -0700, Jens Axboe wrote:
> On 11/18/20 12:19 AM, Dave Chinner wrote:
> > On Tue, Nov 17, 2020 at 03:17:18PM -0700, Jens Axboe wrote:
> >> If we've successfully transferred some data in __iomap_dio_rw(),
> >> don't mark an error for a latter segment in the dio.
> >>
> >> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> >>
> >> ---
> >>
> >> Debugging an issue with io_uring, which uses IOCB_NOWAIT for the
> >> IO. If we do parts of an IO, then once that completes, we still
> >> return -EAGAIN if we ran into a problem later on. That seems wrong,
> >> normal convention would be to return the short IO instead. For the
> >> -EAGAIN case, io_uring will retry later parts without IOCB_NOWAIT
> >> and complete it successfully.
> >
> > So you are getting a write IO that is split across an allocated
> > extent and a hole, and the second mapping is returning EAGAIN
> > because allocation would be required? This sort of split extent IO
> > is fairly common, so I'm not sure that splitting them into two
> > separate IOs may not be the best approach.
>
> The case I seem to be hitting is this one:
>
> if (iocb->ki_flags & IOCB_NOWAIT) {
>         if (filemap_range_has_page(mapping, pos, end)) {
>                 ret = -EAGAIN;
>                 goto out_free_dio;
>         }
>         flags |= IOMAP_NOWAIT;
> }
>
> in __iomap_dio_rw(), which isn't something we can detect upfront like IO
> over a multiple extents...

This specific situation cannot result in the partial IO behaviour
you described. It is an -upfront check- that is done before any IO
is mapped or issued so results in the entire IO being skipped and we
don't get anywhere near the code you changed.

IOWs, this doesn't explain why you saw a partial IO, or why changing
partial IO return values avoids -EAGAIN from a range we apparently
just did a partial IO over and -didn't have page cache pages-
sitting over it.

Can you provide an actual event trace of the IOs in question that
are failing in your tests (e.g. from something like `trace-cmd
record -e xfs_file\* -e xfs_i\* -e xfs_\*write -e iomap\*` over the
sequence that reproduces the issue) so that there's no ambiguity
over how this problem is occurring in your systems?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
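
For readers following the control-flow argument in this message, here is a minimal user-space sketch of the distinction being drawn. It is not the kernel's __iomap_dio_rw(): the names toy_segment and toy_dio_nowait, and the cached_pages and short_io parameters, are invented for illustration. It assumes only what the thread states: the page cache check runs before any segment is mapped or issued, and a later segment can fail after earlier ones have transferred data.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for one mapped segment of a direct IO. */
struct toy_segment {
    size_t len;       /* bytes this segment would transfer */
    bool would_block; /* true if mapping it would need to block */
};

/*
 * Toy model of a NOWAIT direct IO (illustrative only, not kernel code).
 *
 * cached_pages: models filemap_range_has_page() returning true.
 * short_io:     models the convention proposed in the patch, i.e. report
 *               the bytes already transferred instead of -EAGAIN.
 */
static ssize_t toy_dio_nowait(const struct toy_segment *segs, int nsegs,
                              bool cached_pages, bool short_io)
{
    ssize_t done = 0;

    assert(nsegs >= 0);

    /* Upfront check: runs before any segment is mapped or issued,
     * so it can only fail the whole IO, never part of it. */
    if (cached_pages)
        return -EAGAIN;

    for (int i = 0; i < nsegs; i++) {
        if (segs[i].would_block) {
            /* Mid-IO failure: the only point where a partial IO exists. */
            if (short_io && done > 0)
                return done;
            return -EAGAIN;
        }
        done += (ssize_t)segs[i].len;
    }
    return done;
}
```

In this model the upfront branch returns before `done` can become non-zero, which is why it cannot account for a partial IO. Only the loop body can produce one, and that is the path an event trace would need to show.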