On Wed, Jan 13, 2021 at 10:00:37AM +0200, Avi Kivity wrote:
> On 1/13/21 12:13 AM, Dave Chinner wrote:
> > On Tue, Jan 12, 2021 at 10:01:35AM +0200, Avi Kivity wrote:
> > > On 1/12/21 3:07 AM, Dave Chinner wrote:
> > > > Hi folks,
> > > >
> > > > This is the XFS implementation on the sub-block DIO optimisations
> > > > for written extents that I've mentioned on #xfs and a couple of
> > > > times now on the XFS mailing list.
> > > >
> > > > It takes the approach of using the IOMAP_NOWAIT non-blocking
> > > > IO submission infrastructure to optimistically dispatch sub-block
> > > > DIO without exclusive locking. If the extent mapping callback
> > > > decides that it can't do the unaligned IO without extent
> > > > manipulation, sub-block zeroing, blocking or splitting the IO into
> > > > multiple parts, it aborts the IO with -EAGAIN. This allows the high
> > > > level filesystem code to then take exclusive locks and resubmit the
> > > > IO once it has guaranteed no other IO is in progress on the inode
> > > > (the current implementation).
> > >
> > > Can you expand on the no-splitting requirement? Does it involve only
> > > splitting by XFS (IO spans >1 extents) or lower layers (RAID)?
> >
> > XFS only.
>
> Ok, that is somewhat under control as I can provide an extent hint, and wish
> really hard that the filesystem isn't fragmented.
>
> > > The reason I'm concerned is that it's the constraint that the application
> > > has least control over. I guess I could use RWF_NOWAIT to avoid blocking my
> > > main thread (but last time I tried I'd get occasional EIOs that frightened
> > > me off that).
> >
> > Spurious EIO from RWF_NOWAIT is a bug that needs to be fixed. Do you
> > have any details?
>
> I reported it in [1]. It's long since gone since I disabled RWF_NOWAIT. It
> was relatively rare, sometimes happening in continuous integration runs that
> take hours, and sometimes not.
>
> I expect it's fixed by now since io_uring relies on it. Maybe I should turn
> it on for kernels > some_random_version.
>
> [1] https://lore.kernel.org/lkml/9bab0f40-5748-f147-efeb-5aac4fd44533@xxxxxxxxxxxx/t/#u

Yeah, as I thought. Usage of REQ_NOWAIT with filesystem based IO is
simply broken - it causes spurious IO failures to be reported to IO
completion callbacks and so they are very difficult to track and/or
retry. iomap does not use REQ_NOWAIT at all, so you should not ever
see this from XFS or ext4 DIO anymore...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx