On Mon, May 14, 2018 at 05:36:24PM +0200, Andreas Gruenbacher wrote:
> According to xfstest generic/240, applications seem to expect direct I/O
> writes to either complete as a whole or to fail; short direct I/O writes
> are apparently not appreciated. This means that when only part of an
> asynchronous direct I/O write succeeds, we can either fail the entire
> write, or we can wait for the partial write to complete and retry
> the remaining write using buffered I/O. The old __blockdev_direct_IO
> helper has code for waiting for partial writes to complete; the new
> iomap_dio_rw iomap helper does not.
>
> The above-mentioned fallback mode is used by gfs2, which doesn't allow
> block allocations under direct I/O to avoid taking cluster-wide
> exclusive locks. As a consequence, an asynchronous direct I/O write to
> a file range that ends in a hole will result in a short write. When
> that happens, we want to retry the remaining write using buffered I/O.
>
> To allow that, change iomap_dio_rw to wait for short direct I/O writes
> like __blockdev_direct_IO does instead of returning -EIOCBQUEUED.
>
> This fixes xfstest generic/240 on gfs2.

The code looks pretty racy to me. Why would gfs2 cause a short direct
I/O write to start with? I suspect that is where the problem that needs
fixing is buried.
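
For anyone following along, the fallback described in the quoted patch
description can be modelled in userspace roughly as below. This is only a
toy sketch, not gfs2 or iomap code: struct toy_file and the toy_* functions
are made up for illustration, the "file" is an in-memory buffer, and the
model is fully synchronous, so it says nothing about the asynchronous
completion path where the raciness concern lies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FILE_CAP 64

/*
 * Hypothetical in-memory "file": bytes [0, allocated) are backed by
 * blocks, everything past that is a hole.
 */
struct toy_file {
	char data[FILE_CAP];
	size_t allocated;
};

/*
 * Models a direct write that is not allowed to allocate blocks: it
 * copies up to the first hole and returns a short count.
 */
static size_t toy_direct_write(struct toy_file *f, size_t pos,
			       const char *buf, size_t len)
{
	size_t n = 0;

	if (pos < f->allocated) {
		n = f->allocated - pos;
		if (n > len)
			n = len;
		memcpy(f->data + pos, buf, n);
	}
	return n;
}

/* Models a buffered write, which may allocate (extend "allocated"). */
static size_t toy_buffered_write(struct toy_file *f, size_t pos,
				 const char *buf, size_t len)
{
	if (pos >= FILE_CAP)
		return 0;
	if (len > FILE_CAP - pos)
		len = FILE_CAP - pos;
	memcpy(f->data + pos, buf, len);
	if (pos + len > f->allocated)
		f->allocated = pos + len;
	return len;
}

/*
 * Try the direct path first; if it comes back short, retry only the
 * remaining range through the buffered path.
 */
static size_t toy_write(struct toy_file *f, size_t pos,
			const char *buf, size_t len)
{
	size_t done = toy_direct_write(f, pos, buf, len);

	if (done < len)
		done += toy_buffered_write(f, pos + done, buf + done,
					   len - done);
	return done;
}
```

With a toy file that has 4 allocated bytes, an 8-byte toy_direct_write()
at offset 0 returns 4, while toy_write() completes all 8 bytes by pushing
the remaining 4 through the buffered path.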