On Mon, Oct 31, 2016 at 08:46:42AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 31, 2016 at 10:14:28AM -0400, Brian Foster wrote:
> > We've had reports of generic/095 causing XFS to BUG() in
> > __xfs_get_blocks() due to the existence of delalloc blocks on a direct
> > I/O read. generic/095 issues a mix of various types of I/O, including
> > direct and memory mapped I/O to a single file.
>
> Can you explain the scenario in which case this happens in a little
> more detail? The patch looks fine to me, but I'd really like to
> understand how this happens.

Sure... the case I reproduced is a race between a direct I/O read and a
mapped write to a hole in a file. The direct read gets through
xfs_file_dio_aio_read() and down to __xfs_get_blocks() while the region
is still a hole. Before the xfs_bmapi_read() call from
__xfs_get_blocks(), a mapped write occurs and allocates delalloc blocks
in the associated file range. xfs_bmapi_read() then returns a delalloc
mapping for a dio read and falls through to the BUG_ON().

FWIW, the specific reproducer was a tweaked variant of generic/095 to up
the iodepth (1024), iodepth_batch (60), and numjobs (20) fio params. It
was also on a ppc64 box with a 64k page size, so that might have also
improved the chances of a race.

This can be manufactured on demand with a hack to delay the dio read in
__xfs_get_blocks(), however. E.g., stick a 'if (!create && direct)
ssleep(N);' right before xfs_bmapi_read(), run a single block dio read
to a hole in the file, and then a single block mapped write to the same
offset as the read while it is delayed.

Brian
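
To make the ordering explicit, the interleaving described above can be written down as a small userspace model. This is only a sketch, not XFS code: the names blk_state, lookup_result, model_mapped_write(), model_read_lookup() and model_dio_read() are all made up for illustration, and the LOOKUP_BUG result merely stands in for the BUG_ON() in __xfs_get_blocks().

```c
/*
 * Toy userspace model of the dio-read vs. mapped-write interleaving.
 * NOT kernel code: every type and function name here is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* State of a single file block in the model. */
enum blk_state { BLK_HOLE, BLK_DELALLOC, BLK_REAL };

/* What the modeled read-side lookup reports back. */
enum lookup_result {
	LOOKUP_HOLE,		/* nothing there, read returns zeroes */
	LOOKUP_REAL,		/* block is allocated on disk */
	LOOKUP_DELALLOC,	/* buffered read: data lives in the page cache */
	LOOKUP_BUG		/* direct read saw delalloc: stands in for BUG_ON() */
};

/* A mapped write to a hole reserves delalloc space for the block. */
static void model_mapped_write(enum blk_state *blk)
{
	assert(blk != NULL);
	if (*blk == BLK_HOLE)
		*blk = BLK_DELALLOC;
}

/* Read-side mapping lookup; 'direct' marks a direct I/O read. */
static enum lookup_result model_read_lookup(enum blk_state blk, bool direct)
{
	switch (blk) {
	case BLK_HOLE:
		return LOOKUP_HOLE;
	case BLK_REAL:
		return LOOKUP_REAL;
	case BLK_DELALLOC:
		/* A direct read does not expect a delalloc mapping. */
		return direct ? LOOKUP_BUG : LOOKUP_DELALLOC;
	}
	return LOOKUP_BUG;
}

/*
 * One dio read of a block. If write_races_in is true, a mapped write
 * lands after the read has started but before the mapping lookup,
 * which is the window the ssleep() hack widens.
 */
static enum lookup_result model_dio_read(enum blk_state *blk, bool write_races_in)
{
	assert(blk != NULL);

	/* Step 1: the read is already in flight; the block may still be a hole. */

	/* Step 2: the mapped write sneaks in before the lookup. */
	if (write_races_in)
		model_mapped_write(blk);

	/* Step 3: the lookup now sees whatever state the write left behind. */
	return model_read_lookup(*blk, true);
}
```

In this model, calling model_dio_read() on a BLK_HOLE block with write_races_in set returns LOOKUP_BUG and leaves the block in BLK_DELALLOC. Without the racing write, the same call returns LOOKUP_HOLE.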