On Thu, Sep 08, 2016 at 11:12:33AM -0700, Linus Torvalds wrote: > On Thu, Sep 8, 2016 at 10:56 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > > > ... and brings back a lot of other crap. The thing is, pipe lock should > > be on the outside of everything fs might be taking, so that splice IO > > is the same as normal IO as far as filesystem locking is concerned. For > > the write side it had been done in that commit, for the read side it's yet > > to be done. > > Al, look at generic_file_splice_read(). > > The problem is that XFS takes its own > > xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); > .. > xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); > > around all the generic file accessors. So for example, it doesn't use > "generic_file_read_iter()", it does > > STATIC ssize_t > xfs_file_buffered_aio_read( > struct kiocb *iocb, > struct iov_iter *to) > { > struct xfs_inode *ip = XFS_I(file_inode(iocb->ki_filp)); > ssize_t ret; > > trace_xfs_file_buffered_read(ip, iov_iter_count(to), iocb->ki_pos); > > xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); > ret = generic_file_read_iter(iocb, to); > xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); > > return ret; > } > > and the exact same pattern holds for generic_file_splice_read(). > > So the XFS model *requires* that XFS_IOLOCK to be outside all the operations. > > But then in iter_file_splice_write we have the other ordering. > > Now, xfs could do a wrapper for ->splice_write() like it used to, and > have that same "take the xfs lock around the call to > iter_file_splice_write(). That was my first obvious patch. > > I threw it out because that's garbage too: then we end up doing > ->write_iter(), which takes the xfs_rw_ilock() again, and would > immediately deadlock *there* instead. That's what I first tried when this was first reported back in 3.18-rc0, and after a couple of other aborted attempts to work around the pipe_lock I came to the same conclusion: "That smells like a splice architecture bug. splice write puts the pipe lock outside the inode locks, but splice read puts the pipes locks *inside* the inode locks. " http://oss.sgi.com/archives/xfs/2014-10/msg00319.html > So the problem really is that the vfs layer seems to simply not allow > the filesystem to do any locking around the generic page cache helper > functions. And XFS really wants to do that. It's not an XFS specific problem: any filesystem that supports hole punch and it's fallocate() friends needs this high level splice IO exclusion as well. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html