On Thu, Sep 08, 2016 at 11:12:33AM -0700, Linus Torvalds wrote: > So the XFS model *requires* that XFS_IOLOCK to be outside all the operations. > > But then in iter_file_splice_write we have the other ordering. > > Now, xfs could do a wrapper for ->splice_write() like it used to, and > have that same "take the xfs lock around the call to > iter_file_splice_write(). That was my first obvious patch. > > I threw it out because that's garbage too: then we end up doing > ->write_iter(), which takes the xfs_rw_ilock() again, and would > immediately deadlock *there* instead. > > So the problem really is that the vfs layer seems to simply not allow > the filesystem to do any locking around the generic page cache helper > functions. And XFS really wants to do that. Why *have* ->splice_read() there at all? Let's use its ->read_iter(), where it will take its lock as it always did for read. All we need is a variant of __generic_file_splice_read() that would pass a new kind of iov_iter down to filesystem's own ->read_iter(). And let that guy do whatever locking it wants. It will end up doing a sequence of copy_page_to_iter() and, possibly, copy_to_iter() (XFS one would only do the former). So let's add an iov_iter flavour that would simply grab a reference to page passed to copy_page_to_iter() and allocate-and-copy for copy_to_iter(). As the result, you'll get an array of <page, offset, count> triples - same as you would from the existing __generic_file_splice_read(). Pages already uptodate, with all readahead logics done as usual, etc. What else do we need? Just feed the resulting triples into the pipe and that's it. Sure, they can get stale by truncate/punchhole/whatnot. So they can right after we return from xfs_file_splice_read()... Moreover, I don't see why we need to hold pipe lock the actual call of ->read_iter(). Right now we only grab it for "feed into pipe buffers" part. Objections? -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html