On Tue, Jul 16, 2013 at 12:33 PM, Ben Myers <bpm@xxxxxxx> wrote: >> > >> > And looking more at that, I'm actually starting to think this is an >> > XFS locking problem. XFS really should not call back to splice while >> > holding the inode lock. .. that was misleading, normally "inode lock" would be i_lock, but here I meant the XFS-specific i_iolock. But you clearly picked up on it: > CPU0 CPU1 CPU2 > ---- ---- ---- > lock(&sig->cred_guard_mutex); > lock(&pipe->mutex/1); > lock(&(&ip->io_lock)->mr_lock); > lock(&(&ip->io_lock)->mr_lock); > lock(&sig->cred_guard_mutex); > lock(&pipe->mutex/1); Yup. > I agree that fixing this in XFS seems to be the most reasonable plan, > and Dave's approach looks ok to me. Seems like patch 1 should go > through Al's tree, but we'll also need to commit it to the xfs tree > prerequisite to patch 2. Btw, is there some reason why XFS couldn't just use generic_file_splice_read() directly? I'm not arguing against Dave's patches, since the splice_write case would seem to want them regardless, but I'm not even seeing why XFS couldn't just avoid the locking for the splice_read case entirely..And the generic file-splice read function does all that read-ahead stuff and should actually be better for performance. And because it does it from the page cache, it can avoid the locking issues (because it gets the splice pipe lock later, just holding on to the page references). And splice has mmap semantics - the whole point of splice is about moving pages around, after all - so I *think* your current "xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);" is actually over-serialization. The pages will be shared by the pipe buffers anyway, so splicing by definition doesn't imply full data serialization (because the reading of the data itself will happen much later). So the per-page serialization done by readpage() should already be sufficient, no? I dunno. Maybe there's something I'm missing. XFS does seem to wrap all the other generic functions in that lock too, but since mmap() etc clearly have to be able to get things one page at a time (without any wrapping at higher layers), I'm just wondering whether splice_read might not be able to avoid it. Linus _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs