> -----Original Message-----
> From: linux-nfs-owner@xxxxxxxxxxxxxxx
> [mailto:linux-nfs-owner@xxxxxxxxxxxxxxx] On Behalf Of Jeff Layton
> Sent: Friday, June 14, 2013 3:22 PM
> To: Sandeep Joshi
> Cc: J. Bruce Fields; linux-nfs@xxxxxxxxxxxxxxx
> Subject: Re: why does nfsd write not use splice
>
> On Fri, 14 Jun 2013 17:39:12 +0530
> Sandeep Joshi <sanjos100@xxxxxxxxx> wrote:
>
> > On Wed, Jun 12, 2013 at 10:16 PM, J. Bruce Fields
> > <bfields@xxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Jun 12, 2013 at 09:51:09PM +0530, Sandeep Joshi wrote:
> > > > Splice can be implemented independent of RDMA. It is supposed to
> > > > transfer pages between two file descriptors. I found some
> > > > postings on lkml from 2006 where Linus says it is quite possible
> > > > to splice from a socket to a file.
> > > >
> > > > See the paragraph:
> > > > " For filesystems, splice support tends to be really easy (both
> > > > read and write). For other things, it depends a bit. But unlike
> > > > sendfile(), it really is quite possible to splice _from_ a socket
> > > > too, not just _to_ a socket. But no, that case hasn't been
> > > > written yet."
> > > > http://yarchive.net/comp/linux/splice.html
> > > >
> > > > Larry McVoy's 1997 proposal for adding splice support to the
> > > > kernel can be read at
> > > > http://ftp.tux.org/pub/sites/ftp.bitmover.com/pub/splice.ps.gz
> > > >
> > > > Perhaps I should have opened this thread on lkml to determine if
> > > > splice from socket to file is still feasible..
> > >
> > > Right, the thing is, nfsd reads the rpc request from the socket into
> > > its own buffers before it parses it.
> > > If you want to move the data directly out of the network buffers
> > > into the page cache, then you have to know at what point the write
> > > data starts in the request--which I believe will mean doing the xdr
> > > parsing (and gss decryption if necessary) as the request comes in
> > > off the wire.
> > >
> > > That sounds like a lot of work and even if you have someone willing
> > > to do the work they'd also need to justify that it's worth it.
> > >
> > > RDMA may have some protocol support that simplifies this, I don't
> > > know.
> > >
> > > --b.
> >
> > Hi Bruce,
> >
> > > nfsd reads the rpc request from the socket into its own buffers
> > > before it parses it.
> >
> > I am not intimate with the gss code but do you think the
> > svc_rqst->rq_pages[] can be spliced ?
> >
>
> Probably not in its current form. The problem is one of alignment. You
> need to know where the write data actually starts before doing the
> receive off the socket, so you can make sure that it ends up in the
> correct spot in the pages you're going to splice in.
>
> There's also the problem of what to do about WRITE requests that
> contain data that isn't page aligned or that's shorter than a page...

Finally, there is the minor problem that the data that is actually
received by the socket may be encrypted, or may need to be checksummed
(krb5i) _before_ you can apply it to the file. That is not a
particularly good fit for splice().

Trond
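
As a rough illustration of the alignment point raised in the thread, the sketch below works out where a request would have to start in the first receive page so that the WRITE payload lands at the same in-page offset as its destination in the file. It is plain arithmetic, not nfsd code: the function names, the 4096-byte page size, and the 160-byte header length are all assumptions made for the example. In practice the header length varies per request and is only known after XDR parsing, which is the difficulty described above.

```python
PAGE_SIZE = 4096  # assumed page size; real systems may differ


def receive_offset_for_alignment(header_len, file_offset, page_size=PAGE_SIZE):
    """Offset in the first receive page at which the request must start
    so that the payload (which follows header_len bytes of RPC/NFS
    header) sits at the same in-page offset as file_offset.

    header_len is hypothetical here; the receiver only learns it by
    parsing the request.
    """
    target = file_offset % page_size
    return (target - header_len) % page_size


def covers_whole_pages(file_offset, length, page_size=PAGE_SIZE):
    """True if the write touches only complete pages, i.e. the case in
    which receive pages could in principle be handed over unmodified."""
    return (
        length > 0
        and file_offset % page_size == 0
        and length % page_size == 0
    )


if __name__ == "__main__":
    # Hypothetical 160-byte header, write destined for file offset 0.
    print(receive_offset_for_alignment(160, 0))
    # Short or unaligned writes do not cover whole pages.
    print(covers_whole_pages(0, 1000), covers_whole_pages(8192, 4096))
```

Writes that are shorter than a page or not page aligned fail the second check, which corresponds to the case noted above where it is unclear what a splice-style path should do.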