On 1/9/2015 9:23 PM, Chuck Lever wrote:
Most NFS RPCs place large payload arguments at the end of the RPC header (e.g., NFSv3 WRITE). For NFSv3 WRITE and SYMLINK, RPC/RDMA sends the complete RPC header inline, and the payload argument in a read list.

One important case is not like this, however. NFSv4 WRITE compounds can have an operation after the WRITE operation. The proper way to convey an NFSv4 WRITE is to place the GETATTR inline, but _after_ the read list position. (Note Linux clients currently do not do this, but they will be changed to do it in the future).

The receiver could put trailing inline content in the XDR tail buffer. But the Linux server's NFSv4 compound processing does not consider the XDR tail buffer.

So, move trailing inline content to the end of the page list. This presents the incoming compound to upper layers the same way the socket code does.
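For readers following along, here is a rough userspace sketch of the "move trailing inline content to the end of the page list" step described above. It is not the patch's code: struct sketch_buf, move_tail_to_pages(), PAGE_SZ and MAX_PAGES are hypothetical names invented for illustration, and the real kernel structures are laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ   4096u  /* assumed page size for this sketch */
#define MAX_PAGES 4u     /* assumed fixed page-list capacity */

/* Hypothetical stand-in for a receive buffer: a page list holding the
 * read-list payload, plus a small tail holding trailing inline content. */
struct sketch_buf {
	unsigned char pages[MAX_PAGES][PAGE_SZ];
	size_t page_len;          /* payload bytes already in the page list */
	unsigned char tail[256];
	size_t tail_len;          /* trailing inline bytes, e.g. a GETATTR */
};

/* Append the tail to the end of the page list so that upper layers see
 * one contiguous stream. Returns 0 on success, -1 if it does not fit. */
static int move_tail_to_pages(struct sketch_buf *b)
{
	const size_t cap = (size_t)MAX_PAGES * PAGE_SZ;
	const unsigned char *src;
	size_t off, left;

	assert(b != NULL);
	off = b->page_len;
	left = b->tail_len;
	src = b->tail;

	if (left > sizeof(b->tail) || off > cap || left > cap - off)
		return -1;

	/* Copy page by page: a real page list is not contiguous memory,
	 * so each copy must stop at a page boundary. */
	while (left > 0) {
		size_t pg = off / PAGE_SZ;
		size_t pg_off = off % PAGE_SZ;
		size_t n = PAGE_SZ - pg_off;

		if (n > left)
			n = left;
		memcpy(&b->pages[pg][pg_off], src, n);
		src += n;
		off += n;
		left -= n;
	}

	b->page_len = off;
	b->tail_len = 0;
	return 0;
}
```

The memcpy() in the loop is the copy cost that the reply below asks about: it is paid once per request that carries trailing inline content, and is proportional to the size of that content rather than to the WRITE payload.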
Would this memcpy be saved if you just posted a larger receive buffer and the client used it "really inline" as part of its post_send? I'm just trying to understand whether this complicated logic is worth the extra bytes of a larger recv buffer you are saving...

Will this code path happen a lot? If so, you might get some overhead you may want to avoid. I may not see the full picture here... Just thought I'd ask...

Sagi.