Re: releasing result pages in svc_xprt_release()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Jan 29, 2021, at 5:43 PM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> On Fri, Jan 29 2021, Chuck Lever wrote:
> 
>> Hi Neil-
>> 
>> I'd like to reduce the amount of page allocation that NFSD does,
>> and was wondering about the release and reset of pages in
>> svc_xprt_release(). This logic was added when the socket transport
>> was converted to use kernel_sendpage() back in 2002. Do you
>> remember why releasing the result pages is necessary?
>> 
> 
> Hi Chuck,
> as I recall, kernel_sendpage() (or sock->ops->sendpage() as it was
> then) takes a reference to the page and will hold that reference until
> the content has been sent and ACKed.  nfsd has no way to know when the
> ACK comes, so cannot know when the page can be re-used, so it must
> release the page and allocate a new one.
> 
> This is the price we pay for zero-copy, and I acknowledge that it is a
> real price.  I wouldn't be surprised if the trade-offs between
> zero-copy and single-copy change over time, and between different
> hardware.

Very interesting, thanks for the history! Two observations:

- I thought without MSG_DONTWAIT, the sendpage operation would be
total synchronous -- when the network layer was done with retransmissions,
it would unblock the caller. But that's likely a mistaken assumption
on my part. That could be why sendmsg is so much slower than sendpage
in this particular application.

- IIUC, nfsd_splice_read() replaces anonymous pages in rq_pages with
actual page cache pages. Those of course cannot be used to construct
subsequent RPC Replies, so that introduces a second release requirement.

So I have a way to make the first case unnecessary for RPC/RDMA. It
has a reliable Send completion mechanism. Sounds like releasing is
still necessary for TCP, though; maybe that could be done in the
xpo_release_rqst callback.

As far as nfsd_splice_read(), I had thought of moving those pages to
a separate array which would always be released. That would need to
deal with the transport requirements above.

If nothing else, I would like to add mention of these requirements
somewhere in the code too.

What's your opinion?


--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux