Eric Dumazet <edumazet@xxxxxxxxxx> wrote:

> > Here's the first tranche of patches towards providing a MSG_SPLICE_PAGES
> > internal sendmsg flag that is intended to replace the ->sendpage() op with
> > calls to sendmsg().  MSG_SPLICE is a hint that tells the protocol that it
> > should splice the pages supplied if it can and copy them if not.
> >
>
> I find this patch series quite big/risky for 6.4

If you want me to hold this till after the merge window, that's fine.

> Can you spell out why we need "unspliceable pages support" ?
> This seems to add quite a lot of code in fast paths.

The patches to copy unspliceable pages (patches 6, 14 and 19) only really add
to the MSG_SPLICE_PAGES path - I don't know whether you count this as a fast
path or not.  (Or are you objecting to MSG_SPLICE_PAGES and getting rid of
sendpage in general?)

What I'm trying to do with this aspect is twofold:

Firstly, I'm trying to make it such that the layer above can send each message
in a single sendmsg() if possible.  This is possible with sunrpc and siw, for
example, but currently they make a whole bunch of separate calls into the
transport layer - typically at least three, for header, body and trailer.

Secondly, I'm trying to avoid a double copy.  The layer above TCP/UDP/etc.
(sunrpc[*], siw, etc.) needs to glue protocol bits on either end of the
message body and it may have this data in the slab or on the stack - which it
would then need to copy into a page fragment so that it can be zero-copied.
However, if the device can't handle this or we don't have sufficient frags,
the network layer may decide to copy it anyway - I'm not sure how the higher
layer can determine this.  It just seems there are fewer places where this
copying is required if it can be done in the network protocol.

Note that userspace cannot make use of this since it's not allowed to set
MSG_SPLICE_PAGES.
However, I have kept these bits separate and can discard them if it's
considered a bad idea and MSG_SPLICE_PAGES should instead, say, give an error
in such a case.

David

[*] sunrpc, at least, seems to store the header and trailer in zerocopyable
    pages, but has an additional bit on the front that's not.
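
To illustrate the "single sendmsg() per message" point, here is a minimal
userspace sketch that sends a header, body and trailer in one call by
describing them with an iovec array.  It is only an analogue: it does not use
MSG_SPLICE_PAGES (which userspace cannot set) and does not show the in-kernel
callers in sunrpc or siw.  The helper names send_framed() and
demo_roundtrip() are hypothetical, made up for this example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand header, body and trailer to the transport in
 * ONE sendmsg() call instead of three separate send calls.  A real caller
 * on a stream socket would also have to handle short writes.
 */
static ssize_t send_framed(int fd,
                           const void *hdr, size_t hdr_len,
                           const void *body, size_t body_len,
                           const void *trl, size_t trl_len)
{
    struct iovec iov[3] = {
        { .iov_base = (void *)hdr,  .iov_len = hdr_len  },
        { .iov_base = (void *)body, .iov_len = body_len },
        { .iov_base = (void *)trl,  .iov_len = trl_len  },
    };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = 3;

    return sendmsg(fd, &msg, 0);
}

/*
 * Hypothetical demo: push one framed message through an AF_UNIX stream
 * socketpair and read it back.  Returns the number of bytes received,
 * or -1 on error.
 */
static ssize_t demo_roundtrip(char *out, size_t out_len)
{
    int sv[2];
    size_t total = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (send_framed(sv[0], "HDR:", 4, "body", 4, ":TRL", 4) != 12) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);   /* the reader sees EOF once the queued data is drained */

    while (total < out_len) {
        ssize_t n = recv(sv[1], out + total, out_len - total, 0);
        if (n <= 0)
            break;
        total += (size_t)n;
    }
    close(sv[1]);

    return (ssize_t)total;
}
```

Calling demo_roundtrip() with a buffer of at least 12 bytes should leave the
three pieces concatenated in it, in the order they were listed in the iovec.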