David Howells wrote:
> If sendmsg() is passed MSG_SPLICE_PAGES and is given a buffer that contains
> some data that's resident in the slab, copy it rather than returning EIO.
> This can be made use of by a number of drivers in the kernel, including:
> iwarp, ceph/rds, dlm, nvme, ocfs2, drbd.  It could also be used by iscsi,
> rxrpc, sunrpc, cifs and probably others.
>
> skb_splice_from_iter() is given its own fragment allocator as
> page_frag_alloc_align() can't be used because it does no locking to prevent
> parallel callers from racing.  alloc_skb_frag() uses a separate folio for
> each cpu and locks to the cpu whilst allocating, reenabling cpu migration
> around folio allocation.
>
> This could instead allocate a whole page for each fragment to be copied, as
> alloc_skb_with_frags() would do, but that would waste a lot of
> space (most of the fragments look like they're going to be small).
>
> This allows an entire message that consists of, say, a protocol header or
> two, a number of pages of data and a protocol footer to be sent using a
> single call to sock_sendmsg().
>
> The callers could be made to copy the data into fragments before calling
> sendmsg(), but that then penalises them if MSG_SPLICE_PAGES gets ignored.
>
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Alexander Duyck <alexander.duyck@xxxxxxxxx>
> cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> cc: David Ahern <dsahern@xxxxxxxxxx>
> cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> cc: Jens Axboe <axboe@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: Menglong Dong <imagedong@xxxxxxxxxxx>
> cc: netdev@xxxxxxxxxxxxxxx
> ---
>
> Notes:
>     ver #2)
>      - Fix parameter to put_cpu_ptr() to have an '&'.
>
>  include/linux/skbuff.h |   5 ++
>  net/core/skbuff.c      | 171 ++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 173 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 91ed66952580..0ba776cd9be8 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -5037,6 +5037,11 @@ static inline void skb_mark_for_recycle(struct sk_buff *skb)
>  #endif
>  }
>
> +void *alloc_skb_frag(size_t fragsz, gfp_t gfp);
> +void *copy_skb_frag(const void *s, size_t len, gfp_t gfp);
> +ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter,
> +			     ssize_t maxsize, gfp_t gfp);
> +
>  ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter,
>  			     ssize_t maxsize, gfp_t gfp);
>

Duplicate declaration of skb_splice_from_iter() (no need to respin just for
this, imho).
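
For anyone following the thread, here is a minimal userspace sketch of the
idea the commit message describes: small fragments are carved out of a shared
page-sized chunk instead of each getting a whole page. It is not the code from
the patch. The names frag_pool, pool_alloc_frag, pool_copy_frag and
pool_destroy are hypothetical, malloc() stands in for folio allocation, and the
per-cpu folios and cpu locking that alloc_skb_frag() relies on are left out
entirely, so this is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chunk size standing in for one page/folio. */
#define FRAG_CHUNK_SIZE 4096u

/* One backing chunk; fragments are carved out of data[] front to back. */
struct frag_chunk {
	struct frag_chunk *next;	/* previously filled chunks */
	size_t used;			/* bytes already handed out */
	unsigned char data[FRAG_CHUNK_SIZE];
};

/* Hypothetical pool; the patch keeps one such state per cpu instead. */
struct frag_pool {
	struct frag_chunk *head;	/* chunk currently being carved */
};

/* Round n up to pointer alignment. */
static size_t frag_align_up(size_t n)
{
	return (n + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
}

/* Carve fragsz bytes from the current chunk, starting a new one if full. */
static void *pool_alloc_frag(struct frag_pool *pool, size_t fragsz)
{
	struct frag_chunk *c = pool->head;
	size_t start;

	if (fragsz == 0 || fragsz > FRAG_CHUNK_SIZE)
		return NULL;

	start = c ? frag_align_up(c->used) : 0;
	if (!c || start + fragsz > FRAG_CHUNK_SIZE) {
		c = malloc(sizeof(*c));
		if (!c)
			return NULL;
		c->next = pool->head;
		c->used = 0;
		pool->head = c;
		start = 0;
	}

	assert(start + fragsz <= FRAG_CHUNK_SIZE);
	c->used = start + fragsz;
	return c->data + start;
}

/* Copy len bytes into a freshly carved fragment (cf. copy_skb_frag()). */
static void *pool_copy_frag(struct frag_pool *pool, const void *src, size_t len)
{
	void *p = pool_alloc_frag(pool, len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/* Free every chunk; all fragments become invalid at this point. */
static void pool_destroy(struct frag_pool *pool)
{
	while (pool->head) {
		struct frag_chunk *c = pool->head;

		pool->head = c->next;
		free(c);
	}
}
```

A caller would copy, say, a protocol header and footer with two
pool_copy_frag() calls and both would land in the same chunk, which is the
space saving the commit message argues for. In the kernel the chunk lifetime
would be tied to page references held by the skb rather than to an explicit
pool_destroy().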