Re: [RFC PATCH v2 30/48] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bernard Metzler <BMT@xxxxxxxxxxxxxx> wrote:

> > When transmitting data, call down into TCP using a single sendmsg with
> > MSG_SPLICE_PAGES to indicate that content should be spliced rather than
> > performing several sendmsg and sendpage calls to transmit header, data
> > pages and trailer.
> > 
> > To make this work, the data is assembled in a bio_vec array and attached to
> > a BVEC-type iterator.  The header and trailer (if present) are copied into
> > page fragments that can be freed with put_page().
> 
> I like it a lot if it still keeps zero copy sendpage() semantics for
> the cases the driver can make use of data transfers w/o copy. 
> Is 'msg.msg_flags |= MSG_SPLICE_PAGES' doing that magic?

Yes.  MSG_SPLICE_PAGES indicates that you want the socket to retain your
buffer and pass it directly to the device.  Note that it's just a hint,
however, pages that are unspliceable (eg. they belong to the slab) will get
copied into a page fragment instead.  Further, if the device cannot support a
vector, then the hint can be ignored and all the data can be copied as normal.

> 'splicing' suggest just merging pages to me.

'splicing' as in what the splice system call does.

Unfortunately, MSG_ZEROCOPY is already a (different) thing.

> It would simplify the transmit code path substantially, also getting
> rid of kmap_local_page()/kunmap_local() sequences for multi-fragment
> sendmsg()'s.

If the ITER_ITERLIST iterator is accepted, then siw would be able to do mix
KVEC and BVEC iterators, e.g. what I did for sunrpc here:

	https://lore.kernel.org/linux-fsdevel/20230329141354.516864-42-dhowells@xxxxxxxxxx/T/#u

This means that in siw_tx_hdt() where I made it copy data into page fragments
using page_frag_memdup() and attach that to a bvec:

	hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
	h = page_frag_memdup(NULL, hdr, hdr_len, GFP_NOFS, ULONG_MAX);
	if (!h)
		goto done;
	bvec_set_virt(&bvec[0], h, hdr_len);
	seg = 1;

it can just set up a kvec instead.

Unfortunately, it's not so easy to get rid of all of the kmap'ing as we need
to do some of it to do the hashing.

David





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux