Re: [PATCH 0/4] implement vectored registered buffers for sendzc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/24/24 16:45, Jens Axboe wrote:
On 10/24/24 9:29 AM, Pavel Begunkov wrote:
On 10/23/24 14:52, Jens Axboe wrote:
On 10/22/24 8:38 PM, Pavel Begunkov wrote:
Allow registered buffers to be used with zerocopy sendmsg, where the
passed iovec becomes a scatter list into the registered buffer
specified by sqe->buf_index. See patches 3 and 4 for more details.

To get performance out of it, it'll need a bit more work on top for
optimising allocations and cleaning up send setups. We can also
implement it for non zerocopy variants and reads/writes in the future.

Tested by enabling it in test/send-zerocopy.c, which checks payloads,
and exercises lots of corner cases, especially around send sizes,
offsets and non aligned registered buffers.

Just for the edification of the list readers, Pavel and I discussed this
a bit last night. There's a decent amount of overlap with the send zc
provided + registered buffers work that I did last week, but haven't
posted yet. It's here;

https://git.kernel.dk/cgit/linux/log/?h=io_uring-sendzc-provided

in terms of needing and using both bvec and iovec in the array, and
having the suitable caching for the arrays rather than needing a full
alloc + free every time.

And as I mentioned, that can be well done in-place (in the same
array) as one option.

And that would be the way to do it, like I mentioned as well, that is
how my first iteration of the above did it too. But since this one just
needs to end up with an array of bvec, it was pointless for my series to
do the iovec import and only then turn it into bvecs.

So somewhat orthogonal, as the use cases aren't exactly the same. One
deals with iovecs out of necessity, the other one only previously did as
a matter of convenience to reuse existing iovec based helpers.

The send zc part can map into bvecs upfront and hence don't need the
iovec array storage at the same time, which this one does as the sendmsg
zc opcode needs to import an iovec. But perhaps there's a way to still
unify the storage and retain the caching, without needing to come up
with something new.

I looked through. The first problem is that your thing consuming
entries from the ring, with iovecs it'd need to be reading it
from the user one by one. Considering allocations in your helpers,
that would turn it into a bunch of copy_from_user, and it might
just easier and cleaner to copy the entire iovec.

I think for you case, incremental import would be the way to go. Eg
something ala:

if (!user_access_begin(user_iovec, nr_vecs * sizeof(*user_iovec)))
	return -EFAULT;

Is it even legal? I don't know how it's implemented specifically
but I assume there can be any kind of magic, e.g. intercepting
page faults. I'd definitely prefer to avoid anything but the simplest
handling like read from/write to memory, e.g. no allocations.


bv = kmsg->bvec;
for_each_iov {
	struct iovec iov;

	unsafe_get_user(iov.iov_base, &user_iovec->iov_base, foo);
	unsafe_get_user(iov.iov_len, &user_iovec->iov_len, foo);

	import_to_bvec(bv, &iov);

	user_iovec++;
	bv++;
}

if it can be done at prep time, because then there's no need to store
the iovec at all, it's already stable, just in bvecs. And this avoids
overlapping iovec/bvec memory, and it avoids doing two iterations for
import. Purely a poc, just tossing out ideas.

But I haven't looked too closely at your series yet. In any case,
whatever ends up working for you, will most likely be find for the
bundled zerocopy send (non-vectored) as well, and I can just put it on
top of that.

And you just made one towards delaying the imu resolution, which
is conceptually the right thing to do because of the mess with
links, just like it is with fixed files. That's why it need to
copy the iovec at the prep stage and resolve at the issue time.

Indeed, prep time is certainly the place to do it. And the above
incremental import should work fine then, as we won't care abot
user_iovec being stable being prep.

Seems like you're agreeing but then stating the opposite, there
is some confusion. I'm saying that IMHO the right API wise way
is resolving an imu at issue time, just like it's done for fixed
files, and what your recent series did for send zc.

--
Pavel Begunkov




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux