On 10/24/24 9:40 AM, Pavel Begunkov wrote:
> On 10/24/24 16:27, Jens Axboe wrote:
>> On 10/24/24 9:22 AM, Pavel Begunkov wrote:
>>> On 10/23/24 17:07, Jens Axboe wrote:
>>>> The provided buffer helpers always map to iovecs. Add a new mode,
>>>> KBUF_MODE_BVEC, which instead maps it to a bio_vec array instead. For
>>>> use with zero-copy scenarios, where the caller would want to turn it
>>>> into a bio_vec anyway, and this avoids first iterating and filling out
>>>> and iovec array, only for the caller to then iterate it again and turn
>>>> it into a bio_vec array.
>>>>
>>>> Since it's now managing both iovecs and bvecs, change the naming of
>>>> buf_sel_arg->nr_iovs member to nr_vecs instead.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>> ---
>>>> io_uring/kbuf.c | 170 +++++++++++++++++++++++++++++++++++++++++++-----
>>>> io_uring/kbuf.h | 9 ++-
>>>> io_uring/net.c | 10 +--
>>>> 3 files changed, 165 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>>>> index 42579525c4bd..10a3a7a27e9a 100644
>>>> --- a/io_uring/kbuf.c
>>>> +++ b/io_uring/kbuf.c
>>> ...
>>>> +static struct io_mapped_ubuf *io_ubuf_from_buf(struct io_ring_ctx *ctx,
>>>> +					       u64 addr, unsigned int *offset)
>>>> +{
>>>> +	struct io_mapped_ubuf *imu;
>>>> +	u16 idx;
>>>> +
>>>> +	/*
>>>> +	 * Get registered buffer index and offset, encoded into the
>>>> +	 * addr base value.
>>>> +	 */
>>>> +	idx = addr & ((1ULL << IOU_BUF_REGBUF_BITS) - 1);
>>>> +	addr >>= IOU_BUF_REGBUF_BITS;
>>>> +	*offset = addr & ((1ULL << IOU_BUF_OFFSET_BITS) - 1);
>>>
>>> There are two ABI questions with that. First why not use just
>>> user addresses instead of offsets? It's more consistent with
>>> how everything else works. Surely it could've been offsets for
>>> all registered buffers ops from the beggining, but it's not.
>>
>> How would that work? You need to pass in addr + buffer index for that.
>
> I guess it depends on the second part then, that is if you
> want to preserve the layout, in which case you can just use
> sqe->buf_index

The whole point is to make provided AND registered buffers work together.
And you can't pass in a buffer group ID _and_ a registered buffer index
in the SQE. And for provided buffers, furthermore the point is that the
buffer itself holds information about where to transfer to/from. Once
you've added your buffer, you don't need to further track it, when it
gets picked it has all the information on where the transfer occurs.

-- 
Jens Axboe
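
For readers following the encoding debated above, here is a minimal userspace sketch of packing a registered buffer index and an offset into a single 64-bit addr value, with the decode mirroring the quoted io_ubuf_from_buf() hunk. The bit widths are assumptions chosen for illustration (the thread does not show the real values of IOU_BUF_REGBUF_BITS and IOU_BUF_OFFSET_BITS), and pack_addr()/unpack_addr() are hypothetical helper names, not part of the patch.

```c
#include <stdint.h>

/* ASSUMED widths, for illustration only; the patch's values are not shown. */
#define IOU_BUF_REGBUF_BITS	12
#define IOU_BUF_OFFSET_BITS	32

/*
 * Hypothetical helper: place the registered buffer index in the low
 * bits and the offset in the bits directly above it.
 */
static uint64_t pack_addr(uint16_t idx, uint32_t offset)
{
	uint64_t idx_mask = (1ULL << IOU_BUF_REGBUF_BITS) - 1;
	uint64_t off_mask = (1ULL << IOU_BUF_OFFSET_BITS) - 1;

	return (((uint64_t)offset & off_mask) << IOU_BUF_REGBUF_BITS) |
	       ((uint64_t)idx & idx_mask);
}

/*
 * Hypothetical helper: same mask-and-shift sequence as the quoted
 * hunk, recovering the index first and then the offset.
 */
static void unpack_addr(uint64_t addr, uint16_t *idx, uint32_t *offset)
{
	*idx = addr & ((1ULL << IOU_BUF_REGBUF_BITS) - 1);
	addr >>= IOU_BUF_REGBUF_BITS;
	*offset = addr & ((1ULL << IOU_BUF_OFFSET_BITS) - 1);
}
```

Under this layout the provided buffer entry alone identifies both the registered buffer and the position within it, which is the property the reply argues for: nothing beyond the buffer group ID has to go into the SQE.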