Re: [PATCH 6/7] io_uring/kbuf: add support for mapping type KBUF_MODE_BVEC

Jens Axboe <axboe@xxxxxxxxx> · Thu, 24 Oct 2024 09:49:07 -0600

On 10/24/24 9:40 AM, Pavel Begunkov wrote:
> On 10/24/24 16:27, Jens Axboe wrote:
>> On 10/24/24 9:22 AM, Pavel Begunkov wrote:
>>> On 10/23/24 17:07, Jens Axboe wrote:
>>>> The provided buffer helpers always map to iovecs. Add a new mode,
>>>> KBUF_MODE_BVEC, which maps it to a bio_vec array instead. For
>>>> use with zero-copy scenarios, where the caller would want to turn it
>>>> into a bio_vec anyway, and this avoids first iterating and filling out
>>>> an iovec array, only for the caller to then iterate it again and turn
>>>> it into a bio_vec array.
>>>>
>>>> Since it's now managing both iovecs and bvecs, change the naming of
>>>> buf_sel_arg->nr_iovs member to nr_vecs instead.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>> ---
>>>>    io_uring/kbuf.c | 170 +++++++++++++++++++++++++++++++++++++++++++-----
>>>>    io_uring/kbuf.h |   9 ++-
>>>>    io_uring/net.c  |  10 +--
>>>>    3 files changed, 165 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>>>> index 42579525c4bd..10a3a7a27e9a 100644
>>>> --- a/io_uring/kbuf.c
>>>> +++ b/io_uring/kbuf.c
>>> ...
>>>> +static struct io_mapped_ubuf *io_ubuf_from_buf(struct io_ring_ctx *ctx,
>>>> +                           u64 addr, unsigned int *offset)
>>>> +{
>>>> +    struct io_mapped_ubuf *imu;
>>>> +    u16 idx;
>>>> +
>>>> +    /*
>>>> +     * Get registered buffer index and offset, encoded into the
>>>> +     * addr base value.
>>>> +     */
>>>> +    idx = addr & ((1ULL << IOU_BUF_REGBUF_BITS) - 1);
>>>> +    addr >>= IOU_BUF_REGBUF_BITS;
>>>> +    *offset = addr & ((1ULL << IOU_BUF_OFFSET_BITS) - 1);
>>>
>>> There are two ABI questions with that. First, why not just use
>>> user addresses instead of offsets? It's more consistent with
>>> how everything else works. Surely it could've been offsets for
>>> all registered buffer ops from the beginning, but it's not.
>>
>> How would that work? You need to pass in addr + buffer index for that.
> 
> I guess it depends on the second part then, that is if you
> want to preserve the layout, in which case you can just use
> sqe->buf_index

The whole point is to make provided AND registered buffers work
together. And you can't pass in a buffer group ID _and_ a registered
buffer index in the SQE.

Furthermore, the point of provided buffers is that the buffer itself
holds the information about where to transfer to/from. Once you've
added your buffer, you don't need to track it any further: when it gets
picked, it has all the information on where the transfer occurs.
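
As a rough illustration of the buffer carrying its own target, here is
a minimal sketch of packing a registered buffer index and an offset into
the provided buffer's addr, mirroring the decode in the quoted
io_ubuf_from_buf() hunk. The EX_* bit widths and the ex_* helper names
are hypothetical placeholders for this example, not the values or names
from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths, for illustration only; the patch defines the real ones. */
#define EX_REGBUF_BITS	14
#define EX_OFFSET_BITS	32

/* Pack a registered buffer index (low bits) and an offset above it. */
static uint64_t ex_encode_addr(uint16_t idx, uint32_t offset)
{
	assert(idx < (1U << EX_REGBUF_BITS));
	return ((uint64_t)offset << EX_REGBUF_BITS) | idx;
}

/* Same mask-and-shift sequence as the decode in the quoted hunk. */
static uint16_t ex_decode_addr(uint64_t addr, uint32_t *offset)
{
	uint16_t idx = addr & ((1ULL << EX_REGBUF_BITS) - 1);

	addr >>= EX_REGBUF_BITS;
	*offset = addr & ((1ULL << EX_OFFSET_BITS) - 1);
	return idx;
}
```

With a layout like this, the value stored in the provided buffer entry
is enough to recover both the registered buffer and the position within
it, so nothing extra has to be passed in the SQE.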

-- 
Jens Axboe