Re: [PATCH 6/7] io_uring/kbuf: add support for mapping type KBUF_MODE_BVEC

On 10/24/24 16:27, Jens Axboe wrote:
On 10/24/24 9:22 AM, Pavel Begunkov wrote:
On 10/23/24 17:07, Jens Axboe wrote:
The provided buffer helpers always map to iovecs. Add a new mode,
KBUF_MODE_BVEC, which maps it to a bio_vec array instead. For
use with zero-copy scenarios, where the caller would want to turn it
into a bio_vec anyway, and this avoids first iterating and filling out
an iovec array, only for the caller to then iterate it again and turn
it into a bio_vec array.

Since it's now managing both iovecs and bvecs, change the naming of
buf_sel_arg->nr_iovs member to nr_vecs instead.

Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
   io_uring/kbuf.c | 170 +++++++++++++++++++++++++++++++++++++++++++-----
   io_uring/kbuf.h |   9 ++-
   io_uring/net.c  |  10 +--
   3 files changed, 165 insertions(+), 24 deletions(-)

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 42579525c4bd..10a3a7a27e9a 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
...
+static struct io_mapped_ubuf *io_ubuf_from_buf(struct io_ring_ctx *ctx,
+                           u64 addr, unsigned int *offset)
+{
+    struct io_mapped_ubuf *imu;
+    u16 idx;
+
+    /*
+     * Get registered buffer index and offset, encoded into the
+     * addr base value.
+     */
+    idx = addr & ((1ULL << IOU_BUF_REGBUF_BITS) - 1);
+    addr >>= IOU_BUF_REGBUF_BITS;
+    *offset = addr & ((1ULL << IOU_BUF_OFFSET_BITS) - 1);

There are two ABI questions with that. First, why not just use
user addresses instead of offsets? It's more consistent with
how everything else works. Surely it could've been offsets for
all registered buffer ops from the beginning, but it's not.

How would that work? You need to pass in addr + buffer index for that.

I guess it depends on the second part then, that is, whether you
want to preserve the layout, in which case you can just use
sqe->buf_index.

The usual approach is doing that, and then 'addr' tells you the offset
within the buffer, eg you can just do a subtraction to get your offset.
But you can't pass in both addr + index in a provided buffer, which is
why it's using buf->addr to encode index + offset for that, rather than
rely on the addr for the offset too.
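
For illustration, the index + offset encoding described above can be
sketched in userspace as below. This is only a sketch, not the patch's
code: the EX_* bit widths and the ex_* helper names are made up for the
example, since the real IOU_BUF_REGBUF_BITS and IOU_BUF_OFFSET_BITS
values are not shown in this thread. The decode side mirrors the
mask/shift sequence in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths, for illustration only; not taken from the patch. */
#define EX_REGBUF_BITS	16
#define EX_OFFSET_BITS	32

/* Pack a registered buffer index (low bits) and an offset (above it). */
static uint64_t ex_encode_addr(uint32_t idx, uint64_t offset)
{
	/* Both values must fit in their assumed fields. */
	assert(idx < (1ULL << EX_REGBUF_BITS));
	assert(offset < (1ULL << EX_OFFSET_BITS));

	return (offset << EX_REGBUF_BITS) | idx;
}

/* Unpack in the same order as the quoted hunk: index first, then offset. */
static void ex_decode_addr(uint64_t addr, uint16_t *idx, uint32_t *offset)
{
	*idx = addr & ((1ULL << EX_REGBUF_BITS) - 1);
	addr >>= EX_REGBUF_BITS;
	*offset = addr & ((1ULL << EX_OFFSET_BITS) - 1);
}
```

With these assumed widths, a provided buffer entry meant to point 4096
bytes into registered buffer 5 would carry ex_encode_addr(5, 4096) in
its addr field, and the decode step recovers both values without any
lookup by address.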

The alternative obviously is to just do the 'addr' and have that be both
index and offset, in which case you'd need to look up the buffer. And
that's certainly a no-go.

--
Pavel Begunkov



