Re: [PATCH 7/8] io_uring: support readv/writev with fixed buffers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Support readv/writev with fixed buffers, and introduce IOSQE_FIXED_BUFFER,
consistent with fixed files.

I don't like it at all, see issues below. The actual implementation would
be much uglier.

I propose you to split the series and push separately. Your first 6 patches
first, I don't have conceptual objections to them. Then registration sharing
(I still need to look it up). And then we can return to this, if you're not
yet convinced.

Ok. The sharing patch is actually the highest priority for us so it'd be great to know if you think it's in the right direction.

Should I submit them as they are or address your fixed_file_ref comments in Patch 4/8 as well? Would I need your prep patch beforehand?

+static ssize_t io_import_iovec_fixed(int rw, struct io_kiocb *req, void *buf,
+				     unsigned segs, unsigned fast_segs,
+				     struct iovec **iovec,
+				     struct iov_iter *iter)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct io_mapped_ubuf *imu;
+	struct iovec *iov;
+	u16 index, buf_index;
+	ssize_t ret;
+	unsigned long seg;
+
+	if (unlikely(!ctx->buf_data))
+		return -EFAULT;
+
+	ret = import_iovec(rw, buf, segs, fast_segs, iovec, iter);

Did you test it? import_iovec() does access_ok() against each iov_base,
which in your case are an index.

I used liburing:test/file-{register,update} as models for the equivalent buffer tests and they seem to work. I can send out the tests and the liburing changes if you want.

The fixed io test registers an empty iov table first:

	ret = io_uring_register_buffers(ring, iovs, UIO_MAXIOV);

It next updates the table with two actual buffers at offset 768:

        ret = io_uring_register_buffers_update(ring, 768, ups, 2);

It later uses the buffer at index 768 for writev similar to the file-register test (IOSQE_FIXED_BUFFER instead of IOSQE_FIXED_FILE):

        iovs[768].iov_base = (void *)768;
        iovs[768].iov_len = pagesize;


        io_uring_prep_writev(sqe, fd, iovs, 1, 0);
        sqe->flags |= IOSQE_FIXED_BUFFER;

        ret = io_uring_submit(ring);

Below is the iovec returned from

io_import_iovec_fixed()
-> io_import_vec()

{iov_base = 0x300 <dm_early_create+412>, iov_len = 4096}

+	if (ret < 0)
+		return ret;
+
+	iov = (struct iovec *)iter->iov;
+
+	for (seg = 0; seg < iter->nr_segs; seg++) {
+		buf_index = *(u16 *)(&iov[seg].iov_base);

That's ugly, and also not consistent with rw_fixed, because iov_base is
used to calculate offset.

Would offset be applicable when using readv/writev?

My thinkig was that for those cases, each iovec should be used exactly as registered.


+		if (unlikely(buf_index < 0 || buf_index >= ctx->nr_user_bufs))
+			return -EFAULT;
+
+		index = array_index_nospec(buf_index, ctx->nr_user_bufs);
+		imu = io_buf_from_index(ctx, index);
+		if (!imu->ubuf || !imu->len)
+			return -EFAULT;
+		if (iov[seg].iov_len > imu->len)
+			return -EFAULT;
+
+		iov[seg].iov_base = (void *)imu->ubuf;

Nope, that's not different from non registered version.
What import_fixed actually do is setting up the iter argument to point
to a bvec (a vector of struct page *).

So in fact, the buffers end up being pinned again because they are not being as bvecs?


So it either would need to keep a vector of bvecs, that's a vector of vectors,
that's not supported by iter, etc., so you'll also need to iterate over them
in io_read/write and so on. Or flat 2D structure into 1D, but that's still ugly.

So you're saying there's no clean way to create a readv/writev + fixed buffers API? It would've been nice to have a consistent API between files and buffers.


@@ -5692,7 +5743,7 @@ static int io_timeout_remove_prep(struct io_kiocb *req,
  {
  	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
  		return -EINVAL;
-	if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+	if (unlikely(req->flags & (REQ_F_FIXED_RSRC | REQ_F_BUFFER_SELECT)))

Why it's here?

#define REQ_F_FIXED_RSRC	(REQ_F_FIXED_FILE | REQ_F_FIXED_BUFFER)
So, why do you | with REQ_F_BUFFER_SELECT again here?

I thought to group fixed files/buffers but distinguish them from selected buffers.

@@ -87,6 +88,8 @@ enum {
  #define IOSQE_ASYNC		(1U << IOSQE_ASYNC_BIT)
  /* select buffer from sqe->buf_group */
  #define IOSQE_BUFFER_SELECT	(1U << IOSQE_BUFFER_SELECT_BIT)
+/* use fixed buffer set */
+#define IOSQE_FIXED_BUFFER	(1U << IOSQE_FIXED_BUFFER_BIT)

Unfortenatuly, we're almost out of flags bits -- it's a 1 byte
field and 6 bits are already taken. Let's not use it.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux