Re: [PATCH v3 08/13] io_uring: implement fixed buffers registration similar to fixed files

Because it does quiesce, fixed read/write can access buffers from
asynchronous contexts without synchronisation. That won't work
anymore, so:

1. either we save the buffer in advance, which would require an extra
req_async allocation for linked fixed rw (a sketch of this is below
the list),

2. or we synchronise whenever we go async. But then a request may look
up and do IO on two different buffers, and that's rotten,

3. or we mix the two -- stay lazy, but allocate if the request
actually has to do IO async.

3.5 also, "synchronise" there would mean taking uring_lock, which is
not welcome, but we can probably do rcu instead.
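
For (1), to be clear, the idea is to resolve the buffer on the
submission path and stash the result in the request's async data, so
async execution never touches ctx->user_bufs. A purely hypothetical
sketch, helper names made up:

/* hypothetical sketch of (1); io_rw_fixed_prep_async() is made up */
static int io_rw_fixed_prep_async(struct io_kiocb *req, int rw)
{
	struct io_async_rw *arw;
	struct iov_iter iter;
	ssize_t ret;

	/* submission path, ctx->uring_lock is held, so lookup is safe */
	ret = io_import_fixed(req, rw, &iter);
	if (ret < 0)
		return ret;

	if (io_alloc_async_data(req))
		return -ENOMEM;
	arw = req->async_data;
	/*
	 * Async execution reuses the saved iter instead of re-resolving
	 * buf_index; unregister still has to wait out in-flight IO.
	 */
	arw->iter = iter;
	return 0;
}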

Are you referring to a case where a fixed buffer request can be submitted from async context while those buffers are being unregistered, or something like that?

Let me think of a patch...

The most convenient API would be [1]: it selects a buffer during
submission, but allocates async data if the request needs to go async,
and for all linked requests.

[2] should be correct from the kernel perspective (no races), and it
also avoids doing IO on two different buffers, which is nasty (BTW,
[1] solves this problem naturally). However, a buffer might be
selected asynchronously, so the following can happen, and the user
would have to wait for request completion before removing a buffer:

1. register buf id=0
2. syscall io_uring_enter(submit=RW_FIXED,buf_id=0,IOSQE_ASYNC)
3. unregister buffers
4. the request may not find the buffer and fail.

Not very convenient, and it can actually add overhead on the userspace
side, maybe even some heavy synchronisation. In liburing terms the
ordering would look roughly like the sketch below.
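
/* hypothetical sketch of steps 1-4 above, error handling omitted */
#include <liburing.h>

static void buf_unregister_race(struct io_uring *ring, struct iovec *iov,
				void *buf, unsigned int len)
{
	struct io_uring_sqe *sqe;

	/* 1. register buf id=0 */
	io_uring_register_buffers(ring, iov, 1);

	/* 2. fixed read from buf_index=0, IOSQE_ASYNC forces async */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read_fixed(sqe, 0 /* fd */, buf, len, 0 /* off */, 0);
	sqe->flags |= IOSQE_ASYNC;
	io_uring_submit(ring);

	/* 3. unregister without waiting for the CQE... */
	io_uring_unregister_buffers(ring);

	/* 4. ...so the async request may not find the buffer and fail */
}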

The uring_lock in [2] is not nice, but I think I can replace it with
rcu, which can probably even help with sharing, but I need to try
implementing it to be sure. Roughly what I have in mind is below: an
untested sketch with made-up names; note the buffers themselves would
still need to be refcounted (or pinned some other way) to survive IO
that's already in flight.
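
/* untested sketch: rcu-published buffer table, all names made up */
struct io_buf_table {
	struct rcu_head		rcu;
	unsigned int		nr_bufs;
	struct io_mapped_ubuf	bufs[];
};

static ssize_t io_import_fixed_rcu(struct io_kiocb *req, int rw,
				   struct iov_iter *iter)
{
	struct io_buf_table *table;
	ssize_t ret = -EFAULT;

	rcu_read_lock();
	table = rcu_dereference(req->ctx->buf_table);
	if (table && req->buf_index < table->nr_bufs)
		/* lookup + import under rcu, no uring_lock needed */
		ret = __io_import_fixed(req, rw,
					&table->bufs[req->buf_index], iter);
	rcu_read_unlock();
	return ret;
}

/* called under uring_lock on the register/unregister path */
static void io_buf_table_replace(struct io_ring_ctx *ctx,
				 struct io_buf_table *new)
{
	struct io_buf_table *old;

	old = rcu_dereference_protected(ctx->buf_table,
					lockdep_is_held(&ctx->uring_lock));
	rcu_assign_pointer(ctx->buf_table, new);
	if (old)
		kfree_rcu(old, rcu);
}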

So which API to go with is an open question.
Neither of the diffs is tested.

[1]
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7e35283fc0b1..2171836a9ce3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -826,6 +826,7 @@ static const struct io_op_def io_op_defs[] = {
  		.needs_file		= 1,
  		.unbound_nonreg_file	= 1,
  		.pollin			= 1,
+		.needs_async_data	= 1,
  		.plug			= 1,
  		.async_size		= sizeof(struct io_async_rw),
  		.work_flags		= IO_WQ_WORK_BLKCG | IO_WQ_WORK_MM,
@@ -835,6 +836,7 @@ static const struct io_op_def io_op_defs[] = {
  		.hash_reg_file		= 1,
  		.unbound_nonreg_file	= 1,
  		.pollout		= 1,
+		.needs_async_data	= 1,
  		.plug			= 1,
  		.async_size		= sizeof(struct io_async_rw),
  		.work_flags		= IO_WQ_WORK_BLKCG | IO_WQ_WORK_FSIZE |



[2]
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7e35283fc0b1..31560b879fb3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3148,7 +3148,12 @@ static ssize_t io_import_iovec(int rw, struct io_kiocb *req,
  	opcode = req->opcode;
  	if (opcode == IORING_OP_READ_FIXED || opcode == IORING_OP_WRITE_FIXED) {
  		*iovec = NULL;
-		return io_import_fixed(req, rw, iter);
+
+		io_ring_submit_lock(req->ctx, needs_lock);
+		lockdep_assert_held(&req->ctx->uring_lock);
+		ret = io_import_fixed(req, rw, iter);
+		io_ring_submit_unlock(req->ctx, needs_lock);
+		return ret;
  	}
  
  	/* buffer index only valid with fixed read/write, or buffer select */
@@ -3638,7 +3643,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock,
  copy_iov:
  		/* some cases will consume bytes even on error returns */
  		iov_iter_revert(iter, io_size - iov_iter_count(iter));
-		ret = io_setup_async_rw(req, iovec, inline_vecs, iter, false);
+		ret = io_setup_async_rw(req, iovec, inline_vecs, iter, true);
  		if (!ret)
  			return -EAGAIN;
  	}



For my understanding, is [1] essentially about stashing the iovec for the fixed IO in an io_async_rw struct and referencing it in async context? I don't understand how this prevents unregistering the buffer (described by the iovec) while the IO takes place.
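
If I'm reading it right, something like this (io_async_rw abridged
from fs/io_uring.c, comment mine):

struct io_async_rw {
	struct iovec		fast_iov[UIO_FASTIOV];
	const struct iovec	*free_iovec;
	/* [1] would save the iter resolved at submission time here? */
	struct iov_iter		iter;
	size_t			bytes_done;
	struct wait_page_queue	wpq;
};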

Taking a step back, what is the cost of keeping the quiesce for buffer registration operations? Registration should not be a frequent operation, so even a heavy-handed quiesce should not be a big issue, should it?


