在 2021/9/11 下午6:49, Pavel Begunkov 写道:
On 9/9/21 9:29 AM, Hao Xu wrote:
在 2021/9/9 下午3:01, Hao Xu 写道:
在 2021/9/8 下午8:03, Pavel Begunkov 写道:
On 9/8/21 12:21 PM, Hao Xu wrote:
在 2021/9/7 下午2:48, Hao Xu 写道:
在 2021/9/7 上午3:04, Pavel Begunkov 写道:
On 9/3/21 12:00 PM, Hao Xu wrote:
For operations like accept, multishot is a useful feature, since we can
reduce a number of accept sqe. Let's integrate it to fast poll, it may
be good for other operations in the future.
Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
---
fs/io_uring.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d6df60c4cdb9..dae7044e0c24 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5277,8 +5277,15 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
return;
}
- hash_del(&req->hash_node);
- io_poll_remove_double(req);
+ if (READ_ONCE(apoll->poll.canceled))
+ apoll->poll.events |= EPOLLONESHOT;
+ if (apoll->poll.events & EPOLLONESHOT) {
+ hash_del(&req->hash_node);
+ io_poll_remove_double(req);
+ } else {
+ add_wait_queue(apoll->poll.head, &apoll->poll.wait);
It looks like it does both io_req_task_submit() and adding back
to the wq, so io_issue_sqe() may be called in parallel with
io_async_task_func(). If so, there will be tons of all kind of
races.
IMHO, io_async_task_func() is called in original context one by
one(except PF_EXITING is set, it is also called in system-wq), so
shouldn't be parallel case there.
ping...
fwiw, the case we're talking about:
CPU0 | CPU1
io_async_task_func() |
-> add_wait_queue(); |
-> io_req_task_submit(); |
/* no tw run happened in between */
| io_async_task_func()
| --> io_req_task_submit()
We called io_req_task_submit() twice without running tw in-between,
both of the calls use the same req->io_task_work.node field in the
request for accounting, and so the second call will screw
tctx->task_list and not only by not considering that
req->io_task_work.node is already taken/enqueued.
io_req_task_work_add() {
wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
}
I guess you mean io_req_task_work_add() called by async_wake() two times:
io_async_task_func()
-> add_wait_queue()
async_wake()
->io_req_task_work_add()
this one mess up the running task_work list
since req->io_task_work.node is in use.
It seems the current poll_add + multishot logic has this issue too, I'll
give it a shot(simply clean req->io_task_work.node before running
req->io_task_work.func should work)
Similar issue for double wait entry since we didn't remove double entry
in interrupt handler:
Yep, sounds like that. Polling needs reworking, and not only
because of this one.
Currently I'm working on these type of poll issues, will send something
soon.
async_wake() --> io_req_task_work_add()
io_poll_double_wake()-->async_wake()-->io_req_task_work_add()
+ }
+
spin_unlock(&ctx->completion_lock);
if (!READ_ONCE(apoll->poll.canceled))
@@ -5366,7 +5373,7 @@ static int io_arm_poll_handler(struct io_kiocb *req)
struct io_ring_ctx *ctx = req->ctx;
struct async_poll *apoll;
struct io_poll_table ipt;
- __poll_t ret, mask = EPOLLONESHOT | POLLERR | POLLPRI;
+ __poll_t ret, mask = POLLERR | POLLPRI;
int rw;
if (!req->file || !file_can_poll(req->file))
@@ -5388,6 +5395,8 @@ static int io_arm_poll_handler(struct io_kiocb *req)
rw = WRITE;
mask |= POLLOUT | POLLWRNORM;
}
+ if (!(req->flags & REQ_F_APOLL_MULTISHOT))
+ mask |= EPOLLONESHOT;
/* if we can't nonblock try, then no point in arming a poll handler */
if (!io_file_supports_nowait(req, rw))