在 2022/5/7 上午1:19, Pavel Begunkov 写道:
On 5/6/22 08:01, Hao Xu wrote:
From: Hao Xu <howeyxu@xxxxxxxxxxx>
For operations like accept, multishot is a useful feature, since we can
reduce a number of accept sqe. Let's integrate it to fast poll, it may
be good for other operations in the future.
Signed-off-by: Hao Xu <howeyxu@xxxxxxxxxxx>
---
fs/io_uring.c | 41 ++++++++++++++++++++++++++---------------
1 file changed, 26 insertions(+), 15 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8ebb1a794e36..d33777575faf 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5952,7 +5952,7 @@ static void io_poll_remove_entries(struct
io_kiocb *req)
* either spurious wakeup or multishot CQE is served. 0 when it's
done with
* the request, then the mask is stored in req->cqe.res.
*/
-static int io_poll_check_events(struct io_kiocb *req, bool locked)
+static int io_poll_check_events(struct io_kiocb *req, bool *locked)
{
struct io_ring_ctx *ctx = req->ctx;
int v;
@@ -5981,17 +5981,26 @@ static int io_poll_check_events(struct
io_kiocb *req, bool locked)
/* multishot, just fill an CQE and proceed */
if (req->cqe.res && !(req->apoll_events & EPOLLONESHOT)) {
- __poll_t mask = mangle_poll(req->cqe.res &
req->apoll_events);
- bool filled;
-
- spin_lock(&ctx->completion_lock);
- filled = io_fill_cqe_aux(ctx, req->cqe.user_data, mask,
- IORING_CQE_F_MORE);
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- if (unlikely(!filled))
- return -ECANCELED;
- io_cqring_ev_posted(ctx);
+ if (req->flags & REQ_F_APOLL_MULTISHOT) {
+ io_tw_lock(req->ctx, locked);
+ if (likely(!(req->task->flags & PF_EXITING)))
+ io_queue_sqe(req);
That looks dangerous, io_queue_sqe() usually takes the request ownership
and doesn't expect that someone, i.e. io_poll_check_events(), may still be
actively using it.
E.g. io_accept() fails on fd < 0, return an error,
io_queue_sqe() -> io_queue_async() -> io_req_complete_failed()
kills it. Then io_poll_check_events() and polling in general
carry on using the freed request => UAF. Didn't look at it
too carefully, but there might other similar cases.
I checked this when I did the coding, it seems the only case is
while (atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs));
uses req again after req recycled in io_queue_sqe() path like you
pointed out above, but this case should be ok since we haven't
reuse the struct req{} at that point.
In my first version, I skiped the do while{} in io_poll_check_events()
for multishot apoll and do the reap in io_req_task_submit()
static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
{
int ret;
ret = io_poll_check_events(req, locked);
if (ret > 0)
return;
__io_poll_clean(req);
if (!ret)
io_req_task_submit(req, locked); <------here
else
io_req_complete_failed(req, ret);
}
But the disadvantage is in high frequent workloads case, it may loop in
io_poll_check_events for long time, then finally generating cqes in the
above io_req_task_submit() which is not good in terms of latency.