On 11/12/20 4:53 AM, Xiaoguang Wang wrote:
> When IORING_SETUP_SQPOLL is enabled, io_uring will always handle sqes
> in sq thread task context, so in io_iopoll_req_issued(), if we're not
> in io worker context, we don't need to check whether should wake up
> sq thread. io_iopoll_req_issued() calls wq_has_sleeper(), which has
> smp_mb() memory barrier, perf shows obvious overhead:
>   Samples: 481K of event 'cycles', Event count (approx.): 299807382878
>   Overhead  Comma  Shared Object     Symbol
>      3.69%  :9630  [kernel.vmlinux]  [k] io_issue_sqe
>
> With this patch, perf shows:
>   Samples: 482K of event 'cycles', Event count (approx.): 299929547283
>   Overhead  Comma  Shared Object     Symbol
>      0.70%  :4015  [kernel.vmlinux]  [k] io_issue_sqe
>
> It shows some obvious improvements.

Looks good to me, but:

> @@ -2761,7 +2761,7 @@ static void io_iopoll_req_issued(struct io_kiocb *req)
>  	else
>  		list_add_tail(&req->inflight_entry, &ctx->iopoll_list);
>
> -	if ((ctx->flags & IORING_SETUP_SQPOLL) &&
> +	if (in_async && (ctx->flags & IORING_SETUP_SQPOLL) &&
>  	    wq_has_sleeper(&ctx->sq_data->wait))
>  		wake_up(&ctx->sq_data->wait);
>  }

This really needs a comment as to why we don't have to check and wake
from this path.

-- 
Jens Axboe