On 9/28/21 5:55 PM, Hao Xu wrote:
> On 2021/9/28 at 7:29 PM, Pavel Begunkov wrote:
[...]
>> It solves the problem of total starvation of non-prio requests, e.g.
>> when new completions come in as fast as you complete previous ones.
>> One downside is that prio requests arriving while we execute a
>> previous batch will be executed only after the previous batch of
>> non-prio requests. I don't think it's much of a problem, but it would
>> be interesting to see numbers.
> Actually this was one of my implementations. I split it into two lists
> explicitly, mostly for the convenience of batching the tw in the prior
> list in 8/8. I'll evaluate the overhead tomorrow.

I guess so, since it resembles v1 more, but without inverting the order
of the IRQ sublist.

>>
>>>           INIT_WQ_LIST(&tctx->task_list);
>>> +         INIT_WQ_LIST(&tctx->prior_task_list);
>>> +         tctx->nr = tctx->prior_nr = 0;
>>>           if (!node)
>>>                   tctx->task_running = false;
>>>           spin_unlock_irq(&tctx->task_lock);
>>> @@ -2166,7 +2174,7 @@ static void tctx_task_work(struct callback_head *cb)
>>>           ctx_flush_and_put(ctx, &locked);
>>>   }
>>> -static void io_req_task_work_add(struct io_kiocb *req)
>>> +static void io_req_task_work_add(struct io_kiocb *req, bool emergency)
>>
>> I think "priority" instead of "emergency" would be more accurate.
>>
>>>   {
>>>       struct task_struct *tsk = req->task;
>>>       struct io_uring_task *tctx = tsk->io_uring;
>>> @@ -2178,7 +2186,13 @@ static void io_req_task_work_add(struct io_kiocb *req)
>>>       WARN_ON_ONCE(!tctx);
>>>       spin_lock_irqsave(&tctx->task_lock, flags);
>>> -     wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
>>> +     if (emergency && tctx->prior_nr * MAX_EMERGENCY_TW_RATIO < tctx->nr) {
>>> +         wq_list_add_tail(&req->io_task_work.node, &tctx->prior_task_list);
>>> +         tctx->prior_nr++;
>>> +     } else {
>>> +         wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
>>> +     }
>>> +     tctx->nr++;
>>>       running = tctx->task_running;
>>>       if (!running)
>>>           tctx->task_running = true;

-- 
Pavel Begunkov
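
A minimal user-space C sketch of the ratio-capped two-list scheme in the
hunks above, for readers following along. Only the names task_list,
prior_task_list, nr, prior_nr, and MAX_EMERGENCY_TW_RATIO come from the
patch; the list type, the helper names, the run loop, and the ratio
value of 3 are illustrative assumptions, not the kernel implementation.

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's singly linked wq_list. */
struct node {
        int id;
        struct node *next;
};

struct list {
        struct node *first;
        struct node **last;     /* tail pointer for O(1) append */
};

static void list_init(struct list *l)
{
        l->first = NULL;
        l->last = &l->first;
}

static void list_add_tail(struct list *l, struct node *n)
{
        n->next = NULL;
        *l->last = n;
        l->last = &n->next;
}

/* Value assumed for illustration; the quoted hunks don't show it. */
#define MAX_EMERGENCY_TW_RATIO  3

struct tctx {
        struct list task_list;
        struct list prior_task_list;
        unsigned int nr, prior_nr;      /* all queued / prio-list share */
};

/*
 * Mirrors the posted hunk: a priority request is admitted to the prior
 * list only while that list holds less than 1/MAX_EMERGENCY_TW_RATIO of
 * everything queued since the last flush, bounding how much work gets
 * priority treatment.  (As in the hunk, with both counters at zero the
 * check is false, so the very first request lands in the normal list.)
 */
static void task_work_add(struct tctx *t, struct node *n, bool priority)
{
        if (priority && t->prior_nr * MAX_EMERGENCY_TW_RATIO < t->nr) {
                list_add_tail(&t->prior_task_list, n);
                t->prior_nr++;
        } else {
                list_add_tail(&t->task_list, n);
        }
        t->nr++;
}

static void run_list(struct list *l, const char *tag)
{
        for (struct node *n = l->first; n; n = n->next)
                printf("%s: req %d\n", tag, n->id);
        list_init(l);
}

/* Flush the prio batch first, then the normal batch, resetting the
 * counters the way the first hunk does next to INIT_WQ_LIST(). */
static void task_work_run(struct tctx *t)
{
        run_list(&t->prior_task_list, "prio");
        run_list(&t->task_list, "norm");
        t->nr = t->prior_nr = 0;
}

int main(void)
{
        struct tctx t;
        struct node nodes[8];

        list_init(&t.task_list);
        list_init(&t.prior_task_list);
        t.nr = t.prior_nr = 0;

        /* Queue a mix: every even request pretends to be priority. */
        for (int i = 0; i < 8; i++) {
                nodes[i].id = i;
                task_work_add(&t, &nodes[i], i % 2 == 0);
        }
        task_work_run(&t);
        return 0;
}

With the assumed ratio of 3, only requests 2 and 4 reach the priority
list in this run; requests 0 and 6 are priority submissions that the cap
diverts to the normal list, illustrating how the ratio bounds the share
of work that gets priority treatment.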