On 05/02/2021 07:21, Hao Xu wrote: > 在 2021/2/4 下午7:00, Pavel Begunkov 写道: >> On 04/02/2021 09:31, Hao Xu wrote: >>> Hi all, >>> Sorry for disturb all of you. Here comes my question. >>> When we close a uring file, we go into io_uring_flush(), >>> there is codes at the end: >>> >>> if (!(ctx->flags & IORING_SETUP_SQPOLL) || ctx->sqo_task == current) >>> io_uring_del_task_file(file); >>> >>> My understanding, this is to delete the ctx(associated with the uring >>> file) from current->io_uring->xa. >>> I'm thinking of this scenario: the task to close uring file is not the >>> one which created the uring file. >>> Then it doesn't make sense to delete the uring file from current->io_uring->xa. It should be "delete uring file from >>> ctx->sqo_task->io_uring->xa" instead. >> >> 1. It's not only about created or not, look for >> io_uring_add_task_file() call sites. >> >> 2. io_uring->xa is basically a map from task to used by it urings. >> Every user task should clean only its own context (SQPOLL task is >> a bit different), it'll be hell bunch of races otherwise. >> >> 3. If happens that it's closed by a task that has nothing to do >> with this ctx, then it won't find anything in its >> task->io_uring->xa, and so won't delete anything, and that's ok. >> io_uring->xa of sqo_task will be cleaned by sqo_task, either >> on another close() or on exit() (see io_uring_files_cancel). >> >> 4. There is a bunch of cases where that scheme doesn't behave >> nice, but at least should not leak/fault when all related tasks >> are killed. >> > Thank you Pavel for the detail explanation. I got it, basically > just delay the clean work to sqo_task. > I have this question since I'm looking into the tctx->inflight, it puzzles me a little bit. When a task exit(), it finally calls > __io_uring_task_cancel(), where we wait until tctx->inflight is 0. > What does tctx->inflight actually mean? I thought it stands for all > the inflight reqs of ctxs of this task. But in tctx_inflight(): > > /* > * If we have SQPOLL rings, then we need to iterate and find them, and > * add the pending count for those. > */ > xa_for_each(&tctx->xa, index, file) { > struct io_ring_ctx *ctx = file->private_data; > > if (ctx->flags & IORING_SETUP_SQPOLL) { > struct io_uring_task *__tctx = ctx->sqo_task->io_uring; > > inflight += percpu_counter_sum(&__tctx->inflight); > } > } > > Why it adds ctx->sqo_task->io_uring->inflight. > In a scenario like this: > taskA->tctx: ctx0 ctx1 > sqpoll normal > > Since ctx0->sqo_task is taskA, so isn't taskA->io_uring->inflight calculated twice? > In another hand, count of requests submited by sqthread will be added to sqthread->io_uring, do we ommit this part?with that being said, should taskA wait for sqes/reqs created by taskA but handled by sqthread? Hah, yes it's known and tctx_inflight() always returns 0 in this case :) I'm patching it for 5.12 because it's rather bulky, and with some of recent 5.11 fixes for now the whole thing should do what we want in terms of cancellations. But good catch -- Pavel Begunkov