On 2024-02-23 06:31, Jens Axboe wrote: > On 2/22/24 10:40 PM, David Wei wrote: >> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h >> index bd7071aeec5d..57318fc01379 100644 >> --- a/include/linux/io_uring_types.h >> +++ b/include/linux/io_uring_types.h >> @@ -425,6 +425,9 @@ struct io_ring_ctx { >> DECLARE_HASHTABLE(napi_ht, 4); >> #endif >> >> + /* iowait accounting */ >> + bool iowait_enabled; >> + > > Since this is just a single bit, you should put it in the top section > where we have other single bits for hot / read-mostly data. This avoids > needing something many cache lines away for the hotter wait path, and it > avoids growing the struct as there's still plenty of space there for > this. Got it, moved it to the cacheline aligned hot/read-only struct. $current_year is when I learnt about C bitfields. > >> diff --git a/io_uring/register.c b/io_uring/register.c >> index 99c37775f974..7cbc08544c4c 100644 >> --- a/io_uring/register.c >> +++ b/io_uring/register.c >> @@ -387,6 +387,12 @@ static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx, >> return ret; >> } >> >> +static int io_register_iowait(struct io_ring_ctx *ctx) >> +{ >> + ctx->iowait_enabled = true; >> + return 0; >> +} >> + >> static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, >> void __user *arg, unsigned nr_args) >> __releases(ctx->uring_lock) >> @@ -563,6 +569,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, >> break; >> ret = io_unregister_napi(ctx, arg); >> break; >> + case IORING_REGISTER_IOWAIT: >> + ret = -EINVAL; >> + if (arg || nr_args) >> + break; >> + ret = io_register_iowait(ctx); >> + break; > > > This only allows you to set it, not clear it. I think we want to make it > pass in the value, and pass back the previous. Something ala: > > static int io_register_iowait(struct io_ring_ctx *ctx, int val) > { > int was_enabled = ctx->iowait_enabled; > > if (val) > ctx->iowait_enabled = 1; > else > ctx->iowait_enabled = 0; > return was_enabled; > } > > and then: > > case IORING_REGISTER_IOWAIT: > ret = -EINVAL; > if (arg) > break; > ret = io_register_iowait(ctx, nr_args); > break; > When I first thought about this I wondered how to pass a value like int val through arg which is a void*. That's a clever use of nr_args. > I'd also add a: > > Fixes: 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait") > > and mark it for stable, so we at least attempt to make it something that > can be depended on. Sorry, what does "mark it for stable" mean? >