On 1/4/23 1:45?PM, Pavel Begunkov wrote: > On 1/4/23 20:34, Jens Axboe wrote: >> On 1/4/23 1:28?PM, Pavel Begunkov wrote: >>> On 1/4/23 18:08, Jens Axboe wrote: >>>> On 1/2/23 8:04?PM, Pavel Begunkov wrote: >>>>> Don't use ->cq_wait for ring polling but add a separate wait queue for >>>>> it. We need it for following patches. >>>>> >>>>> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx> >>>>> --- >>>>> include/linux/io_uring_types.h | 1 + >>>>> io_uring/io_uring.c | 3 ++- >>>>> io_uring/io_uring.h | 9 +++++++++ >>>>> 3 files changed, 12 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h >>>>> index dcd8a563ab52..cbcd3aaddd9d 100644 >>>>> --- a/include/linux/io_uring_types.h >>>>> +++ b/include/linux/io_uring_types.h >>>>> @@ -286,6 +286,7 @@ struct io_ring_ctx { >>>>> unsigned cq_entries; >>>>> struct io_ev_fd __rcu *io_ev_fd; >>>>> struct wait_queue_head cq_wait; >>>>> + struct wait_queue_head poll_wq; >>>>> unsigned cq_extra; >>>>> } ____cacheline_aligned_in_smp; >>>>> >>>> >>>> Should we move poll_wq somewhere else, more out of the way? >>> >>> If we care about polling perf and cache collisions with >>> cq_wait, yeah we can. In any case it's a good idea to at >>> least move it after cq_extra. >>> >>>> Would need to gate the check a flag or something. >>> >>> Not sure I follow >> >> I guess I could've been a bit more verbose... If we consider poll on the >> io_uring rather uncommon, then moving the poll_wq outside of the hotter >> cq_wait cacheline(s) would make sense. Each wait_queue_head is more than >> a cacheline. > > Looks it's 24B, and wait_queue_entry is uncomfortable 40B. (also see followup email). Yes, it's only 24 bytes indeed. >> Then we could have a flag in a spot that's hot anyway >> whether to check it or not, eg in that same section as cq_wait. >> Looking at the layout right now, we're at 116 bytes for that section, or >> two cachelines with 12 bytes to spare. If we add poll_wq, then we'll be >> at 196 bytes, which is 4 bytes over the next cacheline. So it'd >> essentially double the size of that section. If we moved it outside of >> the aligned sections, then it'd pack better. > > Than it's not about hotness and caches but rather memory > consumption due to padding, which is still a good argument. Right, it's nice to not keep io_ring_ctx bigger than it needs to be. And if moved out-of-line, then it'd pack better and we would not "waste" another cacheline on adding this wait_queue_head for polling. -- Jens Axboe