On 1/4/23 1:34?PM, Jens Axboe wrote: > On 1/4/23 1:28?PM, Pavel Begunkov wrote: >> On 1/4/23 18:08, Jens Axboe wrote: >>> On 1/2/23 8:04?PM, Pavel Begunkov wrote: >>>> Don't use ->cq_wait for ring polling but add a separate wait queue for >>>> it. We need it for following patches. >>>> >>>> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx> >>>> --- >>>> include/linux/io_uring_types.h | 1 + >>>> io_uring/io_uring.c | 3 ++- >>>> io_uring/io_uring.h | 9 +++++++++ >>>> 3 files changed, 12 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h >>>> index dcd8a563ab52..cbcd3aaddd9d 100644 >>>> --- a/include/linux/io_uring_types.h >>>> +++ b/include/linux/io_uring_types.h >>>> @@ -286,6 +286,7 @@ struct io_ring_ctx { >>>> unsigned cq_entries; >>>> struct io_ev_fd __rcu *io_ev_fd; >>>> struct wait_queue_head cq_wait; >>>> + struct wait_queue_head poll_wq; >>>> unsigned cq_extra; >>>> } ____cacheline_aligned_in_smp; >>>> >>> >>> Should we move poll_wq somewhere else, more out of the way? >> >> If we care about polling perf and cache collisions with >> cq_wait, yeah we can. In any case it's a good idea to at >> least move it after cq_extra. >> >>> Would need to gate the check a flag or something. >> >> Not sure I follow > > I guess I could've been a bit more verbose... If we consider poll on the > io_uring rather uncommon, then moving the poll_wq outside of the hotter > cq_wait cacheline(s) would make sense. Each wait_queue_head is more than > a cacheline. Then we could have a flag in a spot that's hot anyway > whether to check it or not, eg in that same section as cq_wait. > > Looking at the layout right now, we're at 116 bytes for that section, or > two cachelines with 12 bytes to spare. If we add poll_wq, then we'll be > at 196 bytes, which is 4 bytes over the next cacheline. So it'd > essentially double the size of that section. If we moved it outside of > the aligned sections, then it'd pack better. Just after writing this, I noticed that a spinlock took 64 bytes... In other words, I have LOCKDEP enabled. The correct number is 24 bytes for wait_queue_head which is obviously a lot more reasonable. It'd still make that section one more cacheline since it's now at 60 bytes and would grow to 84 bytes. But it's obviously not as big of a problem as I had originally assumed. -- Jens Axboe