On 10/24/24 8:26 AM, Pavel Begunkov wrote: > On 10/24/24 15:18, Jens Axboe wrote: >> On 10/23/24 8:38 PM, hexue wrote: >>> On 9/25/2024 12:12, Pavel Begunkov wrote: > ... >>> When the number of threads exceeds the number of CPU cores,the >>> database throughput does not increase significantly. However, >>> hybrid polling can releasing some CPU resources during the polling >>> process, so that part of the CPU time can be used for frequent >>> data processing and other operations, which speeds up the reading >>> process, thereby improving throughput and optimizaing database >>> performance.I tried different compression strategies and got >>> results similar to the above table.(~30% throughput improvement) >>> >>> As more database applications adapt to the io_uring engine, I think >>> the application of hybrid poll may have potential in some scenarios. >> >> Thanks for posting some numbers on that part, that's useful. I do >> think the feature is useful as well, but I still have some issues >> with the implementation. Below is an incremental patch on top of >> yours to resolve some of those, potentially. Issues: >> >> 1) The patch still reads a bit like a hack, in that it doesn't seem to >> care about following the current style. This reads a bit lazy/sloppy >> or unfinished. I've fixed that up. >> >> 2) Appropriate member and function naming. >> >> 3) Same as above, it doesn't care about proper placement of additions to >> structs. Again this is a bit lazy and wasteful, attention should be >> paid to where additions are placed to not needlessly bloat >> structures, or place members in cache unfortunate locations. For >> example, available_time is just placed at the end of io_ring_ctx, >> why? It's a submission side member, and there's room with other >> related members. Not only is the placement now where you'd want it to >> be, memory wise, it also doesn't add 8 bytes to io_uring_ctx. >> >> 4) Like the above, the io_kiocb bloat is, by far, the worst. Seems to me >> that we can share space with the polling hash_node. This obviously >> needs careful vetting, haven't done that yet. IOPOLL setups should >> not be using poll at all. This needs extra checking. The poll_state >> can go with cancel_seq_set, as there's a hole there any. And just >> like that, rather than add 24b to io_kiocb, it doesn't take any extra >> space at all. >> >> 5) HY_POLL is a terrible name. It's associated with IOPOLL, and so let's >> please use a name related to that. And require IOPOLL being set with >> HYBRID_IOPOLL, as it's really a variant of that. Makes it clear that >> HYBRID_IOPOLL is really just a mode of operation for IOPOLL, and it >> can't exist without that. >> >> Please take a look at this incremental and test it, and then post a v9 >> that looks a lot more finished. Caveat - I haven't tested this one at >> all. Thanks! >> >> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h >> index c79ee9fe86d4..6cf6a45835e5 100644 >> --- a/include/linux/io_uring_types.h >> +++ b/include/linux/io_uring_types.h >> @@ -238,6 +238,8 @@ struct io_ring_ctx { >> struct io_rings *rings; >> struct percpu_ref refs; >> + u64 poll_available_time; >> + >> clockid_t clockid; >> enum tk_offsets clock_offset; >> @@ -433,9 +435,6 @@ struct io_ring_ctx { >> struct page **sqe_pages; >> struct page **cq_wait_page; >> - >> - /* for io_uring hybrid poll*/ >> - u64 available_time; >> }; >> struct io_tw_state { >> @@ -647,9 +646,22 @@ struct io_kiocb { >> atomic_t refs; >> bool cancel_seq_set; >> + bool poll_state; > > As mentioned briefly before, that can be just a req->flags flag That'd be even better, I generally despise random bool addition. >> struct io_task_work io_task_work; >> - /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */ >> - struct hlist_node hash_node; >> + union { >> + /* >> + * for polled requests, i.e. IORING_OP_POLL_ADD and async armed >> + * poll >> + */ >> + struct hlist_node hash_node; >> + /* >> + * For IOPOLL setup queues, with hybrid polling >> + */ >> + struct { >> + u64 iopoll_start; >> + u64 iopoll_end; > > And IIRC it doesn't need to store the end as it's used immediately > after it's set in the same function. Nice, that opens up the door for less esoteric sharing as well. And yeah, I'd just use: runtime = ktime_get_ns() - req->iopoll_start - sleep_time; in io_uring_hybrid_poll() and kill it entirely, doesn't even need a local variable there. And then shove iopoll_start into the union with comp_list/apoll_events. My main points are really: don't randomly sprinkle additions to structs. Think about if they are needed, and if they are, be a bit smarter about where to place them. The original patch did neither of those, and that's a non-starter. -- Jens Axboe