On 10/24/24 15:40, Jens Axboe wrote:
On 10/24/24 8:26 AM, Pavel Begunkov wrote:
On 10/24/24 15:18, Jens Axboe wrote:
On 10/23/24 8:38 PM, hexue wrote:
On 9/25/2024 12:12, Pavel Begunkov wrote:
...
When the number of threads exceeds the number of CPU cores,the
database throughput does not increase significantly. However,
hybrid polling can releasing some CPU resources during the polling
process, so that part of the CPU time can be used for frequent
data processing and other operations, which speeds up the reading
process, thereby improving throughput and optimizaing database
performance.I tried different compression strategies and got
results similar to the above table.(~30% throughput improvement)
As more database applications adapt to the io_uring engine, I think
the application of hybrid poll may have potential in some scenarios.
Thanks for posting some numbers on that part, that's useful. I do
think the feature is useful as well, but I still have some issues
with the implementation. Below is an incremental patch on top of
yours to resolve some of those, potentially. Issues:
1) The patch still reads a bit like a hack, in that it doesn't seem to
care about following the current style. This reads a bit lazy/sloppy
or unfinished. I've fixed that up.
2) Appropriate member and function naming.
3) Same as above, it doesn't care about proper placement of additions to
structs. Again this is a bit lazy and wasteful, attention should be
paid to where additions are placed to not needlessly bloat
structures, or place members in cache unfortunate locations. For
example, available_time is just placed at the end of io_ring_ctx,
why? It's a submission side member, and there's room with other
related members. Not only is the placement now where you'd want it to
be, memory wise, it also doesn't add 8 bytes to io_uring_ctx.
4) Like the above, the io_kiocb bloat is, by far, the worst. Seems to me
that we can share space with the polling hash_node. This obviously
needs careful vetting, haven't done that yet. IOPOLL setups should
not be using poll at all. This needs extra checking. The poll_state
can go with cancel_seq_set, as there's a hole there any. And just
like that, rather than add 24b to io_kiocb, it doesn't take any extra
space at all.
5) HY_POLL is a terrible name. It's associated with IOPOLL, and so let's
please use a name related to that. And require IOPOLL being set with
HYBRID_IOPOLL, as it's really a variant of that. Makes it clear that
HYBRID_IOPOLL is really just a mode of operation for IOPOLL, and it
can't exist without that.
Please take a look at this incremental and test it, and then post a v9
that looks a lot more finished. Caveat - I haven't tested this one at
all. Thanks!
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index c79ee9fe86d4..6cf6a45835e5 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -238,6 +238,8 @@ struct io_ring_ctx {
struct io_rings *rings;
struct percpu_ref refs;
+ u64 poll_available_time;
+
clockid_t clockid;
enum tk_offsets clock_offset;
@@ -433,9 +435,6 @@ struct io_ring_ctx {
struct page **sqe_pages;
struct page **cq_wait_page;
-
- /* for io_uring hybrid poll*/
- u64 available_time;
};
struct io_tw_state {
@@ -647,9 +646,22 @@ struct io_kiocb {
atomic_t refs;
bool cancel_seq_set;
+ bool poll_state;
As mentioned briefly before, that can be just a req->flags flag
That'd be even better, I generally despise random bool addition.
struct io_task_work io_task_work;
- /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
- struct hlist_node hash_node;
+ union {
+ /*
+ * for polled requests, i.e. IORING_OP_POLL_ADD and async armed
+ * poll
+ */
+ struct hlist_node hash_node;
+ /*
+ * For IOPOLL setup queues, with hybrid polling
+ */
+ struct {
+ u64 iopoll_start;
+ u64 iopoll_end;
And IIRC it doesn't need to store the end as it's used immediately
after it's set in the same function.
Nice, that opens up the door for less esoteric sharing as well. And
yeah, I'd just use:
runtime = ktime_get_ns() - req->iopoll_start - sleep_time;
in io_uring_hybrid_poll() and kill it entirely, doesn't even need a
local variable there. And then shove iopoll_start into the union with
comp_list/apoll_events.
That's with what the current request is hooked into the list,
IOW such aliasing will corrupt the request
--
Pavel Begunkov