On 5/30/22 07:38, Hao Xu wrote:
On 5/30/22 06:50, Pavel Begunkov wrote:
On 5/29/22 19:40, Jens Axboe wrote:
On 5/29/22 12:07 PM, Hao Xu wrote:
On 5/30/22 00:25, Jens Axboe wrote:
On 5/29/22 10:20 AM, Hao Xu wrote:
From: Hao Xu <howeyxu@xxxxxxxxxxx>
From: Hao Xu <howeyxu@xxxxxxxxxxx>
Use per list lock for cancel_hash, this removes some completion lock
invocation and remove contension between different cancel_hash entries
Interesting, do you have any numbers on this?
Just Theoretically for now, I'll do some tests tomorrow. This is
actually RFC, forgot to change the subject.
Also, I'd make a hash bucket struct:
struct io_hash_bucket {
spinlock_t lock;
struct hlist_head list;
};
rather than two separate structs, that'll have nicer memory locality too
and should further improve it. Could be done as a prep patch with the
old locking in place, making the end patch doing the per-bucket lock
simpler as well.
Sure, if the test number make sense, I'll send v2. I'll test the
hlist_bl list as well(the comment of it says it is much slower than
normal spin_lock, but we may not care the efficiency of poll
cancellation very much?).
I don't think the bit spinlocks are going to be useful, we should
stick with a spinlock for this. They are indeed slower and generally not
used for that reason. For a use case where you need a ton of locks and
saving the 4 bytes for a spinlock would make sense (or maybe not
changing some struct?), maybe they have a purpose. But not for this.
We can put the cancel hashes under uring_lock and completely kill
the hash spinlocking (2 lock/unlock pairs per single-shot). The code
below won't even compile and missing cancellation bits, I'll pick it
up in a week.
Even better would be to have two hash tables, and auto-magically apply
the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held)
and apoll (need uring_lock after anyway).
@@ -7332,9 +7345,13 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
return 0;
}
- spin_lock(&ctx->completion_lock);
- io_poll_req_insert(req);
- spin_unlock(&ctx->completion_lock);
+ if (ctx->flags & IORING_MUTEX_HASH) {
Does IORING_MUTEX_HASH exclude IOSQE_ASYNC as well? though IOSQE_ASYNC
is uncommon but users may do that.
It should, io-wq -> arm_poll() doesn't hold uring_lock, good point
--
Pavel Begunkov