Re: IORING_OP_POLL_ADD slower than linux-aio IOCB_CMD_POLL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/19/22 9:21 AM, Jens Axboe wrote:
> On 4/19/22 6:31 AM, Jens Axboe wrote:
>> On 4/19/22 6:21 AM, Avi Kivity wrote:
>>> On 19/04/2022 15.04, Jens Axboe wrote:
>>>> On 4/19/22 5:57 AM, Avi Kivity wrote:
>>>>> On 19/04/2022 14.38, Jens Axboe wrote:
>>>>>> On 4/19/22 5:07 AM, Avi Kivity wrote:
>>>>>>> A simple webserver shows about 5% loss compared to linux-aio.
>>>>>>>
>>>>>>>
>>>>>>> I expect the loss is due to an optimization that io_uring lacks -
>>>>>>> inline completion vs workqueue completion:
>>>>>> I don't think that's it, io_uring never punts to a workqueue for
>>>>>> completions.
>>>>>
>>>>> I measured this:
>>>>>
>>>>>
>>>>>
>>>>>   Performance counter stats for 'system wide':
>>>>>
>>>>>           1,273,756 io_uring:io_uring_task_add
>>>>>
>>>>>        12.288597765 seconds time elapsed
>>>>>
>>>>> Which exactly matches with the number of requests sent. If that's the
>>>>> wrong counter to measure, I'm happy to try again with the correct
>>>>> counter.
>>>> io_uring_task_add() isn't a workqueue, it's task_work. So that is
>>>> expected.
> 
> Might actually be implicated. Not because it's a async worker, but
> because I think we might be losing some affinity in this case. Looking
> at traces, we're definitely bouncing between the poll completion side
> and then execution the completion.
> 
> Can you try this hack? It's against -git + for-5.19/io_uring. If you let
> me know what base you prefer, I can do a version against that. I see
> about a 3% win with io_uring with this, and was slower before against
> linux-aio as you saw as well.

Another thing to try - get rid of the IPI for TWA_SIGNAL, which I
believe may be the underlying cause of it.


diff --git a/fs/io-wq.c b/fs/io-wq.c
index 32aeb2c581c5..59987dd212d8 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -871,7 +871,7 @@ static bool io_wq_for_each_worker(struct io_wqe *wqe,
 
 static bool io_wq_worker_wake(struct io_worker *worker, void *data)
 {
-	set_notify_signal(worker->task);
+	set_notify_signal(worker->task, true);
 	wake_up_process(worker->task);
 	return false;
 }
@@ -991,7 +991,7 @@ static bool __io_wq_worker_cancel(struct io_worker *worker,
 {
 	if (work && match->fn(work, match->data)) {
 		work->flags |= IO_WQ_WORK_CANCEL;
-		set_notify_signal(worker->task);
+		set_notify_signal(worker->task, true);
 		return true;
 	}
 
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 3c8b34876744..ac1f14973e09 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -359,10 +359,10 @@ static inline void clear_notify_signal(void)
  * Called to break out of interruptible wait loops, and enter the
  * exit_to_user_mode_loop().
  */
-static inline void set_notify_signal(struct task_struct *task)
+static inline void set_notify_signal(struct task_struct *task, bool need_ipi)
 {
 	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_SIGNAL) &&
-	    !wake_up_state(task, TASK_INTERRUPTIBLE))
+	    !wake_up_state(task, TASK_INTERRUPTIBLE) && need_ipi)
 		kick_process(task);
 }
 
diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 5d03a2ad1066..bff53f539933 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -367,7 +367,7 @@ static void klp_send_signals(void)
 			 * Send fake signal to all non-kthread tasks which are
 			 * still not migrated.
 			 */
-			set_notify_signal(task);
+			set_notify_signal(task, true);
 		}
 	}
 	read_unlock(&tasklist_lock);
diff --git a/kernel/task_work.c b/kernel/task_work.c
index c59e1a49bc40..47d7024dc499 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -51,7 +51,7 @@ int task_work_add(struct task_struct *task, struct callback_head *work,
 		set_notify_resume(task);
 		break;
 	case TWA_SIGNAL:
-		set_notify_signal(task);
+		set_notify_signal(task, false);
 		break;
 	default:
 		WARN_ON_ONCE(1);

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux