On 6/21/22 00:18, Dylan Yudaken wrote:
Task work currently uses a spin lock to guard task_list and task_running. Some use cases such as networking can trigger task_work_add from multiple threads all at once, which suffers from contention here.

This can be changed to use a lockless list, which seems to have better performance. Running the micro benchmark in [1] I see a 20% improvement in multithreaded task work add. It required removing the priority tw list optimisation; however, it isn't clear how important that optimisation is. Additionally, it has fairly easy-to-break semantics.

Patches 1-2 remove the priority tw list optimisation
Patches 3-5 add lockless lists for task work
Patch 6 fixes a bug I noticed in io_uring event tracing
Patches 7-8 add tracing for task_work_run
Compared to the spinlock overhead, the prio task list optimization is definitely unimportant, so I agree with removing it here. Replacing the task list with an llist was something I considered, but I gave it up since it changes the list to a stack, which means we have to handle the tasks in reverse order. This may affect latency; do you have some numbers for it, like avg and 95%/99% lat?
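For context, a minimal consumer-side sketch of that concern, reusing the illustrative types from the sketch above: llist_del_all() detaches the whole pending batch in one atomic xchg but hands it back newest-first, so preserving the old FIFO behaviour would need an explicit llist_reverse_order() pass per flush.

/*
 * Consumer side: llist_del_all() detaches the entire pending batch
 * with one atomic xchg, but the nodes come back in LIFO order.  An
 * explicit llist_reverse_order() pass restores FIFO order, at O(n)
 * cost per flush rather than per add.
 */
static void tw_run(struct tw_ctx *ctx)
{
	struct llist_node *first = llist_del_all(&ctx->work_list);
	struct tw_node *item, *tmp;

	first = llist_reverse_order(first);
	llist_for_each_entry_safe(item, tmp, first, llist) {
		/* ... execute and free the work item ... */
	}
}

Whether the per-flush reversal (or running the batch in reverse order without it) is acceptable is exactly what the avg/95%/99% latency numbers asked about above would show.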