On 2/6/22 2:52 AM, Hao Xu wrote:
> wqe->lock is abused: it now protects acct->work_list, the hash stuff,
> nr_workers, wqe->free_list, and so on. Let's first get the work_list
> out of the wqe->lock mess by introducing a specific lock for the work
> list. This is the first step toward solving the huge contention
> between work insertion and work consumption.
>
> Benefits:
> - split locking for the bound and unbound work lists
> - reduced contention between work_list visits and (workers')
>   free_list visits
>
> For the hash stuff: since there won't be a work item with the same
> file in both the bound and unbound work lists, the two lists won't
> visit the same hash entry, so it works well to use the new lock to
> protect the hash stuff as well.
>
> Results:
> With max_unbound_worker = 4, tested with an echo server:
> nice -n -15 ./io_uring_echo_server -p 8081 -f -n 1000 -l 16
> (-n connections, -l workload)
>
> before this patch:
> Samples: 2M of event 'cycles:ppp', Event count (approx.): 1239982111074
> Overhead  Command          Shared Object     Symbol
>   28.59%  iou-wrk-10021    [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
>    8.89%  io_uring_echo_s  [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
>    6.20%  iou-wrk-10021    [kernel.vmlinux]  [k] _raw_spin_lock
>    2.45%  io_uring_echo_s  [kernel.vmlinux]  [k] io_prep_async_work
>    2.36%  iou-wrk-10021    [kernel.vmlinux]  [k] _raw_spin_lock_irqsave
>    2.29%  iou-wrk-10021    [kernel.vmlinux]  [k] io_worker_handle_work
>    1.29%  io_uring_echo_s  [kernel.vmlinux]  [k] io_wqe_enqueue
>    1.06%  iou-wrk-10021    [kernel.vmlinux]  [k] io_wqe_worker
>    1.06%  io_uring_echo_s  [kernel.vmlinux]  [k] _raw_spin_lock
>    1.03%  iou-wrk-10021    [kernel.vmlinux]  [k] __schedule
>    0.99%  iou-wrk-10021    [kernel.vmlinux]  [k] tcp_sendmsg_locked
>
> with this patch:
> Samples: 1M of event 'cycles:ppp', Event count (approx.): 708446691943
> Overhead  Command          Shared Object     Symbol
>   16.86%  iou-wrk-10893    [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
>    9.10%  iou-wrk-10893    [kernel.vmlinux]  [k] _raw_spin_lock
>    4.53%  io_uring_echo_s  [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
>    2.87%  iou-wrk-10893    [kernel.vmlinux]  [k] io_worker_handle_work
>    2.57%  iou-wrk-10893    [kernel.vmlinux]  [k] _raw_spin_lock_irqsave
>    2.56%  io_uring_echo_s  [kernel.vmlinux]  [k] io_prep_async_work
>    1.82%  io_uring_echo_s  [kernel.vmlinux]  [k] _raw_spin_lock
>    1.33%  iou-wrk-10893    [kernel.vmlinux]  [k] io_wqe_worker
>    1.26%  io_uring_echo_s  [kernel.vmlinux]  [k] try_to_wake_up
>
> Spin lock contention drops from 28.59% + 8.89% = 37.48% to
> 16.86% + 4.53% = 21.39%. TPS is similar, while CPU usage drops from
> almost 400% to 350%.

I think this looks like a good start to improving the io-wq locking. I
didn't spot anything immediately wrong with the series; my only worry
was worker->flags protection, but I _think_ that looks OK in terms of
the worker itself doing the manipulations.

Let's queue this up for 5.18 testing, thanks!

--
Jens Axboe