Re: [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance

On 5/28/24 5:04 PM, Jens Axboe wrote:
> On 5/28/24 12:31 PM, Jens Axboe wrote:
>> I suspect a bug in the previous patches, because this is what the
>> forward port looks like. First, for reference, the current results:
> 
> Got it sorted, and pinned sender and receiver on CPUs to avoid the
> variation. It looks like this with the task_work approach that I sent
> out as v1:
> 
> Latencies for: Sender
>     percentiles (nsec):
>      |  1.0000th=[ 2160],  5.0000th=[ 2672], 10.0000th=[ 2768],
>      | 20.0000th=[ 3568], 30.0000th=[ 3568], 40.0000th=[ 3600],
>      | 50.0000th=[ 3600], 60.0000th=[ 3600], 70.0000th=[ 3632],
>      | 80.0000th=[ 3632], 90.0000th=[ 3664], 95.0000th=[ 3696],
>      | 99.0000th=[ 4832], 99.5000th=[15168], 99.9000th=[16192],
>      | 99.9500th=[16320], 99.9900th=[18304]
> Latencies for: Receiver
>     percentiles (nsec):
>      |  1.0000th=[ 1528],  5.0000th=[ 1576], 10.0000th=[ 1656],
>      | 20.0000th=[ 2040], 30.0000th=[ 2064], 40.0000th=[ 2064],
>      | 50.0000th=[ 2064], 60.0000th=[ 2064], 70.0000th=[ 2096],
>      | 80.0000th=[ 2096], 90.0000th=[ 2128], 95.0000th=[ 2160],
>      | 99.0000th=[ 3472], 99.5000th=[14784], 99.9000th=[15168],
>      | 99.9500th=[15424], 99.9900th=[17280]
> 
> and here's the exact same test run on the current patches:
> 
> Latencies for: Sender
>     percentiles (nsec):
>      |  1.0000th=[  362],  5.0000th=[  362], 10.0000th=[  370],
>      | 20.0000th=[  370], 30.0000th=[  370], 40.0000th=[  370],
>      | 50.0000th=[  374], 60.0000th=[  382], 70.0000th=[  382],
>      | 80.0000th=[  382], 90.0000th=[  382], 95.0000th=[  390],
>      | 99.0000th=[  402], 99.5000th=[  430], 99.9000th=[  900],
>      | 99.9500th=[  972], 99.9900th=[ 1432]
> Latencies for: Receiver
>     percentiles (nsec):
>      |  1.0000th=[ 1528],  5.0000th=[ 1544], 10.0000th=[ 1560],
>      | 20.0000th=[ 1576], 30.0000th=[ 1592], 40.0000th=[ 1592],
>      | 50.0000th=[ 1592], 60.0000th=[ 1608], 70.0000th=[ 1608],
>      | 80.0000th=[ 1640], 90.0000th=[ 1672], 95.0000th=[ 1688],
>      | 99.0000th=[ 1848], 99.5000th=[ 2128], 99.9000th=[14272],
>      | 99.9500th=[14784], 99.9900th=[73216]
> 
> I'll try and augment the test app to do proper rated submissions, so I
> can ramp up the rates a bit and see what happens.

And the final one, with the rated sends sorted out. One key observation
is that v1 trails the current edition; it just can't keep up as the rate
is increased. If we cap the rate at what should be 33K messages per
second, v1 gets ~28K messages per second and has the following latency
profile (for a 3 second run):

Latencies for: Receiver (msg=83863)
    percentiles (nsec):
     |  1.0000th=[  1208],  5.0000th=[  1336], 10.0000th=[  1400],
     | 20.0000th=[  1768], 30.0000th=[  1912], 40.0000th=[  1976],
     | 50.0000th=[  2040], 60.0000th=[  2160], 70.0000th=[  2256],
     | 80.0000th=[  2480], 90.0000th=[  2736], 95.0000th=[  3024],
     | 99.0000th=[  4080], 99.5000th=[  4896], 99.9000th=[  9664],
     | 99.9500th=[ 17024], 99.9900th=[218112]
Latencies for: Sender (msg=83863)
    percentiles (nsec):
     |  1.0000th=[  1928],  5.0000th=[  2064], 10.0000th=[  2160],
     | 20.0000th=[  2608], 30.0000th=[  2672], 40.0000th=[  2736],
     | 50.0000th=[  2864], 60.0000th=[  2960], 70.0000th=[  3152],
     | 80.0000th=[  3408], 90.0000th=[  4128], 95.0000th=[  4576],
     | 99.0000th=[  5920], 99.5000th=[  6752], 99.9000th=[ 13376],
     | 99.9500th=[ 22912], 99.9900th=[261120]

and the current edition does:

Latencies for: Sender (msg=94488)
    percentiles (nsec):
     |  1.0000th=[  181],  5.0000th=[  191], 10.0000th=[  201],
     | 20.0000th=[  215], 30.0000th=[  225], 40.0000th=[  235],
     | 50.0000th=[  262], 60.0000th=[  306], 70.0000th=[  430],
     | 80.0000th=[ 1004], 90.0000th=[ 2480], 95.0000th=[ 3632],
     | 99.0000th=[ 8096], 99.5000th=[12352], 99.9000th=[18048],
     | 99.9500th=[19584], 99.9900th=[23680]
Latencies for: Receiver (msg=94488)
    percentiles (nsec):
     |  1.0000th=[  342],  5.0000th=[  398], 10.0000th=[  482],
     | 20.0000th=[  652], 30.0000th=[  812], 40.0000th=[  972],
     | 50.0000th=[ 1240], 60.0000th=[ 1640], 70.0000th=[ 1944],
     | 80.0000th=[ 2448], 90.0000th=[ 3248], 95.0000th=[ 5216],
     | 99.0000th=[10304], 99.5000th=[12352], 99.9000th=[18048],
     | 99.9500th=[19840], 99.9900th=[23168]

If we cap it where v1 keeps up, at 13K messages per second, v1 does:

Latencies for: Receiver (msg=38820)
    percentiles (nsec):
     |  1.0000th=[ 1160],  5.0000th=[ 1256], 10.0000th=[ 1352],
     | 20.0000th=[ 1688], 30.0000th=[ 1928], 40.0000th=[ 1976],
     | 50.0000th=[ 2064], 60.0000th=[ 2384], 70.0000th=[ 2480],
     | 80.0000th=[ 2768], 90.0000th=[ 3280], 95.0000th=[ 3472],
     | 99.0000th=[ 4192], 99.5000th=[ 4512], 99.9000th=[ 6624],
     | 99.9500th=[ 8768], 99.9900th=[14272]
Latencies for: Sender (msg=38820)
    percentiles (nsec):
     |  1.0000th=[ 1848],  5.0000th=[ 1928], 10.0000th=[ 2040],
     | 20.0000th=[ 2608], 30.0000th=[ 2640], 40.0000th=[ 2736],
     | 50.0000th=[ 3024], 60.0000th=[ 3120], 70.0000th=[ 3376],
     | 80.0000th=[ 3824], 90.0000th=[ 4512], 95.0000th=[ 4768],
     | 99.0000th=[ 5536], 99.5000th=[ 6048], 99.9000th=[ 9024],
     | 99.9500th=[10304], 99.9900th=[23424]

and v2 does:

Latencies for: Sender (msg=39005)
    percentiles (nsec):
     |  1.0000th=[  191],  5.0000th=[  211], 10.0000th=[  262],
     | 20.0000th=[  342], 30.0000th=[  382], 40.0000th=[  402],
     | 50.0000th=[  450], 60.0000th=[  532], 70.0000th=[ 1080],
     | 80.0000th=[ 1848], 90.0000th=[ 4768], 95.0000th=[10944],
     | 99.0000th=[16512], 99.5000th=[18304], 99.9000th=[22400],
     | 99.9500th=[26496], 99.9900th=[41728]
Latencies for: Receiver (msg=39005)
    percentiles (nsec):
     |  1.0000th=[  410],  5.0000th=[  604], 10.0000th=[  700],
     | 20.0000th=[  900], 30.0000th=[ 1128], 40.0000th=[ 1320],
     | 50.0000th=[ 1672], 60.0000th=[ 2256], 70.0000th=[ 2736],
     | 80.0000th=[ 3760], 90.0000th=[ 5408], 95.0000th=[11072],
     | 99.0000th=[18304], 99.5000th=[20096], 99.9000th=[24704],
     | 99.9500th=[27520], 99.9900th=[35584]
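
For reference, the achieved rates follow directly from the msg= counts in
the reports above. This is only a rough sketch: it assumes every run lasted
3 seconds (stated for the 33K run, assumed here for the 13K run), and the
names RUN_SECONDS, runs and achieved_rate are made up for illustration,
they are not part of the test app.

```python
# Message counts copied from the "msg=" fields in the latency reports.
# Assumption: every run lasted 3 seconds (only stated for the 33K run).
RUN_SECONDS = 3

runs = {
    ("33K cap", "v1"): 83863,
    ("33K cap", "current"): 94488,
    ("13K cap", "v1"): 38820,
    ("13K cap", "v2"): 39005,
}


def achieved_rate(messages, seconds=RUN_SECONDS):
    """Return the average number of messages delivered per second."""
    return messages / seconds


if __name__ == "__main__":
    for (cap, version), count in runs.items():
        print(f"{cap:8s} {version:8s} {achieved_rate(count):8.0f} msg/sec")
```

Under that assumption v1 lands at roughly 28K msg/sec against the 33K
target, while it stays at roughly 13K msg/sec when capped at 13K.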

-- 
Jens Axboe




