On 18/11/2020 10:35, dongjoo seo wrote:
> I agree with your opinion. And your patch is also a good approach.
> How about combining them? An adaptive solution with 3/4.

I couldn't disclose numbers back then, but thanks to the steeply skewed
latency distribution of NAND SSDs, it was automatically adjusting the
sleep time to ~3/4 of the mean for QD1 and long enough requests (~75+ us).
Also, with "max(sleep_ns, half_mean)" removed, it kept the sleep time
below 1/2 of the mean for fast requests (less than ~30 us), which is a
good thing because the 1/2 heuristic was constantly oversleeping them.
New ultra-low-latency SSDs have appeared since then, though.

The real problem is finding anyone who actually uses hybrid polling,
otherwise it's just a chunk of dead code. Do you? Anyone? I remember it
was once completely broken for months, and that was barely noticed.

> Because, if we get intensive workloads then we need to
> decrease the whole cpu utilization even with [1].
>
> [1] https://lkml.org/lkml/2019/4/30/117
>
>> On Nov 18, 2020, at 6:26 PM, Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
>>
>> On 18/11/2020 07:16, Damien Le Moal wrote:
>>> On 2020/11/18 16:07, Christoph Hellwig wrote:
>>>> Adding Damien who wrote this code.
>>>
>>> Nope. It wasn't me. I think it was Stephen Bates:
>>>
>>> commit 720b8ccc4500 ("blk-mq: Add a polling specific stats function")
>>>
>>> So +Stephen.
>>>>
>>>> On Wed, Nov 18, 2020 at 09:47:46AM +0900, Dongjoo Seo wrote:
>>>>> The current sleep time for hybrid polling is half of the mean
>>>>> completion time. The 'half' sleep time is good for reducing cpu
>>>>> utilization, but the problem is that cpu utilization is still high.
>>>>> This patch helps to reduce it further.
>>
>> This won't work well. When I was experimenting I saw that half of the
>> mean is actually too much for fast enough requests, like <20us 4K
>> writes; it oversleeps them.
>> Even more, I'm afraid of getting into a vicious cycle: oversleeping
>> increases the statistical mean, which increases the sleep time, which
>> in turn increases the mean again, and so on. That's what happened for
>> me when the scheme was too aggressive.
>>
>> I actually sent patches [1] for automatic dynamic sleep time
>> adjustment once, but nobody cared.
>>
>> [1] https://lkml.org/lkml/2019/4/30/117
>>
>>>>>
>>>>> Below, 1 and 2 are my test hardware setups.
>>>>>
>>>>> 1. Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz + Samsung 970 Pro 1TB
>>>>> 2. Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz + INTEL SSDPED1D480GA 480G
>>>>>
>>>>>    |  Classic Polling   |   Hybrid Polling   |    this Patch
>>>>> -----------------------------------------------------------------
>>>>>    | cpu util | IOPS(k) | cpu util | IOPS(k) | cpu util | IOPS(k)
>>>>> -----------------------------------------------------------------
>>>>> 1. |  99.96   |   491   |  56.98   |   467   |  35.98   |   442
>>>>> -----------------------------------------------------------------
>>>>> 2. |  99.94   |   582   |  56.3    |   582   |  35.28   |   582
>>>>>
>>>>> cpu util means the sum of sys and user utilization.
>>>>>
>>>>> I used 4k random reads for this test, because that is the worst
>>>>> case for I/O performance. Below is my fio setup.
>>>>>
>>>>> name=pollTest
>>>>> ioengine=pvsync2
>>>>> hipri
>>>>> direct=1
>>>>> size=100%
>>>>> randrepeat=0
>>>>> time_based
>>>>> ramp_time=0
>>>>> norandommap
>>>>> refill_buffers
>>>>> log_avg_msec=1000
>>>>> log_max_value=1
>>>>> group_reporting
>>>>> filename=/dev/nvme0n1
>>>>> [rd_rnd_qd_1_4k_1w]
>>>>> bs=4k
>>>>> iodepth=32
>>>>> numjobs=[num of cpus]
>>>>> rw=randread
>>>>> runtime=60
>>>>> write_bw_log=bw_rd_rnd_qd_1_4k_1w
>>>>> write_iops_log=iops_rd_rnd_qd_1_4k_1w
>>>>> write_lat_log=lat_rd_rnd_qd_1_4k_1w
>>>>>
>>>>> Thanks
>>>>>
>>>>> Signed-off-by: Dongjoo Seo <commisori28@xxxxxxxxx>
>>>>> ---
>>>>>  block/blk-mq.c | 3 +--
>>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>> index 1b25ec2fe9be..c3d578416899 100644
>>>>> --- a/block/blk-mq.c
>>>>> +++ b/block/blk-mq.c
>>>>> @@ -3749,8 +3749,7 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
>>>>>  		return ret;
>>>>>
>>>>>  	if (q->poll_stat[bucket].nr_samples)
>>>>> -		ret = (q->poll_stat[bucket].mean + 1) / 2;
>>>>> -
>>>>> +		ret = (q->poll_stat[bucket].mean + 1) * 3 / 4;
>>>>>  	return ret;
>>>>>  }
>>>>>
>>>>> --
>>>>> 2.17.1
>>>>>
>>>> ---end quoted text---
>>>
>>>
>>
>> --
>> Pavel Begunkov

--
Pavel Begunkov