On 18/11/2020 07:16, Damien Le Moal wrote:
> On 2020/11/18 16:07, Christoph Hellwig wrote:
>> Adding Damien who wrote this code.
>
> Nope. It wasn't me. I think it was Stephen Bates:
>
> commit 720b8ccc4500 ("blk-mq: Add a polling specific stats function")
>
> So +Stephen.
>>
>> On Wed, Nov 18, 2020 at 09:47:46AM +0900, Dongjoo Seo wrote:
>>> Current sleep time for hybrid polling is half of mean time.
>>> The 'half' sleep time is good for minimizing the cpu utilization.
>>> But, the problem is that its cpu utilization is still high.
>>> this patch can help to minimize the cpu utilization side.

This won't work well. When I was experimenting I saw that half the mean
is actually too much for fast enough requests, like <20us 4K writes:
it oversleeps them. Even more, I'm afraid of getting into a vicious cycle,
where oversleeping increases the statistical mean, which increases the
sleep time, which again increases the stat mean, and so on. That's what
happened for me when the scheme was too aggressive.

I actually once sent patches [1] for automatic dynamic sleep time
adjustment, but nobody cared.

[1] https://lkml.org/lkml/2019/4/30/117

>>>
>>> Below 1,2 is my test hardware sets.
>>>
>>> 1. Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz + Samsung 970 pro 1Tb
>>> 2. Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz + INTEL SSDPED1D480GA 480G
>>>
>>>     |  Classic Polling   |  Hybrid Polling  |    this Patch    |
>>> -----------------------------------------------------------------
>>>     | cpu util | IOPS(k) | cpu util | IOPS  | cpu util | IOPS  |
>>> -----------------------------------------------------------------
>>>  1. |  99.96   |   491   |  56.98   |  467  |  35.98   |  442  |
>>> -----------------------------------------------------------------
>>>  2. |  99.94   |   582   |  56.3    |  582  |  35.28   |  582  |
>>>
>>> cpu util means that sum of sys and user util.
>>>
>>> I used 4k rand read for this test.
>>> because that case is worst case of I/O performance side.
>>> below one is my fio setup.
>>>
>>> name=pollTest
>>> ioengine=pvsync2
>>> hipri
>>> direct=1
>>> size=100%
>>> randrepeat=0
>>> time_based
>>> ramp_time=0
>>> norandommap
>>> refill_buffers
>>> log_avg_msec=1000
>>> log_max_value=1
>>> group_reporting
>>> filename=/dev/nvme0n1
>>> [rd_rnd_qd_1_4k_1w]
>>> bs=4k
>>> iodepth=32
>>> numjobs=[num of cpus]
>>> rw=randread
>>> runtime=60
>>> write_bw_log=bw_rd_rnd_qd_1_4k_1w
>>> write_iops_log=iops_rd_rnd_qd_1_4k_1w
>>> write_lat_log=lat_rd_rnd_qd_1_4k_1w
>>>
>>> Thanks
>>>
>>> Signed-off-by: Dongjoo Seo <commisori28@xxxxxxxxx>
>>> ---
>>>  block/blk-mq.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 1b25ec2fe9be..c3d578416899 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -3749,8 +3749,7 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
>>>  		return ret;
>>>
>>>  	if (q->poll_stat[bucket].nr_samples)
>>> -		ret = (q->poll_stat[bucket].mean + 1) / 2;
>>> -
>>> +		ret = (q->poll_stat[bucket].mean + 1) * 3 / 4;
>>>  	return ret;
>>>  }
>>>
>>> --
>>> 2.17.1
>>>
>> ---end quoted text---
>>
>

--
Pavel Begunkov
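
To make the feedback loop concrete, here is a toy userspace sketch. It is
not kernel code and not a measurement. Only the (mean + 1) * num / den
expression comes from the quoted diff; everything else is an assumption
made for illustration: the helper names are made up, wake_ns is a
hypothetical fixed cost of waking from the sleep, and the "mean" is
simply the last observed latency rather than the per-bucket statistics
that blk-mq keeps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sleep time as computed in the quoted diff: (mean + 1) * num / den,
 * with num/den = 1/2 before the patch and 3/4 after it.
 */
static uint64_t poll_sleep_ns(uint64_t mean_ns, uint64_t num, uint64_t den)
{
	assert(den != 0);
	return (mean_ns + 1) * num / den;
}

/*
 * Toy model (assumption): the device finishes after dev_ns, but the
 * poller cannot see the completion before it is awake again, i.e. not
 * before sleep_ns + wake_ns. wake_ns is a hypothetical wakeup cost.
 */
static uint64_t observed_ns(uint64_t dev_ns, uint64_t sleep_ns, uint64_t wake_ns)
{
	uint64_t awake_ns = sleep_ns + wake_ns;

	return awake_ns > dev_ns ? awake_ns : dev_ns;
}

/*
 * Feed each observed latency back in as the next "mean" and return the
 * value reached after the given number of rounds.
 */
static uint64_t settle_mean_ns(uint64_t dev_ns, uint64_t wake_ns,
			       uint64_t num, uint64_t den, int rounds)
{
	uint64_t mean_ns = dev_ns;
	int i;

	for (i = 0; i < rounds; i++)
		mean_ns = observed_ns(dev_ns,
				      poll_sleep_ns(mean_ns, num, den),
				      wake_ns);
	return mean_ns;
}
```

With a 20000 ns request and an assumed 8000 ns wakeup cost,
settle_mean_ns(20000, 8000, 1, 2, 100) stays at 20000, while
settle_mean_ns(20000, 8000, 3, 4, 100) climbs to 32000: once
sleep + wakeup exceeds the real latency, the oversleep is counted into
the mean and pushes the next sleep up. In this model the inflation
levels off at a fixed point instead of running away, and the numbers
say nothing about real hardware.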