Re: hybrid polling on an nvme doesn't seem to work with iodepth > 1 on 5.10.0-rc5

On 11/12/2020 01:19, Andres Freund wrote:
> On 2020-12-10 23:15:15 +0000, Pavel Begunkov wrote:
>> On 10/12/2020 23:12, Pavel Begunkov wrote:
>>> On 10/12/2020 20:51, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> When using hybrid polling (i.e. echo 0 >
>>>> /sys/block/nvme1n1/queue/io_poll_delay) I see stalls with fio when using
>>>> an iodepth > 1. Sometimes fio hangs, other times the performance is
>>>> really poor. I reproduced this with SSDs from different vendors.
>>>
>>> Can you get poll stats from debugfs while running with hybrid?
>>> For both iodepth=1 and 32.
>>
>> Even better if, for 32, you could show how it changes over time, i.e. cat
>> it several times while it is running.
> 
> Should read all email before responding...
> 
> This is a loop of grepping for 4k writes (only type I am doing), with 1s
> interval. I started it before the fio run (after one with
> iodepth=1). Once the iodepth 32 run finished (--timeout 10, but took
> 42s), I started a --iodepth 1 run.

Thanks! Your mean grows to more than 30s, so it'll sleep for 15s for each
IO. Yep, the sleep time calculation is clearly broken for you.
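To make the arithmetic concrete, here is a minimal sketch of the "sleep for
half of the mean" rule applied to the means from the quoted stats. It assumes
the stat values are in nanoseconds and that the result is rounded up; the
function name is made up for illustration and is not kernel code.

```python
NSEC_PER_SEC = 1_000_000_000


def hybrid_sleep_ns(mean_ns: int) -> int:
    """Return half of the mean completion time, rounded up (assumed rule)."""
    return (mean_ns + 1) // 2


# Means copied from the quoted poll stats (assumed to be nanoseconds).
for mean_ns in (7402, 517838676, 7365701186, 30203322069):
    sleep_ns = hybrid_sleep_ns(mean_ns)
    print(
        f"mean={mean_ns / NSEC_PER_SEC:.6f}s "
        f"-> sleep={sleep_ns / NSEC_PER_SEC:.6f}s"
    )
```

With a healthy mean of about 7.4us the sleep is a few microseconds; once the
mean has grown past 30s, every IO sleeps for roughly 15s.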

In general, the current hybrid polling doesn't work well with high QD,
because the statistics it is based on are not very resilient to all sorts
of problems. It might be the problem I described long ago:

https://www.spinics.net/lists/linux-block/msg61479.html
https://lkml.org/lkml/2019/4/30/120


Are you interested in it just out of curiosity, or do you have a good
use case? Modern SSDs are so fast that even with QD1 the overhead of
sleeping becomes considerable, all the more so for higher QD.
If there is no one who really cares, then instead of adding
elaborate correction schemes, I'd rather put max(time, 10ms) and
that's it.
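
A rough sketch of what such a clamp could look like, reading "max(time, 10ms)"
as "a maximum of 10ms", i.e. an upper bound on the sleep time (which is min()
in code). That reading, the constant name and the function name are
assumptions for illustration, not an actual patch.

```python
MAX_SLEEP_NS = 10 * 1_000_000  # 10ms upper bound (assumed interpretation)


def capped_sleep_ns(mean_ns: int, cap_ns: int = MAX_SLEEP_NS) -> int:
    """Half of the mean completion time, but never more than cap_ns."""
    half_mean_ns = (mean_ns + 1) // 2
    return min(half_mean_ns, cap_ns)


# A sane mean is left alone; a runaway mean is clamped to the cap.
for mean_ns in (7402, 30203322069):
    print(f"mean={mean_ns}ns -> sleep={capped_sleep_ns(mean_ns)}ns")
```

This does not repair the statistics themselves; it only limits how long a
single bad estimate can stall an IO.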

> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=3002, mean=7402, min=6683, max=22498
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=517838676, min=517774856, max=517901274
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> 
> Shortly after this I started the iodepth=1 run:
> 
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=32, mean=30203322069, min=30203263000, max=30203381351
> write (4096 Bytes): samples=1, mean=2216868822, min=2216868822, max=2216868822
> write (4096 Bytes): samples=1, mean=2216868822, min=2216868822, max=2216868822
> write (4096 Bytes): samples=1, mean=2216851683, min=2216851683, max=2216851683
> write (4096 Bytes): samples=1, mean=1108526485, min=1108526485, max=1108526485
> write (4096 Bytes): samples=1, mean=1108522634, min=1108522634, max=1108522634
> write (4096 Bytes): samples=1, mean=277274275, min=277274275, max=277274275
> write (4096 Bytes): samples=19, mean=5787160, min=5496432, max=10087444
> write (4096 Bytes): samples=1185, mean=67915, min=66408, max=145100
> write (4096 Bytes): samples=1185, mean=67915, min=66408, max=145100
> write (4096 Bytes): samples=1185, mean=67915, min=66408, max=145100
> write (4096 Bytes): samples=1703, mean=50492, min=39200, max=13155316
> write (4096 Bytes): samples=9983, mean=7408, min=6648, max=29950
> write (4096 Bytes): samples=9980, mean=7395, min=6574, max=23454
> write (4096 Bytes): samples=10011, mean=7381, min=6620, max=25533
> write (4096 Bytes): samples=9381, mean=7936, min=7270, max=47315
> write (4096 Bytes): samples=9295, mean=7377, min=6665, max=23490
> write (4096 Bytes): samples=9987, mean=7415, min=6629, max=23352
> write (4096 Bytes): samples=9992, mean=7411, min=6651, max=23071
> write (4096 Bytes): samples=9404, mean=7941, min=7234, max=24193
> write (4096 Bytes): samples=9434, mean=7942, min=7240, max=62745
> write (4096 Bytes): samples=5370, mean=7935, min=7268, max=24116
> write (4096 Bytes): samples=5370, mean=7935, min=7268, max=24116
> write (4096 Bytes): samples=5370, mean=7935, min=7268, max=24116
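
For anyone who wants to watch for this condition in a similar log, here is a
small sketch that parses lines in the format shown above and flags implausible
means. The regular expression is derived only from the quoted output, the 1s
threshold is an arbitrary assumption, and the helper names are hypothetical.

```python
import re

# Matches lines such as:
#   write (4096 Bytes): samples=32, mean=517838676, min=..., max=...
POLL_STAT_RE = re.compile(
    r"(?P<op>read|write)\s*\((?P<size>\d+) Bytes\): "
    r"samples=(?P<samples>\d+), mean=(?P<mean>\d+), "
    r"min=(?P<min>\d+), max=(?P<max>\d+)"
)


def parse_poll_stat_line(line: str):
    """Return a dict of the fields in one stat line, or None if no match."""
    match = POLL_STAT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    stat = {"op": fields["op"]}
    for key in ("size", "samples", "mean", "min", "max"):
        stat[key] = int(fields[key])
    return stat


def looks_broken(stat: dict, limit_ns: int = 1_000_000_000) -> bool:
    """Flag a mean above limit_ns (values assumed to be nanoseconds)."""
    return stat["mean"] > limit_ns


line = "> write (4096 Bytes): samples=32, mean=7365701186, min=7365642813, max=7365756630"
stat = parse_poll_stat_line(line)
print(stat, looks_broken(stat))
```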

-- 
Pavel Begunkov


