RE: Polled io for Linux kernel 5.x

Thanks Keith. Reserving dedicated queues up front makes sense if it saves hardware interrupts. But why does io_uring then not need these queues? A stack trace I ran shows that, even without the special queues, I am still entering bio_poll. Is it the case that with pvsync2 I can only do polled io when poll_queues is set?
 
Does io_uring avoid the shared resources?



-----Original Message-----
From: Keith Busch <kbusch@xxxxxxxxxx> 
Sent: Thursday, December 19, 2019 12:52 PM
To: Ober, Frank <frank.ober@xxxxxxxxx>
Cc: linux-block@xxxxxxxxxxxxxxx; linux-nvme@xxxxxxxxxxxxxxxxxxx; Derrick, Jonathan <jonathan.derrick@xxxxxxxxx>; Rajendiran, Swetha <swetha.rajendiran@xxxxxxxxx>; Liang, Mark <mark.liang@xxxxxxxxx>
Subject: Re: Polled io for Linux kernel 5.x

On Thu, Dec 19, 2019 at 07:25:51PM +0000, Ober, Frank wrote:
> Hi block/nvme communities,
> On 4.x kernels we used to be able to do:
> # echo 1 > /sys/block/nvme0n1/queue/io_poll
> And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set. This is actually how we test the very best SSDs that depend on 3D XPoint media.
> 
> On 5.x kernels we see the following error trying to write the device settings:
> -bash: echo: write error: Invalid argument
> 
> We can reload the entire nvme module with nvme poll_queues, but this does not seem to be well explained or written up anywhere. Or, sorry, is it just "not found" on my side?
> 
> This is verifiable on 5.3, 5.4 kernels with fio 3.16 builds.
> 
> What is the background on what has changed? Jens wrote this note back in 2015, which did work in the 4.x kernel era.

The original polling implementation shared resources that generate interrupts. This prevents it from running as fast as it can, so dedicated polling queues are used now.

> How come we cannot have a device/controller level setup of polled io today in 5.x kernels? All that exists is module based.

Polled queues are a dedicated resource that we have to reserve up front. They're optional, so you don't need to use the hipri flag if you have a device you don't want polled. But we need to know how many queues to reserve before we've even discovered the controllers, so we don't have a good way to define it per-controller.
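
For anyone following the thread, here is a minimal sketch of how one might check whether poll queues were reserved when the nvme module was loaded (for example via `modprobe nvme poll_queues=N`). It only reads two sysfs files: the module parameter at /sys/module/nvme/parameters/poll_queues and the per-device io_poll attribute mentioned above. The helper names `read_int` and `polling_status` and the `sysfs_root` argument are hypothetical, added for illustration and testing; the exact semantics of io_poll may differ between kernel versions.

```python
from pathlib import Path


def read_int(path: Path):
    """Return the integer stored in a sysfs-style file, or None if it is missing or unparsable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def polling_status(device: str = "nvme0n1", sysfs_root: str = "/sys"):
    """Collect the poll-related settings for one NVMe block device.

    sysfs_root is a hypothetical knob so the function can be pointed
    at a fake directory tree; on a real system leave it as "/sys".
    """
    root = Path(sysfs_root)
    # Number of dedicated poll queues requested at module load time.
    poll_queues = read_int(root / "module" / "nvme" / "parameters" / "poll_queues")
    # Per-device polling attribute (the file the echo in this thread writes to).
    io_poll = read_int(root / "block" / device / "queue" / "io_poll")
    return {
        "poll_queues": poll_queues,
        "io_poll": io_poll,
        # True only when a non-zero number of poll queues was reserved.
        "polling_reserved": bool(poll_queues),
    }


if __name__ == "__main__":
    print(polling_status())
```

If `poll_queues` reads back as 0, no dedicated queues were reserved, which matches the situation where the write to io_poll fails with "Invalid argument".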



