Re: [io_uring] Problems using io_uring engine

On 5/26/20 2:21 PM, Hamilton Tobon Mosquera wrote:
> On 26/05/20 3:18 p. m., Jens Axboe wrote:
> 
>> On 5/26/20 7:57 AM, Hamilton Tobon Mosquera wrote:
>>> Thank you for your answer.
>>>
>>> This is how I'm making sure that it is polling. The workloads take 2
>>> minutes, and I'm checking the interrupts registered in /proc/interrupts
>>> for the nvme device (the Intel Optane) when the workload starts and when
>>> it ends. The interrupt count is almost zero, about 25 or so, while with
>>> an interrupt-based engine I get about 600K interrupts.
>>>
>>> Also, the way I'm loading the nvme driver is:
>>>
>>> modprobe nvme poll_queues=4
>>>
>>> As you said I'm using 4 polling queues because I only have 4 physical
>>> cores. To check that they were actually created I use:
>>>
>>> systool -vm nvme
>>>
>>> Which shows that effectively there are 4 polling queues created.
>>>
>>> I also checked the file /sys/block/nvme0n1/queue/io_poll and it is set
>>> to 1. Sometimes I change the file /sys/block/nvme0n1/queue/io_poll_delay
>>> to switch between hybrid and normal polling and it shows differences in
>>> the CPU usage, the latencies, IOPS, ...
>>>
>>> Another way is by checking the CPU usage, which shows that the CPU is
>>> almost fully occupied when polling.
>>>
>>> Also, I tried with dmesg as you suggested and this is the output:
>>>
>>> [627676.640431] nvme nvme0: 4/0/4 default/read/poll queues
>>>
>>> I guess that shows that I was effectively using polling in the
>>> workloads. What is weird is that when I don't use the HIPRI flag it runs
>>> fine, but uses interrupts rather than polling. It might be worth noting
>>> that I'm always running as root.
>>>
>>> Does this information give you more hints about the problem? Could you
>>> please tell me in which filesystems polling is known to work 100% of the
>>> time?
>>>
>>> Thank you for your help.
>> You did the right thing on the NVMe side, so I'm guessing it's
>> ext4 again. What kernel are you using? I think only 5.7 and newer
>> supports polling on ext4; you'll have better luck with XFS.
>>
>> And btw, please don't top-post. Reply with proper quoting; top
>> posting totally messes up the flow of the conversation.
>>
>> --
>> Jens Axboe
> 
> 
> Thank you for your answer.
> 
> Indeed, ext4 seems to be the problem. I tried with XFS and it works,
> which seems weird to me. Does this mean that pvsync2 wasn't polling at
> all?

There are basically two types of polling:

- sync polling, which is what was introduced with preadv2 and RWF_HIPRI
- async polling, which allows polling for a specific IO

The former just polls the device for _any_ completion, while the latter
can poll for a specific IO. The latter is what io_uring uses, as sync
polling doesn't really work that well for a single sync IO from a single
poll user.

To support async polling, the filesystem needs to support it. ext4 only
recently got that support; I added XFS support when I wrote the code
originally.

-- 
Jens Axboe



