Re: [PATCH v3 01/18] block: introduce duration-limits priority class

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/25/23 07:43, Bart Van Assche wrote:
> On 1/24/23 13:29, Damien Le Moal wrote:
>> I/O priority at the device level does not exist with SAS and with SATA,
>> the ACS specifications mandates that NCQ I/O priority and CDL cannot be
>> used mixed together. So from the device point of view, I/O priority and
>> CDL are mutually exclusive. No issues.
>>
>> Now, if you are talking about the host level I/O priority scheduling done
>> by mq-deadline and bfq, the CDL priority class maps to the RT class. They
>> are the same, as they should. There is nothing more real-time than CDL in
>> my opinion :)
>>
>> Furthermore, if we do not reuse the I/O priority interface, we will have
>> to add another field to BIOs & requests to propagate the cdl index from
>> user space down to the device LLD, almost exactly in the manner of I/O
>> priorities, including all the controls with merging etc. That would be a
>> lot of overhead to achieve the possibility of prioritized CDL commands.
>>
>> CDL in of itself allows the user to define "prioritized" commands by
>> defining CDLs on the drive that are sorted in increasing time limit order,
>> i.e. with low CDL index numbers having low limits, and higher priority
>> within the class (as CDL index == prio level). With that, schedulers can
>> still do the right thing as they do now, with the additional benefit that
>> they can even be improved to base their scheduling decisions on a known
>> time limit for the command execution. But such optimization is not
>> implemented by this series.
> 
> Hi Damien,
> 
> What if a device that supports I/O priorities (e.g. an NVMe device that 
> supports configuring the SQ priority) and a device that supports command 
> duration limits (e.g. a SATA hard disk) are combined via the device 
> mapper into a single block device? Should I/O be submitted to the dm 
> device with one of the existing I/O priority classes (not supported by 
> SATA hard disks) or with I/O priority class IOPRIO_CLASS_DL (not 
> supported by NVMe devices)?

That is not a use case we considered. My gut feeling is that this is
something the target driver should handle when processing a user IO.
Note that I was not aware that Linux NVMe driver supported queue priorities...

> Shouldn't the ATA core translate the existing I/O priority levels into a 
> command duration limit instead of introducing a new I/O priority class 
> that is only supported by ATA devices?

There is only one priority class that ATA understands: RT (the level is
irrelevant and ignored). All RT class IOs are mapped to high priority NCQ
commands. All other classes map to normal priority (no priority bit set)
commands.

And sure, we could map the level of RT class IOs to a CDL index, as we do
for the CDL class, but what would be the point ? The user should use the
CDL class in that case.

Furthermore, there is one additional thing that we do not yet support but
will later: CDL descriptor 0 can be used to set a target time limit for
high priority NCQ commands. Without this new feature introduced with CDL,
the drive is free to schedule high priority NCQ commands as it wants, and
that is hard coded in FW. So you can endup with very aggressive scheduling
leading to significant overall IOPS drop and long tail latency for low
priority commands. See page 11 and 20 of this presentation for an example:

https://www.snia.org/sites/default/files/SDC/2021/pdfs/SNIA-SDC21-LeMoal-Be-On-Time-command-duration-limits-Feature-Support-in%20Linux.pdf

For a drive that supports both CDL and NCQ priority, with CDL feature
turned off, CDL descriptor 0 defines the time limit hint for high priority
NCQ commands. Again, CDL and NCQ high priority are mutually exclusive.

So for clarity, I really would prefer separating CDL and RT classes as we
did. We could integrate CDL support reusing the RT class + level for CDL
index, but I think this may be very confusing for users, especially
considering that the CLDs on a drive can be defined in any order the user
wants, resulting in indexes/levels that does do not have any particular
order, making it impossible for the host to correctly schedule commands.


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux