Re: [PATCH v2 10/11] block: Add support for the zone capacity concept

On 4/21/23 07:51, Bart Van Assche wrote:
> On 4/20/23 15:00, Damien Le Moal wrote:
>> On 4/21/23 02:12, Bart Van Assche wrote:
>>> On 4/20/23 02:23, Niklas Cassel wrote:
>>>> With your change above, we would start rejecting such devices.
>>>>
>>>> Is this reduction of supported NVMe ZNS SSD devices really desired?
>>>
>>> Hi Niklas,
>>>
>>> This is not my intention. A possible solution is to modify the NVMe
>>> driver and SCSI core such that the "zone is full" information is stored
>>> in struct request when a command completes. That will remove the need
>>> for the mq-deadline scheduler to know the zone capacity.
>>
>> I am not following... Why would the scheduler need to know the zone capacity ?
>>
>> If the user does stupid things like accessing sectors between zone capacity and
>> zone size or trying to write to a full zone, the commands will be failed by the
>> drive and I do not see why the scheduler needs to care about that.
> 
> Hi Damien,
> 
> Restricting the number of active zones in the I/O scheduler (patch 
> 11/11) requires some knowledge of the zone condition.

Why would you need to handle the max active zone number restriction in the
scheduler? That is the user's responsibility. btrfs does it (that is still
buggy, though; I am working on it).

> According to ZBC-2, for sequential write preferred zones the additional 
> sense code ZONE TRANSITION TO FULL must be reported if the zone 
> condition changes from not full into full. There is no such requirement 
> for sequential write required zones. Additionally, I'm not aware of any 
> reporting mechanism in the NVMe specification for changes in the zone 
> condition.

Sequential write preferred zones are a ZBC concept, and ZBC does not have the
notion of active zones. In general, for ZBC HDDs, ignoring the maximum number
of open zones is fine: there is no measurable performance impact unless the
user does fully random writes across the device, and in that case the user is
already asking for bad performance anyway.

I suspect you are thinking about all this in the context of UFS devices?
My point stands though: trying to manage the active zone limit at the
scheduler level is not a good idea, as there is no guarantee that the user
will eventually issue all the write commands needed to make zones full and
thus turn them inactive. Managing that is the user's responsibility, above
the block I/O scheduler.
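The bookkeeping a user (a filesystem or an application) would need is small. A
minimal sketch of that accounting, with entirely made-up names (none of these
are kernel or library APIs): activate a zone before its first write, and
release it once the zone is finished, reset, or becomes full, so the device's
active zone limit is never exceeded.

```c
/* Illustrative user-level active zone accounting; the struct and
 * function names are hypothetical, not real kernel symbols. */
#include <stdbool.h>

struct zone_accounting {
	unsigned int max_active;	/* device limit, e.g. max_active_zones */
	unsigned int nr_active;		/* zones currently active */
};

/* Try to account for a zone before issuing its first write.
 * Returns false if the limit would be exceeded; the caller must
 * first finish or reset another zone. */
static bool zone_activate(struct zone_accounting *acct)
{
	if (acct->nr_active >= acct->max_active)
		return false;
	acct->nr_active++;
	return true;
}

/* Called once a zone becomes full, or is finished or reset, which
 * transitions it back to inactive. */
static void zone_deactivate(struct zone_accounting *acct)
{
	if (acct->nr_active > 0)
		acct->nr_active--;
}
```

The point is that only the issuer of the writes knows when it will fill or
finish a zone, which is why this accounting belongs above the scheduler.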

> The overhead of submitting a REPORT ZONES command after every I/O 
> completion would be unacceptable.
> 
> Is there any other solution for tracking the zone condition other than 
> comparing the LBA at which a WRITE command finished with the zone capacity?

The sd driver already does some minimal tracking, which is used for zone
append emulation.
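That tracking essentially amounts to caching a per-zone write pointer: an
emulated zone append becomes a regular write at the cached write pointer,
which is advanced on completion, and comparing the advanced pointer against
the zone capacity is what reveals a transition to full. A rough sketch of the
idea, with illustrative names (this is not the actual sd.c code):

```c
/* Simplified per-zone write pointer tracking, loosely modeled on the
 * idea behind the sd driver's zone append emulation; all names here
 * are hypothetical. */
#include <stdbool.h>

typedef unsigned long long sector_t;

struct zone_info {
	sector_t start;		/* first sector of the zone */
	sector_t wp;		/* cached write pointer */
	sector_t capacity;	/* writable sectors (<= zone size) */
};

/* Pick the target sector for an emulated zone append.
 * Returns false if the write would exceed the zone capacity. */
static bool zone_append_target(const struct zone_info *z,
			       sector_t nr_sectors, sector_t *out)
{
	if (z->wp + nr_sectors > z->start + z->capacity)
		return false;
	*out = z->wp;
	return true;
}

/* On successful write completion, advance the cached write pointer.
 * Returns true if the zone just transitioned to full, i.e. the
 * completed LBA reached the zone capacity. */
static bool zone_write_complete(struct zone_info *z, sector_t nr_sectors)
{
	z->wp += nr_sectors;
	return z->wp >= z->start + z->capacity;
}
```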

> 
> Did I perhaps overlook something?
> 
> Thanks,
> 
> Bart.



