Re: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices

On 14.03.2022 14:16, Matias Bjørling wrote:
>> Agreed. Supporting non-power of two sizes in the block layer is
>> fairly easy as shown by some of the patches seen in this series.
>> Supporting them properly in the whole ecosystem is not trivial and
>> will create a long-term burden.  We could do that, but we'd rather
>> have a really good reason for it, and right now I don't see that.

I think that Bo's use-case is an example of a major upstream Linux host
that is struggling with unmapped LBAs. Can we focus on this use-case and
the parts that we are missing to support Bytedance?

Any application that uses zoned storage devices would have to manage
unmapped LBAs due to the potential of zones being/becoming offline (no
reads/writes allowed). Eliminating the difference between zone cap and
zone size will not remove this requirement, and holes will continue to
exist. Furthermore, writing to LBAs across zones is not allowed by the
specification and must also be managed.

Given the above, applications have to be conscious of zones in general
and work within their boundaries. I don't understand how applications
can work without per-zone knowledge. An application has to know about
zones and their writable capacity. To decide where and how data is
written, it must manage writes across zones, specific offline zones, and
(currently) each zone's writable capacity. I.e., knowledge about zones
and holes is required for writing to zoned devices and isn't eliminated
by removing the PO2 zone size requirement.
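
To make the "holes" concrete, here is a minimal sketch of the arithmetic
a host has to do when zone capacity is smaller than zone size. The
struct and function names are hypothetical and only for illustration;
they are not taken from the kernel or from the patches in this series,
and the layout is assumed to be uniform (same size and capacity for
every zone).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical uniform zone layout, all values in LBAs. */
struct zone_geom {
	uint64_t zone_size; /* LBA distance between consecutive zone starts */
	uint64_t zone_cap;  /* writable LBAs per zone, zone_cap <= zone_size */
};

/* Index of the zone whose LBA range contains lba. */
static uint64_t zone_index(const struct zone_geom *g, uint64_t lba)
{
	assert(g->zone_size > 0);
	return lba / g->zone_size;
}

/* True if lba lies in the hole between zone capacity and zone size. */
static bool lba_is_unmapped(const struct zone_geom *g, uint64_t lba)
{
	assert(g->zone_size > 0 && g->zone_cap <= g->zone_size);
	return (lba % g->zone_size) >= g->zone_cap;
}

/*
 * Number of LBAs that can be written starting at lba before reaching
 * the end of the zone's writable capacity; 0 if lba is in the hole.
 */
static uint64_t writable_until_cap(const struct zone_geom *g, uint64_t lba)
{
	uint64_t off;

	assert(g->zone_size > 0 && g->zone_cap <= g->zone_size);
	off = lba % g->zone_size;
	return off < g->zone_cap ? g->zone_cap - off : 0;
}
```

With zone_size = 128 and zone_cap = 96, LBAs 96..127 of every zone are
unmapped. With zone_size == zone_cap the hole check is always false, but
the host still has to stop each write at the zone boundary.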

Isn't supporting offline zones optional in the ZNS spec? We are not
considering supporting this in the host. It will be handled by the
device, precisely to keep the SW stack simpler.

For years, the PO2 requirement has been known in the Linux community and
by the ZNS SSD vendors. Some SSD implementors have chosen not to support
PO2 zone sizes, which is a perfectly valid decision. But those
implementors did so knowing that the Linux kernel didn't support it.

I want to turn the argument around to see it from the kernel developers'
point of view. They have communicated the PO2 requirement clearly,
there's good precedent for working with PO2 zone sizes, and lastly,
holes can't be avoided and are part of the overall design of zoned
storage devices. So why should the kernel developers take on the
long-term maintenance burden of NPO2 zone sizes?

You have a good point, and that is the question we need to help answer.
As I see it, requirements evolve and the kernel changes with them as
long as there are active upstream users.

The main constraint for PO2 is removed in the block layer, we have Linux
hosts stating that unmapped LBAs are a problem, and we have HW
supporting size=capacity.
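
As a rough illustration of what the PO2 constraint buys and what
removing it costs, the sketch below compares the two ways of turning a
sector into a zone number. It assumes the usual motivation for PO2 zone
sizes, namely that the lookup reduces to a shift; the helper names are
made up for this example and are not the block layer's actual functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the zone size (in sectors) is a power of two. */
static bool zone_size_is_po2(uint64_t zone_sectors)
{
	return zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0;
}

/* log2 of a power-of-two zone size, used as the shift below. */
static unsigned int zone_shift(uint64_t zone_sectors)
{
	unsigned int shift = 0;

	assert(zone_size_is_po2(zone_sectors));
	while (zone_sectors > 1) {
		zone_sectors >>= 1;
		shift++;
	}
	return shift;
}

/* PO2 zone size: the zone number is a single shift. */
static uint64_t zone_no_po2(uint64_t sector, unsigned int shift)
{
	return sector >> shift;
}

/* Any zone size: the zone number needs a 64-bit division. */
static uint64_t zone_no_any(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors > 0);
	return sector / zone_sectors;
}
```

Both helpers agree whenever the zone size is a power of two; only
zone_no_any() works for a size such as 393216 sectors.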

I would be happy to hear what else you would like to see for this to be
of use to the kernel community.



