On 3/15/22 22:05, Javier González wrote:
>>> The main constraint for (1) PO2 is removed in the block layer, we
>>> have (2) Linux hosts stating that unmapped LBAs are a problem,
>>> and we have (3) HW supporting size=capacity.
>>>
>>> I would be happy to hear what else you would like to see for this
>>> to be of use to the kernel community.
>>
>> (Added numbers to your paragraph above)
>>
>> 1. The sysfs chunksize attribute was "misused" to also represent
>> zone size. What has changed is that RAID controllers can now use a
>> NPO2 chunk size. This wasn't meant to naturally extend to zones,
>> which, as shown in the currently posted patchset, is a lot more
>> work.
>
> True. But this was the main constraint for PO2. And as I said,
> users asked for it.
>
>> 2. Bo mentioned that the software already manages holes. It took a
>> bit of time to get right, but now it works. Thus, the software in
>> question is already capable of working with holes, and fixing this
>> would present itself as a minor optimization overall. I'm not
>> convinced the work to do this in the kernel is proportional to the
>> change it'll make to the applications.
>
> I will let Bo respond to this himself.
>
>> 3. I'm happy to hear that. However, I'd like to reiterate the
>> point that the PO2 requirement has been known for years. That
>> there's a drive doing NPO2 zones is great, but a decision was made
>> by the SSD implementors not to support the Linux kernel given its
>> current implementation.
>
> Zoned devices have been supported for years in SMR, and this is a
> strong argument. However, ZNS is still very new and customers have
> several requirements. I do not believe that an HDD stack should
> have such an impact on NVMe.
>
> Also, we will see new interfaces adding support for zoned devices
> in the future.
>
> We should think about the future and not the past.

Backward compatibility ? We must not break userspace...
>> All that said - if there are people willing to do the work and it
>> doesn't have a negative impact on performance, code quality,
>> maintenance complexity, etc., then there isn't anything saying
>> support can't be added - but it does seem like it's a lot of work
>> for little overall benefit to applications and the host users.
>
> Exactly.
>
> Patches in the block layer are trivial. This is running in
> production loads without issues. I have tried to highlight the
> benefits in previous messages and I believe you understand them.

The block layer is not the issue here. We all understand that one is
easy.

> Support for ZoneFS seems easy too. We have an early POC for btrfs
> and it seems it can be done. We sign up for these 2.

zonefs can trivially support non-power-of-2 zone sizes, but as zonefs
creates a discrete view of the device capacity with its one file per
zone interface, an application's accesses to a zone are forcibly
limited to that zone, as they should be. With zonefs, pow2 and
nonpow2 devices will show the *same* interface to the application.
Non-power-of-2 zone sizes then have absolutely no benefit at all.

> As for F2FS and dm-zoned, I do not think these are targets at the
> moment. If this is the path we follow, these will bail out at mkfs
> time.

And what makes you think that this is acceptable ? What guarantees do
you have that this will not be a problem for users out there ?

-- 
Damien Le Moal
Western Digital Research