On 3/14/22 16:35, Christoph Hellwig wrote:
> On Sat, Mar 12, 2022 at 04:58:08PM +0900, Damien Le Moal wrote:
>> The reason for the power of 2 requirement is 2 fold:
>> 1) At the time we added zone support for SMR, chunk_sectors had to be a
>> power of 2 number of sectors.
>> 2) SMR users did request power of 2 zone sizes and that all zones have
>> the same size as that simplified software design. There was even a
>> de-facto agreement that 256MB zone size is a good compromise between
>> usability and overhead of zone reclaim/GC. But that particular number is
>> for HDD due to their performance characteristics.
>
> Also for NVMe we initially went down the road to try to support
> non power of two sizes. But there was another major early host that
> really wanted the power of two zone sizes to support hardware based
> hosts that can cheaply do shifts but not divisions. The variable
> zone capacity feature (something that Linux does not currently support)
> is a feature requested by NVMe members on the host and device side
> also can only be supported with the zone size / zone capacity split.
>
>> The other solution would be adding a dm-unhole target to remap sectors
>> to remove the holes from the device address space. Such target would be
>> easy to write, but in my opinion, this would still not change the fact
>> that applications still have to deal with error recovery and active/open
>> zone resources. So they still have to be zone aware and operate per zone.
>
> I don't think we even need a new target for it. I think you can do
> this with a table using multiple dm-linear sections already if you
> want.

Nope, this is currently not possible: DM requires the target zone size to
be the same as the underlying device zone size. So that would not work.

>
>> My answer to your last question ("Are we sure?") is thus: No. I am not
>> sure this is a good idea. But as always, I would be happy to be proven
>> wrong. So far, I have not seen any argument doing that.
>
> Agreed. Supporting non-power of two sizes in the block layer is fairly
> easy as shown by some of the patches seen in this series. Supporting
> them properly in the whole ecosystem is not trivial and will create a
> long-term burden. We could do that, but we'd rather have a really good
> reason for it, and right now I don't see that.

-- 
Damien Le Moal
Western Digital Research
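
For illustration only, the "shifts but not divisions" point and the zone
size / zone capacity split discussed above can be sketched in C roughly as
follows. This is a minimal sketch under stated assumptions, not kernel code:
every helper name here (is_pow2, zone_shift, zone_no_pow2, zone_off_pow2,
zone_no_any, zone_off_any, sector_in_hole) is hypothetical, and sectors are
assumed to be plain 64-bit sector numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not kernel APIs. */

/* True if n is a non-zero power of two. */
static int is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* log2 of a power-of-two zone size, i.e. the shift count to use. */
static unsigned int zone_shift(uint64_t zone_sectors)
{
	unsigned int shift = 0;

	assert(is_pow2(zone_sectors));
	while ((zone_sectors >> shift) != 1)
		shift++;
	return shift;
}

/* Power-of-two zone size: zone number is a single right shift. */
static uint64_t zone_no_pow2(uint64_t sector, unsigned int shift)
{
	return sector >> shift;
}

/* Power-of-two zone size: offset within the zone is a single mask. */
static uint64_t zone_off_pow2(uint64_t sector, unsigned int shift)
{
	return sector & ((UINT64_C(1) << shift) - 1);
}

/* Arbitrary zone size: same result, but requires a 64-bit division. */
static uint64_t zone_no_any(uint64_t sector, uint64_t zone_sectors)
{
	return sector / zone_sectors;
}

/* Arbitrary zone size: offset within the zone requires a 64-bit modulo. */
static uint64_t zone_off_any(uint64_t sector, uint64_t zone_sectors)
{
	return sector % zone_sectors;
}

/*
 * Zone size / zone capacity split: with a power-of-two zone size and a
 * smaller capacity, in-zone offsets in [capacity, size) are an
 * unaddressable hole in the device address space.
 */
static int sector_in_hole(uint64_t sector, unsigned int shift,
			  uint64_t zone_capacity)
{
	return zone_off_pow2(sector, shift) >= zone_capacity;
}
```

As an example, assuming 512-byte sectors, a 256MB zone is 524288 sectors,
so the shift is 19 and sector 1048577 falls in zone 2 at offset 1 with
either pair of helpers. The difference is only in the cost of the
operation on hosts where division is expensive.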