> -----Original Message-----
> From: Javier González <javier@xxxxxxxxxxx>
> Sent: Tuesday, 15 March 2022 14.53
> To: Christoph Hellwig <hch@xxxxxx>
> Cc: Matias Bjørling <Matias.Bjorling@xxxxxxx>; Damien Le Moal
> <damien.lemoal@xxxxxxxxxxxxxxxxxx>; Luis Chamberlain
> <mcgrof@xxxxxxxxxx>; Keith Busch <kbusch@xxxxxxxxxx>; Pankaj Raghav
> <p.raghav@xxxxxxxxxxx>; Adam Manzanares
> <a.manzanares@xxxxxxxxxxx>; jiangbo.365@xxxxxxxxxxxxx; kanchan Joshi
> <joshi.k@xxxxxxxxxxx>; Jens Axboe <axboe@xxxxxxxxx>; Sagi Grimberg
> <sagi@xxxxxxxxxxx>; Pankaj Raghav <pankydev8@xxxxxxxxx>; Kanchan Joshi
> <joshiiitr@xxxxxxxxx>; linux-block@xxxxxxxxxxxxxxx;
> linux-nvme@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices
>
> On 15.03.2022 14:30, Christoph Hellwig wrote:
> >On Tue, Mar 15, 2022 at 02:26:11PM +0100, Javier González wrote:
> >> but we do not see a usage for ZNS in F2FS, as it is a mobile
> >> file-system. As other interfaces arrive, this work will become natural.
> >>
> >> ZoneFS and btrfs are good targets for ZNS and these we can do. I
> >> would still do the work in phases to make sure we have enough early
> >> feedback from the community.
> >>
> >> Since this thread has been very active, I will wait some time for
> >> Christoph and others to catch up before we start sending code.
> >
> >Can someone summarize where we stand? Between the lack of quoting from
> >hell and overly long lines from corporate mail clients I've mostly
> >stopped reading this thread because it takes too much effort to actually
> >extract the information.
>
> Let me give it a try:
>
>   - PO2 emulation in NVMe is a no-go. Drop this.
>
>   - The arguments against supporting non-PO2 zone sizes are:
>     - It makes ZNS depart from an SMR assumption of PO2 zone sizes. This
>       can create confusion for users of both SMR and ZNS.
>
>     - Existing applications assume PO2 zone sizes, and probably do
>       optimizations for these. These applications, if wanting to use
>       ZNS, will have to change the calculations.
>
>     - There is a fear of performance regressions.
>
>     - It adds more work to you and other maintainers.
>
>   - The arguments in favour of non-PO2 zone sizes are:
>     - Unmapped LBAs create holes that applications need to deal with.
>       This affects mapping and performance due to splits. Bo explained
>       this in a thread from Bytedance's perspective. I explained in an
>       answer to Matias how we are not letting zones transition to
>       offline in order to simplify the host stack. Not sure if this is
>       something we want to bring to NVMe.
>
>     - As ZNS adds more features and other protocols add support for
>       zoned devices we will have more use-cases for the zoned block
>       device. We will have to deal with this fragmentation at some
>       point.
>
>     - This is used in production workloads on Linux hosts. I would
>       advocate for this not being off-tree as it will be a headache for
>       all in the future.
>
>   - If you agree that removing the PO2 constraint is an option, we can
>     do the following:
>     - Remove the constraint in the block layer and add ZoneFS support
>       in a first patch.
>
>     - Add btrfs support in a later patch.
>
>     - Make changes to tools once merged.
>
> Hope I have collected all points of view in such a short format.

+ Suggestion to enable all users in the kernel to limit fragmentation and
  maintainer burden.

+ Possibly not a big issue, as users have already added the necessary
  support and users already must manage offline zones and avoid writing
  across zones.

+ Re: Bo's email, it sounds like this only affects a single vendor which
  knowingly made the decision to do NPO2 zone sizes. From Bo: "(What we
  discussed here has a precondition that is, we cannot determine if the
  SSD provider could change the FW to make it PO2 or not)".
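
To make the point about applications having to "change the calculations"
concrete, here is a minimal sketch of the two ways a zone number and the
offset within a zone can be derived from a sector. The helper names are
hypothetical and made up for illustration; this is not kernel code and not
taken from the patch series. With a PO2 zone size the lookup is a shift and
a mask; without that guarantee it needs a division and a modulo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All helpers below are hypothetical, for illustration only. */

static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Integer log2 for a non-zero value (position of the highest set bit). */
static unsigned int ilog2_u64(uint64_t n)
{
	unsigned int shift = 0;

	while (n >>= 1)
		shift++;
	return shift;
}

/* PO2 zone size: the zone number is a shift. */
static uint64_t zone_no_po2(uint64_t sector, uint64_t zone_sectors)
{
	assert(is_power_of_2(zone_sectors));
	return sector >> ilog2_u64(zone_sectors);
}

/* PO2 zone size: the offset within the zone is a mask. */
static uint64_t zone_offset_po2(uint64_t sector, uint64_t zone_sectors)
{
	assert(is_power_of_2(zone_sectors));
	return sector & (zone_sectors - 1);
}

/* Any zone size: the zone number needs a division. */
static uint64_t zone_no_generic(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	return sector / zone_sectors;
}

/* Any zone size: the offset within the zone needs a modulo. */
static uint64_t zone_offset_generic(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	return sector % zone_sectors;
}
```

The generic variants give the same results as the PO2 variants when the
zone size happens to be a power of two, so code that switches to them stays
correct for existing devices; whether the extra division matters for
performance is the open question raised above.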