RE: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices

> -----Original Message-----
> From: Javier González <javier@xxxxxxxxxxx>
> Sent: Tuesday, 15 March 2022 14.53
> To: Christoph Hellwig <hch@xxxxxx>
> Cc: Matias Bjørling <Matias.Bjorling@xxxxxxx>; Damien Le Moal
> <damien.lemoal@xxxxxxxxxxxxxxxxxx>; Luis Chamberlain
> <mcgrof@xxxxxxxxxx>; Keith Busch <kbusch@xxxxxxxxxx>; Pankaj Raghav
> <p.raghav@xxxxxxxxxxx>; Adam Manzanares
> <a.manzanares@xxxxxxxxxxx>; jiangbo.365@xxxxxxxxxxxxx; kanchan Joshi
> <joshi.k@xxxxxxxxxxx>; Jens Axboe <axboe@xxxxxxxxx>; Sagi Grimberg
> <sagi@xxxxxxxxxxx>; Pankaj Raghav <pankydev8@xxxxxxxxx>; Kanchan Joshi
> <joshiiitr@xxxxxxxxx>; linux-block@xxxxxxxxxxxxxxx; linux-
> nvme@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices
> 
> On 15.03.2022 14:30, Christoph Hellwig wrote:
> >On Tue, Mar 15, 2022 at 02:26:11PM +0100, Javier González wrote:
> >> but we do not see a usage for ZNS in F2FS, as it is a mobile
> >> file-system. As other interfaces arrive, this work will become natural.
> >>
> >> ZoneFS and btrfs are good targets for ZNS and these we can do. I
> >> would still do the work in phases to make sure we have enough early
> >> feedback from the community.
> >>
> >> Since this thread has been very active, I will wait some time for
> >> Christoph and others to catch up before we start sending code.
> >
> >Can someone summarize where we stand?  Between the lack of quoting from
> >hell and overly long lines from corporate mail clients I've mostly
> >stopped reading this thread because it takes too much effort to actually
> >extract the information.
> 
> Let me give it a try:
> 
>   - PO2 emulation in NVMe is a no-go. Drop this.
> 
>   - The arguments against supporting NPO2 (non-power-of-2) zone sizes are:
>       - It makes ZNS depart from the SMR assumption of PO2 zone sizes. This
>         can create confusion for users of both SMR and ZNS.
> 
>       - Existing applications assume PO2 zone sizes, and probably do
>         optimizations for these. To use ZNS, these applications would
>         have to change their calculations.
> 
>       - There is a fear of performance regressions.
> 
>       - It adds more work for you and other maintainers.
> 
>   - The arguments in favour of NPO2 zone sizes are:
>       - Unmapped LBAs create holes that applications need to deal with.
>         This affects mapping and performance due to splits. Bo explained
>         this in a thread from Bytedance's perspective.  I explained in an
>         answer to Matias how we are not letting zones transition to
>         offline in order to simplify the host stack. Not sure if this is
>         something we want to bring to NVMe.
> 
>       - As ZNS adds more features and other protocols add support for
>         zoned devices we will have more use-cases for the zoned block
>         device. We will have to deal with this fragmentation at some
>         point.
> 
>       - This is used in production workloads in Linux hosts. I would
>         advocate for this not being off-tree as it will be a headache for
>         all in the future.
> 
>   - If you agree that removing the PO2 constraint is an option, we can do the following:
>       - Remove the constraint in the block layer and add ZoneFS support
>         in a first patch.
> 
>       - Add btrfs support in a later patch
> 
>       - Make changes to tools once merged
> 
> Hope I have collected all points of view in such a short format.

+ Suggestion to enable all users in the kernel to limit fragmentation and maintainer burden.
+ Possibly not a big issue, as users have already added the necessary support and must already manage offline zones and avoid writing across zones.
+ Re: Bo's email, it sounds like this only affects a single vendor, which knowingly made the decision to use NPO2 zone sizes. From Bo: "(What we discussed here has a precondition that is, we cannot determine if the SSD provider could change the FW to make it PO2 or not)".
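
To make the "change the calculations" and "holes" points in the summary concrete, here is a rough sketch in C. It is only an illustration under stated assumptions: the helper names (is_po2, zone_no_po2, zone_no_any, in_unmapped_hole, and so on) are hypothetical and are not kernel or library APIs, and all sizes are in abstract sectors. With a PO2 zone size the zone index and the in-zone offset reduce to a shift and a mask; with an arbitrary zone size they need a division and a modulo. The last helper shows the hole that appears when the zone capacity is smaller than a PO2 zone size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not the kernel's API. */

/* True if n is a non-zero power of two. */
static bool is_po2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Floor of log2(n) for n > 0, computed with a portable loop. */
static unsigned int ilog2_u64(uint64_t n)
{
	unsigned int shift = 0;

	while (n >>= 1)
		shift++;
	return shift;
}

/* Zone index when zone_size is a power of two: a single shift. */
static uint64_t zone_no_po2(uint64_t sector, uint64_t zone_size)
{
	assert(is_po2(zone_size));
	return sector >> ilog2_u64(zone_size);
}

/* Offset inside the zone when zone_size is a power of two: a mask. */
static uint64_t zone_off_po2(uint64_t sector, uint64_t zone_size)
{
	assert(is_po2(zone_size));
	return sector & (zone_size - 1);
}

/* Zone index for any non-zero zone size: needs a 64-bit division. */
static uint64_t zone_no_any(uint64_t sector, uint64_t zone_size)
{
	assert(zone_size != 0);
	return sector / zone_size;
}

/* Offset inside the zone for any non-zero zone size: needs a modulo. */
static uint64_t zone_off_any(uint64_t sector, uint64_t zone_size)
{
	assert(zone_size != 0);
	return sector % zone_size;
}

/*
 * True if the sector lies in the unmapped range between the zone
 * capacity and the zone size (the "hole" an application must skip).
 */
static bool in_unmapped_hole(uint64_t sector, uint64_t zone_size,
			     uint64_t zone_cap)
{
	assert(zone_cap <= zone_size);
	return zone_off_any(sector, zone_size) >= zone_cap;
}
```

For example, with a zone size of 1024 sectors and a zone capacity of 1000 sectors, sector 1010 falls in the hole at the end of zone 0, while sector 1030 is a valid offset in zone 1. With a zone size equal to the capacity (1000 sectors, NPO2) there is no hole, but the shift and mask variants no longer apply.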



