> On Jan 2, 2024, at 7:05 PM, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>
> On Tue, Jan 02, 2024 at 11:02:59AM +0300, Viacheslav Dubeyko wrote:
>>
>>
>>> On Jan 2, 2024, at 1:56 AM, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>>>
>>> LSF topic: bcachefs status & roadmap
>>>
>>
>> <skipped>
>>
>>>
>>> A delayed allocation for btree nodes mode is coming, which is the main
>>> piece needed for ZNS support
>>>
>>
>> I could miss some emails. But have you shared the vision of ZNS support
>> architecture for the case of bcachefs already? It will be interesting to hear
>> the high-level concept.
>
> There's not a whole lot to it. bcache/bcachefs allocation is already
> bucket based, where the model is that we allocate a bucket, then write
> to it sequentially and never overwrite until the whole bucket is reused.
>
> The main exception has been btree nodes, which are log structured and
> typically smaller than a bucket; that doesn't break the "no overwrites"
> property ZNS wants, but it does mean writes within a bucket aren't
> happening sequentially.
>
> So I'm adding a mode where every time we do a btree node write we write
> out the whole node to a new location, instead of appending at an
> existing location. It won't be as efficient for random updates across a
> large working set, but in practice that doesn't happen too much; average
> btree write size has always been quite high on any filesystem I've
> looked at.
>
> Aside from that, it's mostly just plumbing and integration; bcachefs on
> ZNS will work pretty much just the same as bcachefs on regular block devices.

I assume that you are aware of the limited number of open/active zones
on a ZNS device. This means that only N zones can be open for write
operations simultaneously (for example, 14 zones in the case of the WDC
ZNS device). Can bcachefs survive with such a limitation? Can you limit
the number of buckets open for write operations?

Another potential issue could be the zone size.
The WDC ZNS device has a 2GB zone size (with 1GB capacity). Can a bucket
be that large? And can the btree model of operations work with such
huge zones?

Technically speaking, the limitation (14 open/active zones) could be a
factor in performance degradation. Or might such a limitation not affect
bcachefs performance?

Could the ZNS model affect GC operations? Or, conversely, can the ZNS
model help to manage GC operations more efficiently?

Do you need a conventional zone? Could bcachefs work without using the
conventional zone of a ZNS device?

Thanks,
Slava.
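
To make the open-zone question concrete, here is a minimal toy model of the constraint, not bcachefs code. It assumes one bucket maps to one zone, and that a zone written up to its capacity becomes full and no longer counts against the open limit. The class and method names (`ZonedBucketAllocator`, `open_zone`, `append`) are hypothetical; the numbers 14 and 1GB are only the example values mentioned above.

```python
class OpenZoneLimitError(RuntimeError):
    """Raised when opening another zone would exceed the device limit."""


class ZonedBucketAllocator:
    """Toy model (hypothetical, not bcachefs): one bucket == one zone."""

    def __init__(self, nr_zones, zone_capacity, max_open):
        self.zone_capacity = zone_capacity
        self.max_open = max_open
        self.free = list(range(nr_zones))  # zones that are empty
        self.open = {}                     # zone id -> write pointer (bytes)

    def open_zone(self):
        """Open an empty zone for writing, honoring the open-zone limit."""
        if len(self.open) >= self.max_open:
            raise OpenZoneLimitError("open zone limit reached")
        if not self.free:
            raise RuntimeError("no empty zones left")
        zone = self.free.pop(0)
        self.open[zone] = 0
        return zone

    def append(self, zone, nbytes):
        """Write sequentially at the write pointer; return the start offset."""
        wp = self.open[zone]
        if wp + nbytes > self.zone_capacity:
            raise ValueError("write does not fit in the zone")
        self.open[zone] = wp + nbytes
        if self.open[zone] == self.zone_capacity:
            # A full zone stops counting against the open limit.
            del self.open[zone]
        return wp


GIB = 1 << 30
alloc = ZonedBucketAllocator(nr_zones=32, zone_capacity=GIB, max_open=14)
zones = [alloc.open_zone() for _ in range(14)]
```

In this model a 15th `open_zone()` call fails until one of the 14 open zones has been filled, so the number of concurrent write streams is bounded by the device limit rather than by the filesystem.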