On 1/16/24 02:35, Kent Overstreet wrote:
> On Mon, Jan 15, 2024 at 11:22:36AM +0300, Viacheslav Dubeyko wrote:
>> Hello,
>>
>> I would like to suggest the discussion related to current
>> status of ZNS SSD support in file systems. There is ongoing
>> process of ZNS SSD support in bcachefs, btrfs, ssdfs.
>> The primary intention is to have a meeting place among
>> file system developers and ZNS SSD manufacturers for sharing
>> and discussing the status of ZNS SSD support, existing issues,
>> and potential new features.
>>
>> The goals of the discussion are:
>> (1) share the current status of ZNS SSD support,
>> (2) discuss any potential issues of ZNS SSD support in file systems,
>> (3) discuss file system's techniques required for ZNS SSD support,
>> (4) discuss potential re-using/sharing of implemented logic/primitives,
>> (5) share the preliminary estimation of having stable ZNS SSD support,
>> (6) performance, reliability estimation comparing ZNS and conventional SSDs.
>>
>> Also, it will be great to hear any news from ZNS SSD vendors
>> related to new features of ZNS SSDs (zone size, open/active zone
>> limitation, and so on). Do we have any progress with increasing
>> number of open/active zones? Any hope to have various zone sizes, etc?
>>
>> POTENTIAL ATTENDEES:
>> bcachefs - Kent Overstreet
>> btrfs - Naohiro Aota
>> ssdfs - Viacheslav Dubeyko
>> WDC - Matias Bjørling
>> Samsung - Javier González
>>
>> Anybody else would like to join the discussion?
>>
>> Thanks,
>> Slava
>
> There's also SMR hard drives to consider. For SMR, the much bigger zones
> means that we don't want to burn entire zones on the superblock (plural;
> we need two so that one will be alive while the other is being erased).

Hmmm... The zone size of SMR drives is actually much smaller than that of ZNS
drives: 256 MB versus over 1 GB for ZNS. All host-managed SMR drives that I
know of use a 256 MB zone size. One exception is the 128 MB zone size that
some users prefer over the regular 256 MB.
Depending on the drive, this can be changed with the FORMAT WITH PRESET
command, if the drive supports that command of course.

The btrfs superblock (and its copies) is handled as you describe: 2 zones per
copy, used as a circular write ring. The write pointer location of the zones
indicates where the latest superblock is. Sure, that wastes a little space, but
not much considering the total number of zones of a drive. The latest 28 TB SMR
drives have over 100,000 zones.

> We've got provisions for variable sized zones, are SMR hard drives doing
> anything with this? Or perhaps for a normal, random-overwritable zone at
> the start?

No, variable zone size is not a thing with SMR. bcachefs may support it, but in
general, that makes zone management much harder and the kernel does not allow
this (blk_revalidate_disk_zones() will return an error if it sees such a drive).

Host-managed SMR drives generally have a small number of conventional zones
(randomly writable) at LBA 0: generally about 1% of the total capacity/number
of zones, so about 1000 conventional zones. This is optional, but most drives
I know of have that, except the special 128 MB zone size one mentioned above,
which is all SMR zones.

-- 
Damien Le Moal
Western Digital Research
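
For readers who want to check the numbers and the ring idea in the reply above,
here is a minimal sketch. It is a toy model, not the btrfs code: the names
zone_count(), Zone and latest_superblock() are hypothetical, capacity is assumed
to be decimal TB and zone size binary MiB, and the ring policy (fill one zone,
write the first entry of the other zone, then reset the filled one) is an
assumption chosen so that one zone always holds a valid superblock.

```python
from dataclasses import dataclass


def zone_count(capacity_bytes: int, zone_bytes: int) -> int:
    """Number of whole zones that fit in the given capacity."""
    return capacity_bytes // zone_bytes


@dataclass
class Zone:
    start: int   # first block of the zone
    blocks: int  # zone size, in superblock-sized blocks
    wp: int      # write pointer: next block to be written

    def is_empty(self) -> bool:
        return self.wp == self.start

    def is_full(self) -> bool:
        return self.wp == self.start + self.blocks


def latest_superblock(z0: Zone, z1: Zone):
    """Block holding the newest superblock in a two-zone ring, or None.

    Hypothetical policy: only one zone is ever partially written, and a
    full zone is reset right after the first write to the other zone.
    """
    zones = (z0, z1)
    partial = [z for z in zones if not z.is_empty() and not z.is_full()]
    if len(partial) == 2:
        raise ValueError("both zones partially written: invalid in this model")
    if len(partial) == 1:
        # The zone being filled always holds the newest entry.
        return partial[0].wp - 1
    full = [z for z in zones if z.is_full()]
    if len(full) == 2:
        # Write pointers alone cannot decide; the contents would have to
        # be compared. This toy model never reaches that state.
        raise ValueError("both zones full: cannot decide from write pointers")
    if len(full) == 1:
        # One full zone and one empty zone: newest entry is the last block.
        return full[0].wp - 1
    return None  # both zones empty: no superblock written yet


if __name__ == "__main__":
    total = zone_count(28 * 10**12, 256 * 2**20)
    print("zones on a 28 TB drive:", total)
    print("about 1% conventional zones:", total // 100)
    # Zone 0 covers blocks 0..3 and is full, zone 1 has one entry written.
    print("latest superblock block:", latest_superblock(Zone(0, 4, 4), Zone(4, 4, 5)))
```

With these assumptions the zone count comes out a little above 100,000, so a
couple of zones per superblock copy is a very small fraction of the drive.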