On 3/2/23 08:34, Khazhy Kumykov wrote:
> HSMR HDDs are a type of SMR HDD that allow for a dynamic mixture of
> CMR and SMR zones, allowing users to convert regions of the disk
> between the two. The way this is implemented as specified by the SCSI
> ZAC-2 specification is there’s a set of “CMR” regions and “SMR”
> regions. These may be grouped into “realms” that may, as a group, be
> online or offline. Zone management can bring online a domain/zone and
> offline any corresponding domains/zones.
>
> I’d like to discuss what path makes sense for supporting these
> devices, and also how to avoid potential issues specific to the “mixed
> CMR & SMR IO traffic” use case - particularly around latency due to
> potentially unneeded (from the perspective of an application) zone
> management commands.

Hard no on supporting these. See below.

> Points of Discussion
> ====
>
> - There’s already support in the kernel for marking zones
> online/offline and cmr/smr, but this is fixed, not dynamic. Would
> there be hiccups with allowing zones to come online/offline while
> running?

No, there is no support for "marking" zones offline (or read-only):
transitions into these states are not explicit due to any command
execution, but determined by the drive, and asynchronous as far as the
host is concerned. There is support for *detecting* offline zones
though, so that FSes do not attempt to use these dead zones. But that
is more a part of error processing than of the regular IO path, because
seeing offline zones is not expected; it is the result of a drive going
bad.

HSMR would essentially allow users to explicitly offline zones,
wrecking the IO path and potentially generating lots of IO errors. So
HSMR support should only be allowed (if it ever is) to be controlled by
a file system, not by the user. And if the user wants to do raw block
device IOs, then it can use passthrough commands to control the
activation state of zones.

> - There may be multiple CMR “zones” that are contiguous in LBA space.
> A benefit of HSMR disks is, to a certain extent, software which is
> designed for all-CMR disks can work similarly on a contiguous CMR area
> of the HSMR disk (modulo handling “resizes”). This may result in IO
> that can straddle two CMR “zones”. It’s not a problem for writes to
> span CMR zones, but it is for SMR zones, so this distinction is useful
> to have in the block layer.

Writes to CMR zones on regular host-managed SMR can straddle CMR zone
boundaries too (but not a CMR-to-SMR boundary). We do not allow it
because micro-optimizing for this case is not worth the overhead it
introduces. So hard no on this.

> - What makes sense as an interface for managing these types of
> not-quite CMR and not quite SMR disks? Some of the featureset overlaps
> with existing SMR support in blkdev_zone_mgmt_ioctl, so perhaps the
> additional conversion commands could be added there?

Passthrough commands. There are no kernel internal users of this, so I
do not see any need to add an interface for activate/deactivate zones.
libzbc v6 is coming soon with an API for zone domains/zone realms
commands (already available with the zone-domains branch of the source
code).

> - mitigating & limiting tail latency effects due to report zones
> commands / limiting “unnecessary” zone management calls.

There are no implicit zone management commands issued by the kernel,
except the one report zones done on disk scan/revalidate. Any zone
management command is explicit, asked for by the user or FS using the
drive. So it is up to the user to limit these to control the overhead.
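To illustrate how explicit and user-driven this already is, below is a
rough, untested sketch of an application enumerating zones with the
existing BLKREPORTZONE ioctl and counting the ones the kernel reports
as offline or read-only. The device path and the number of zones
requested per call are placeholders, and error handling is minimal.

/*
 * Sketch: walk all zones of a zoned block device with BLKREPORTZONE
 * and count offline / read-only zones. Placeholder device path.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/blkzoned.h>

#define ZONES_PER_CALL	128

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "/dev/sdX";
	unsigned long long capacity, sector = 0;
	unsigned int i, offline = 0, readonly = 0;
	struct blk_zone_report *rep;
	int fd;

	fd = open(dev, O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Device capacity in 512B sectors, to know when to stop. */
	if (ioctl(fd, BLKGETSIZE64, &capacity) < 0) {
		perror("BLKGETSIZE64");
		return 1;
	}
	capacity >>= 9;

	rep = calloc(1, sizeof(*rep) +
		     ZONES_PER_CALL * sizeof(struct blk_zone));
	if (!rep)
		return 1;

	while (sector < capacity) {
		rep->sector = sector;
		rep->nr_zones = ZONES_PER_CALL;

		if (ioctl(fd, BLKREPORTZONE, rep) < 0) {
			perror("BLKREPORTZONE");
			return 1;
		}
		if (!rep->nr_zones)
			break;

		for (i = 0; i < rep->nr_zones; i++) {
			struct blk_zone *z = &rep->zones[i];

			if (z->cond == BLK_ZONE_COND_OFFLINE)
				offline++;
			else if (z->cond == BLK_ZONE_COND_READONLY)
				readonly++;
			sector = z->start + z->len;
		}
	}

	printf("%u offline zones, %u read-only zones\n",
	       offline, readonly);

	free(rep);
	close(fd);
	return 0;
}

The other zone management ioctls (BLKRESETZONE, BLKOPENZONE,
BLKCLOSEZONE, BLKFINISHZONE) are equally explicit, so the application
fully controls how often such commands hit the drive.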
In general, support for hybrid SMR (zone domains / zone realms) is a
hard no from me. This feature set is a total nightmare to deal with in
the kernel. It opens a ton of corner cases that will require lots of
checks in the hot path. We definitely do not want that.

-- 
Damien Le Moal
Western Digital Research