I agree with Damien, but I'd also add that in the future there may
very well be some new zone types added to the ZBC model, so we
shouldn't assume that the ZBC model is a fixed one. And who knows?
Perhaps the T10 standards body will come up with a simpler model for
interfacing with SCSI/SATA-attached SSDs that might leverage the ZBC
model --- or not.

Either way, that's not really relevant as far as the Linux block layer
is concerned, since the Linux block layer is designed to be an
abstraction on top of hardware --- and in some cases we can use a
common abstraction on top of eMMC's, SCSI's, and SATA's definitions of
TRIM/DISCARD/WRITE SAME/SECURE TRIM/QUEUED TRIM, even though they
differ in some subtle ways, and may have different performance
characteristics and semantics.

The trick is to expose similarities where the differences won't matter
to the upper layers, but also to expose the fine distinctions and
allow the file system and/or user space to use the protocol-specific
differences when it matters to them. Designing that is going to be
important, and I can guarantee we won't get it right at first. Which
is why it's a good thing that internal kernel interfaces aren't cast
in concrete, and can change as new revisions to ZBC, or new interfaces
(like perhaps OCSSDs), get promulgated by various standards bodies or
by various vendors.

> > Another point that QLC device could have more tricky features of
> > erase blocks management. Also we should apply erase operation on NAND
> > flash erase block but it is not mandatory for the case of SMR zone.
>
> Incorrect: host-managed devices require a zone "reset" (equivalent to
> discard/trim) to be reused after being written once. So again, the
> "tricky features" you mention will depend on the device "model",
> whatever this ends up to be for an open channel SSD.

...
and this is exposed by having different zone types (sequential write
required vs sequential write preferred vs conventional). And if
OCSSD "zones" don't fit into the current ZBC zone types, we can
easily add new ones. I would suggest, however, that we explicitly
disclaim that the block device layer's code points for zone types are
an exact match with the ZBC zone type numbering, precisely so we can
add new zone types that correspond to abstractions from different
hardware types, such as OCSSD.

> Not necessarily. Again think in terms of device "model" and associated
> feature set. An FS implementation may decide to support all possible
> models, with likely a resulting incredible complexity. More likely,
> similarly with what is happening with SMR, only models that make sense
> will be supported by FS implementation that can be easily modified.
> Example again here of f2fs: changes to support SMR were rather simple,
> whereas the initial effort to support SMR with ext4 was pretty much
> abandoned as it was too complex to integrate in the existing code while
> keeping the existing on-disk format.

I'll note that Abutalib Aghayev and I will be presenting a paper at
the 2017 FAST conference detailing a way to optimize ext4 for
Host-Aware SMR drives by making a surprisingly small set of changes to
ext4's journalling layer. The results are very promising for certain
workloads: we tested on both Seagate and WD HA drives and achieved 2x
performance improvements. Patches are on the unstable portion of the
ext4 patch queue, and I hope to get them into an upstream-acceptable
shape (as opposed to "good enough for a research paper") in the next
few months.
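
Going back to the zone type numbering for a moment, here is a rough
sketch of what decoupling the block layer's code points from the ZBC
numbering might look like. Every name below (blkdev_zone_type,
zbc_to_blkdev_zone_type, blkdev_zone_needs_reset, and especially the
erase-block zone type) is made up for illustration and is not an
existing kernel interface; only the three ZBC values are taken from
the standard. Treat it as a way to think about the idea, not as a
proposed patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Zone type codes as a ZBC device reports them in REPORT ZONES. */
#define ZBC_ZONE_TYPE_CONV     0x1  /* conventional */
#define ZBC_ZONE_TYPE_SEQ_REQ  0x2  /* sequential write required */
#define ZBC_ZONE_TYPE_SEQ_PREF 0x3  /* sequential write preferred */

/*
 * Hypothetical block-layer zone types.  The numbering is deliberately
 * NOT the ZBC numbering, so new types can be added for other hardware
 * without worrying about colliding with future ZBC revisions.
 */
enum blkdev_zone_type {
	BLKDEV_ZONE_UNKNOWN = 0,
	BLKDEV_ZONE_CONVENTIONAL,
	BLKDEV_ZONE_SEQ_WRITE_PREFERRED,
	BLKDEV_ZONE_SEQ_WRITE_REQUIRED,
	BLKDEV_ZONE_ERASE_BLOCK,	/* assumption: an OCSSD-style zone */
};

/* Translate a ZBC zone type into the block layer's own code point. */
static enum blkdev_zone_type zbc_to_blkdev_zone_type(uint8_t zbc_type)
{
	switch (zbc_type) {
	case ZBC_ZONE_TYPE_CONV:
		return BLKDEV_ZONE_CONVENTIONAL;
	case ZBC_ZONE_TYPE_SEQ_REQ:
		return BLKDEV_ZONE_SEQ_WRITE_REQUIRED;
	case ZBC_ZONE_TYPE_SEQ_PREF:
		return BLKDEV_ZONE_SEQ_WRITE_PREFERRED;
	default:
		return BLKDEV_ZONE_UNKNOWN;
	}
}

/*
 * Upper layers ask about behaviour rather than about the protocol:
 * must this zone be reset (or erased) before it can be rewritten?
 */
static bool blkdev_zone_needs_reset(enum blkdev_zone_type type)
{
	switch (type) {
	case BLKDEV_ZONE_SEQ_WRITE_REQUIRED:
	case BLKDEV_ZONE_ERASE_BLOCK:
		return true;
	default:
		return false;
	}
}
```

With something along these lines, a file system would only ever see
the block layer's types and the behavioural helpers, and each
low-level driver would own the translation from its protocol's
numbering.
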

So it may very well be that small changes can be made to file systems
to support exotic devices, if we can expose the right information
about the underlying storage devices and offer the right abstractions
to enable the right kind of minimal I/O tagging, or hints, or commands
as necessary. The goal is that the changes we do need to make to the
file system can be kept small, and kept easily testable even if
hardware is not available: for example, by creating device mapper
emulators of the feature sets of these advanced storage interfaces
that are exposed via the block layer abstractions, whether it be for
ZBC zones, or hardware encryption acceleration, etc.

Cheers,

					- Ted