From: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>

Introduce a new target: lzbd - LightNVM Zoned Block Device

The new target makes it possible to expose an Open-Channel 2.0 SSD
as one or more zoned block devices exposing
BLK_ZONE_TYPE_SEQWRITE_REQ zones.

I've been playing around with this for the last couple of months and
now I'd love to get some feedback.

It's been very useful to look at null_blk's zone support when doing
the plumbing work, and Simon and Klaus have also been very helpful
when figuring out the design. Thanks guys!

Naming is sometimes the hardest thing. I named this thing lzbd, as I
found that to be the most descriptive acronym.

NOTE: This is an early prototype and is lacking some vital features at
the moment. It is worth looking at and playing around with for those
interested, but beware of dragons :)

See the lzbd documentation (Documentation/lightnvm/lzbd.txt) for my
ideas on what a full implementation would look like.

What is supported (for now):

* Reads
* Sequential writes
* Unaligned writes (a per-zone ws_opt alignment buffer is used)
* Zone resets
* Zone reporting
* Wear leveling (sort of, wear indices are not updated on reset yet)

I've mainly tested in QEMU (cunits=0, ws_min=4, ws_opt=8).

The zoned block device tests in blktests (tests/zbd) pass, and I've
done a bunch of general smoke testing (aligned/unaligned writes with
verification using dd and fio, ..), so the general plumbing seems to
hold up, but more testing is needed.

Performance is definitely not what it should be yet. Only one chunk
per zone is being written to at a time, effectively rate-limiting
writes per zone, which is an interesting constraint, but probably not
what we want.

What is not supported (yet):

* Metadata persistence (when the instance is removed, data is lost)
        - Zone to chunks mapping needs to be stored
* Sync handling (flushing alignment buffers)
        - Zone alignment buffer needs to be flushed to disk
* Write error handling
        - Write errors will require zone -> chunk remapping
          of the failing chunk.
* Chunk reset error handling (chunks going offline)
* Updating wear indices on chunk resets
        - This is low-hanging fruit to fix
* Cunits read buffering

Final thoughts, for now:

Since lzbd (and pblk for that matter) are not entirely unlike
file systems, it would be nice to create a mkfs/fsck/dmzadm-like
tool that would:

* Format the drive and persist instance configuration in a
  superblock contained in the instance metadata.
* Repair broken (i.e. power-failed) instances

Per-sector metadata is currently not utilized in lzbd, but would be
helpful in recovery scenarios.

The patch is based on Matias' for-5.2/core branch in the GitHub
openchannel project. It is also available at [1] (branch
for-5.2/lzbd).

Thanks,
Hans

[1] CNEX Labs linux github project: https://github.com/CNEX-Labs/linux

Hans Holmberg (1):
  lightnvm: add lzbd - a zoned block device target

 Documentation/lightnvm/lzbd.txt | 122 +++++++++++
 drivers/lightnvm/Kconfig        |  11 +
 drivers/lightnvm/Makefile       |   3 +
 drivers/lightnvm/lzbd-io.c      | 342 +++++++++++++++++++++++++++++++
 drivers/lightnvm/lzbd-target.c  | 392 +++++++++++++++++++++++++++++++++++
 drivers/lightnvm/lzbd-user.c    | 310 ++++++++++++++++++++++++++++
 drivers/lightnvm/lzbd-zone.c    | 444 ++++++++++++++++++++++++++++++++++++++++
 drivers/lightnvm/lzbd.h         | 139 +++++++++++++
 8 files changed, 1763 insertions(+)
 create mode 100644 Documentation/lightnvm/lzbd.txt
 create mode 100644 drivers/lightnvm/lzbd-io.c
 create mode 100644 drivers/lightnvm/lzbd-target.c
 create mode 100644 drivers/lightnvm/lzbd-user.c
 create mode 100644 drivers/lightnvm/lzbd-zone.c
 create mode 100644 drivers/lightnvm/lzbd.h

-- 
2.7.4
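
For readers who want a concrete picture of the "per-zone ws_opt
alignment buffer" listed under the supported features, here is a
minimal userspace sketch of the idea. It is not the lzbd code: the
names (zone_sketch, zone_write, media_write), the 4 KiB sector size
and the flush counter are all assumptions made for illustration.
WS_OPT is set to 8 only because that is the QEMU test configuration
mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 4096u /* assumed sector size, for illustration */
#define WS_OPT      8u    /* optimal write size in sectors (QEMU test value) */

/* Hypothetical per-zone state; not the layout used by lzbd. */
struct zone_sketch {
    uint64_t wp;       /* host-visible write pointer, in sectors */
    uint32_t buffered; /* sectors held in the alignment buffer */
    uint32_t flushes;  /* ws_opt-sized writes issued so far */
    uint8_t  buf[WS_OPT * SECTOR_SIZE];
};

/* Stand-in for submitting one ws_opt-sized write to the device. */
static void media_write(struct zone_sketch *z, const uint8_t *data)
{
    (void)data;
    z->flushes++;
}

/*
 * Accept a sequential write of nr_sectors starting at 'sector'.
 * Full ws_opt units go straight to the media; any remainder is kept
 * in the alignment buffer until later writes fill it up.
 * Returns 0 on success, -1 if the write does not start at the
 * write pointer.
 */
static int zone_write(struct zone_sketch *z, uint64_t sector,
                      const uint8_t *data, uint32_t nr_sectors)
{
    if (sector != z->wp)
        return -1;

    while (nr_sectors > 0) {
        /* Everything already on the media is ws_opt aligned. */
        assert((z->wp - z->buffered) % WS_OPT == 0);

        if (z->buffered == 0 && nr_sectors >= WS_OPT) {
            /* Aligned, full unit: bypass the buffer. */
            media_write(z, data);
            data += (size_t)WS_OPT * SECTOR_SIZE;
            nr_sectors -= WS_OPT;
            z->wp += WS_OPT;
            continue;
        }

        /* Partial unit: copy as much as fits into the buffer. */
        uint32_t room = WS_OPT - z->buffered;
        uint32_t n = nr_sectors < room ? nr_sectors : room;

        memcpy(z->buf + (size_t)z->buffered * SECTOR_SIZE, data,
               (size_t)n * SECTOR_SIZE);
        z->buffered += n;
        data += (size_t)n * SECTOR_SIZE;
        nr_sectors -= n;
        z->wp += n;

        if (z->buffered == WS_OPT) {
            /* Buffer is full: write it out as one ws_opt unit. */
            media_write(z, z->buf);
            z->buffered = 0;
        }
    }
    return 0;
}
```

Note that in this sketch a partially filled buffer simply stays in
memory, which mirrors the missing sync handling listed above: nothing
forces the tail of a zone out to the media until enough sectors have
been written to complete a ws_opt unit.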