On 2020/06/26 15:09, Javier González wrote: > On 26.06.2020 01:34, Damien Le Moal wrote: >> On 2020/06/25 21:22, Javier González wrote: >>> From: Javier González <javier.gonz@xxxxxxxxxxx> >>> >>> Add support for offline transition on the zoned block device using the >>> new zone management IOCTL >>> >>> Signed-off-by: Javier González <javier.gonz@xxxxxxxxxxx> >>> Signed-off-by: SelvaKumar S <selvakuma.s1@xxxxxxxxxxx> >>> Signed-off-by: Kanchan Joshi <joshi.k@xxxxxxxxxxx> >>> Signed-off-by: Nitesh Shetty <nj.shetty@xxxxxxxxxxx> >>> --- >>> block/blk-core.c | 2 ++ >>> block/blk-zoned.c | 3 +++ >>> drivers/nvme/host/core.c | 3 +++ >>> include/linux/blk_types.h | 3 +++ >>> include/linux/blkdev.h | 1 - >>> include/uapi/linux/blkzoned.h | 1 + >>> 6 files changed, 12 insertions(+), 1 deletion(-) >>> >>> diff --git a/block/blk-core.c b/block/blk-core.c >>> index 03252af8c82c..589cbdacc5ec 100644 >>> --- a/block/blk-core.c >>> +++ b/block/blk-core.c >>> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = { >>> REQ_OP_NAME(ZONE_CLOSE), >>> REQ_OP_NAME(ZONE_FINISH), >>> REQ_OP_NAME(ZONE_APPEND), >>> + REQ_OP_NAME(ZONE_OFFLINE), >>> REQ_OP_NAME(WRITE_SAME), >>> REQ_OP_NAME(WRITE_ZEROES), >>> REQ_OP_NAME(SCSI_IN), >>> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio) >>> case REQ_OP_ZONE_OPEN: >>> case REQ_OP_ZONE_CLOSE: >>> case REQ_OP_ZONE_FINISH: >>> + case REQ_OP_ZONE_OFFLINE: >>> if (!blk_queue_is_zoned(q)) >>> goto not_supported; >>> break; >>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c >>> index 29194388a1bb..704fc15813d1 100644 >>> --- a/block/blk-zoned.c >>> +++ b/block/blk-zoned.c >>> @@ -416,6 +416,9 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode, >>> case BLK_ZONE_MGMT_RESET: >>> op = REQ_OP_ZONE_RESET; >>> break; >>> + case BLK_ZONE_MGMT_OFFLINE: >>> + op = REQ_OP_ZONE_OFFLINE; >>> + break; >>> default: >>> return -ENOTTY; >>> } >>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c >>> index f1215523792b..5b95c81d2a2d 100644 >>> --- a/drivers/nvme/host/core.c >>> +++ b/drivers/nvme/host/core.c >>> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req, >>> case REQ_OP_ZONE_FINISH: >>> ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH); >>> break; >>> + case REQ_OP_ZONE_OFFLINE: >>> + ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE); >>> + break; >>> case REQ_OP_WRITE_ZEROES: >>> ret = nvme_setup_write_zeroes(ns, req, cmd); >>> break; >>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h >>> index 16b57fb2b99c..b3921263c3dd 100644 >>> --- a/include/linux/blk_types.h >>> +++ b/include/linux/blk_types.h >>> @@ -316,6 +316,8 @@ enum req_opf { >>> REQ_OP_ZONE_FINISH = 12, >>> /* write data at the current zone write pointer */ >>> REQ_OP_ZONE_APPEND = 13, >>> + /* Transition a zone to offline */ >>> + REQ_OP_ZONE_OFFLINE = 14, >>> >>> /* SCSI passthrough using struct scsi_request */ >>> REQ_OP_SCSI_IN = 32, >>> @@ -456,6 +458,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op) >>> case REQ_OP_ZONE_OPEN: >>> case REQ_OP_ZONE_CLOSE: >>> case REQ_OP_ZONE_FINISH: >>> + case REQ_OP_ZONE_OFFLINE: >>> return true; >>> default: >>> return false; >>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h >>> index bd8521f94dc4..8308d8a3720b 100644 >>> --- a/include/linux/blkdev.h >>> +++ b/include/linux/blkdev.h >>> @@ -372,7 +372,6 @@ extern int blkdev_zone_ops_ioctl(struct block_device *bdev, fmode_t mode, >>> unsigned int cmd, unsigned long arg); >>> extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode, >>> unsigned int cmd, unsigned long arg); >>> - >>> #else /* CONFIG_BLK_DEV_ZONED */ >>> >>> static inline unsigned int blkdev_nr_zones(struct gendisk *disk) >>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h >>> index a8c89fe58f97..d0978ee10fc7 100644 >>> --- a/include/uapi/linux/blkzoned.h >>> +++ b/include/uapi/linux/blkzoned.h >>> @@ -155,6 +155,7 @@ enum blk_zone_action { >>> BLK_ZONE_MGMT_FINISH = 0x2, >>> BLK_ZONE_MGMT_OPEN = 0x3, >>> BLK_ZONE_MGMT_RESET = 0x4, >>> + BLK_ZONE_MGMT_OFFLINE = 0x5, >>> }; >>> >>> /** >>> >> >> As mentioned in previous email, the usefulness of this is dubious. Please >> elaborate in the commit message. Sure NVMe ZNS defines this and we can support >> it. But without a good use case, what is the point ? > > Use case is to transition zones in read-only state to offline when we > are done moving valid data. It is easier to explicitly managing zones > that are not usable by having all under the offline state. Then adding a simple BLKZONEOFFLINE ioctl, similar to open, close, finish and reset, would be enough. No need for all the new zone management ioctl with flags plumbing. > >> >> scsi SD driver will return BLK_STS_NOTSUPP if this offlining is sent to a >> ZBC/ZAC drive. Not nice. Having a sysfs attribute "max_offline_zone_sectors" or >> the like to indicate support by the device or not would be nicer. > > We can do that. > >> >> Does offling ALL zones make any sense ? Because this patch does not prevent the >> use of the REQ_ZONE_ALL flags introduced in patch 2. Probably not a good idea to >> allow offlining all zones, no ? > > AFAIK the transition to offline is only valid when coming from a > read-only state. I did think of adding a check, but I can see that other > transitions go directly to the driver and then the device, so I decided > to follow the same model. If you think it is better, we can add the > check. My point was that the REQ_ZONE_ALL flag would make no sense for offlining zones but this patch does not have anything checking that. There is no point in sending a command that is known to be incorrect to the drive... > > Javier > -- Damien Le Moal Western Digital Research