On Fri, Sep 02 2022 at 4:55P -0400,
Mike Snitzer <snitzer@xxxxxxxxxx> wrote:

> On Tue, Aug 23 2022 at 8:18P -0400,
> Pankaj Raghav <p.raghav@xxxxxxxxxxx> wrote:
>
> > Only zoned devices with power-of-2(po2) number of sectors per zone(zone
> > size) were supported in linux but now non power-of-2(npo2) zone sizes
> > support has been added to the block layer.
> >
> > Filesystems such as F2FS and btrfs have support for zoned devices with
> > po2 zone size assumption. Before adding native support for npo2 zone
> > sizes, it was suggested to create a dm target for npo2 zone size device to
> > appear as a po2 zone size target so that file systems can initially
> > work without any explicit changes by using this target.
> >
> > The design of this target is very simple: remap the device zone size to
> > the zone capacity and change the zone size to be the nearest power of 2
> > value.
> >
> > For e.g., a device with a zone size/capacity of 3M will have an equivalent
> > target layout as follows:
> >
> > Device layout :-
> > zone capacity = 3M
> > zone size = 3M
> >
> > |--------------|-------------|
> > 0              3M            6M
> >
> > Target layout :-
> > zone capacity=3M
> > zone size = 4M
> >
> > |--------------|---|--------------|---|
> > 0              3M  4M             7M  8M
> >
> > The area between target's zone capacity and zone size will be emulated
> > in the target.
> > The read IOs that fall in the emulated gap area will return 0 filled
> > bio and all the other IOs in that area will result in an error.
> > If a read IO span across the emulated area boundary, then the IOs are
> > split across them. All other IO operations that span across the emulated
> > area boundary will result in an error.
> >
> > The target can be easily created as follows:
> > dmsetup create <label> --table '0 <size_sects> po2zone /dev/nvme<id>'
> >
> > Note that the target does not support partial mapping of the underlying
> > device.
> >
> > Signed-off-by: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
> > Suggested-by: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
> > Suggested-by: Damien Le Moal <damien.lemoal@xxxxxxx>
> > Suggested-by: Hannes Reinecke <hare@xxxxxxx>
> >
> This target needs more review from those who Suggested-by it.
>
> And the header and docs needs to address:
>
> 1) why is a partial mapping of the underlying device disallowed?
> 2) why is it assumed all IO is read-only? (talk to me and others like
>    we don't know the inherent limitations of this class of zoned hw)
>
> On a code level:
> 1) are you certain you're properly failing all writes?
>    - are writes allowed to the "zone capacity area" but _not_
>      allowed to the "emulated zone area"? (if yes, _please document_).
> 2) yes, you absolutely need to implement the .status target_type hook
>    (for both STATUS and TABLE).
> 3) really not loving the nested return (of DM_MAPIO_SUBMITTED or
>    DM_MAPIO_REMAPPED) from methods called from dm_po2z_map(). Would
>    prefer to not have to do a depth-first search to see where and when
>    dm_po2z_map() returns a DM_MAPIO_XXX unless there is a solid
>    justification for it. To me it just obfuscates the DM interface a
>    bit too much.
>
> Otherwise, pretty clean code and nothing weird going on.
>
> I look forward to seeing your next (final?) revision of this patchset.

Thinking further...

I'm left confused about just what the heck this target is assuming.

E.g.: exposing a read-only end of each zone feels very bi-polar, yet
there is no hint to the upper layer that it shouldn't write to that
read-only end (the "emulated zone"). But there has to be some zoned
magic assumed? And I'm just naive?

Mike
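
For anyone following the layout in the quoted patch description, the sector
arithmetic it implies can be sketched roughly as below. This is not the
patch's code, only an illustration: the function names (po2_zone_size,
target_to_device, split_read) are hypothetical, all sizes are in 512-byte
sectors, and "nearest power of 2" is assumed to mean the smallest power of
two at or above the device zone size (3M -> 4M in the example).

```python
def po2_zone_size(dev_zone_sects: int) -> int:
    """Smallest power of two >= dev_zone_sects (assumed reading of 'nearest')."""
    if dev_zone_sects <= 0:
        raise ValueError("zone size must be positive")
    return 1 << (dev_zone_sects - 1).bit_length()


def target_to_device(sector: int, dev_zone_sects: int):
    """Map a target sector to a device sector.

    Returns None when the sector lies in the emulated gap between the
    zone capacity and the po2 zone size, where no device sector exists.
    """
    tgt_zone_sects = po2_zone_size(dev_zone_sects)
    zone_no, offset = divmod(sector, tgt_zone_sects)
    if offset >= dev_zone_sects:
        return None  # emulated area
    return zone_no * dev_zone_sects + offset


def split_read(start: int, length: int, dev_zone_sects: int):
    """Split a read at emulated-area boundaries.

    Yields (kind, start, length) tuples. For kind == "device" the start is
    a device sector; for kind == "zero" it is the target sector of a piece
    that would be answered with zeroes.
    """
    tgt_zone_sects = po2_zone_size(dev_zone_sects)
    pieces = []
    while length > 0:
        zone_no, offset = divmod(start, tgt_zone_sects)
        if offset < dev_zone_sects:
            n = min(length, dev_zone_sects - offset)
            pieces.append(("device", zone_no * dev_zone_sects + offset, n))
        else:
            n = min(length, tgt_zone_sects - offset)
            pieces.append(("zero", start, n))
        start += n
        length -= n
    return pieces
```

With a 3M device zone (6144 sectors), the target zone is 8192 sectors, so
target sector 8192 (start of the second zone) maps to device sector 6144,
and any sector with an in-zone offset of 6144..8191 has no device mapping.
Per the quoted description, only reads are split this way; other operations
that touch or cross the emulated area are failed rather than split.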