On 2/2/24 16:30, Damien Le Moal wrote:
> The patch series introduces zone write plugging (ZWP) as the new
> mechanism to control the ordering of writes to zoned block devices.
> ZWP replaces zone write locking (ZWL) which is implemented only by
> mq-deadline today. ZWP also allows emulating zone append operations
> using regular writes for zoned devices that do not natively support this
> operation (e.g. SMR HDDs). This patch series removes the scsi disk
> driver and device mapper zone append emulation to use ZWP emulation.
>
> Unlike ZWL which operates on requests, ZWP operates on BIOs. A zone
> write plug is simply a BIO list that is atomically manipulated using a
> spinlock and a kblockd submission work. A write BIO to a zone is
> "plugged" to delay its execution if a write BIO for the same zone was
> already issued, that is, if a write request for the same zone is being
> executed. The next plugged BIO is unplugged and issued once the write
> request completes.
>
> This mechanism allows to:
> - Untangle zone write ordering from the block IO schedulers. This
>   allows removing the restriction on using only mq-deadline for zoned
>   block devices. Any block IO scheduler, including "none" can be used.
> - Zone write plugging operates on BIOs instead of requests. Plugged
>   BIOs waiting for execution thus do not hold scheduling tags and thus
>   do not prevent other BIOs from being submitted to the device (reads
>   or writes to other zones). Depending on the workload, this can
>   significantly improve the device use and the performance.
> - Both blk-mq (request) based zoned devices and BIO-based devices (e.g.
>   device mapper) can use ZWP. It is mandatory for the
>   former but optional for the latter: BIO-based driver can use zone
>   write plugging to implement write ordering guarantees, or the drivers
>   can implement their own if needed.
> - The code is less invasive in the block layer and in device drivers.
>   ZWP implementation is mostly limited to blk-zoned.c, with some small
>   changes in blk-mq.c, blk-merge.c and bio.c.
>
> Performance evaluation results are shown below.
>
> The series is organized as follows:

I forgot to mention that the patches are against Jens' block/for-next
branch with the addition of Christoph's "clean up blk_mq_submit_bio"
patches [1] and my patch "null_blk: Always split BIOs to respect queue
limits" [2].

[1] https://lore.kernel.org/linux-block/20240124092658.2258309-1-hch@xxxxxx/
[2] https://lore.kernel.org/linux-block/20240126005032.1985245-1-dlemoal@xxxxxxxxxx/

--
Damien Le Moal
Western Digital Research
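
For readers who want a concrete picture of the plugging rule described in
the quoted cover letter, here is a rough userspace toy model in Python. It
is not kernel code and is not taken from the series: the names
ZoneWritePlug, ToyZonedDevice, submit_write and complete_write are made up
for this sketch, and the spinlock and the kblockd submission work are left
out entirely. It only models the ordering rule: at most one write per zone
is in flight, later writes to that zone wait in a per-zone list, and the
next one is issued when the in-flight write completes.

```python
from collections import deque


class ZoneWritePlug:
    """Hypothetical per-zone plug: a FIFO of writes waiting their turn."""

    def __init__(self):
        self.plugged = False     # True while a write to this zone is in flight
        self.bio_list = deque()  # writes delayed behind the in-flight one


class ToyZonedDevice:
    """Toy device that records the order in which writes are issued."""

    def __init__(self, nr_zones):
        self.plugs = [ZoneWritePlug() for _ in range(nr_zones)]
        self.issued = []         # (zone, bio) pairs in issue order

    def submit_write(self, zone, bio):
        """Issue the write now, or plug it if the zone already has one in flight."""
        plug = self.plugs[zone]
        if plug.plugged:
            plug.bio_list.append(bio)
            return False         # delayed
        plug.plugged = True
        self.issued.append((zone, bio))
        return True              # issued immediately

    def complete_write(self, zone):
        """On completion, issue the next plugged write or unplug the zone."""
        plug = self.plugs[zone]
        if plug.bio_list:
            self.issued.append((zone, plug.bio_list.popleft()))
        else:
            plug.plugged = False


dev = ToyZonedDevice(nr_zones=2)
dev.submit_write(0, "w0")  # zone 0 idle: issued
dev.submit_write(0, "w1")  # zone 0 busy: plugged
dev.submit_write(1, "x0")  # zone 1 idle: issued, not blocked by zone 0
dev.submit_write(0, "w2")  # zone 0 busy: plugged behind w1
dev.complete_write(0)      # w0 done: w1 is issued
dev.complete_write(0)      # w1 done: w2 is issued
dev.complete_write(0)      # w2 done: nothing pending, zone 0 is unplugged
print(dev.issued)
```

In this model the writes to zone 0 reach the device in submission order,
while the write to zone 1 is issued without waiting for zone 0.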