Hi Damien and Christoph, This patch series improves small write IOPS by a factor of four (+300%) for zoned UFS devices on my test setup with an UFSHCI 3.0 controller. Although you are probably busy because the merge window is open, please take a look at this patch series when you have the time. This patch series is organized as follows: - Bug fixes for existing code at the start of the series. - The write pipelining support implementation comes after the bug fixes. Thanks, Bart. Changes compared to v15: - Reworked this patch series on top of the zone write plugging approach. - Moved support for requeuing requests from the SCSI core into the block layer core. - In the UFS driver, instead of disabling write pipelining if auto-hibernation is enabled, rely on the requeuing mechanism to handle reordering caused by resuming from auto-hibernation. Changes compared to v14: - Removed the drivers/scsi/Kconfig.kunit and drivers/scsi/Makefile.kunit files. Instead, modified drivers/scsi/Kconfig and added #include "*_test.c" directives in the appropriate .c files. Removed the EXPORT_SYMBOL() directives that were added to make the unit tests link. - Fixed a double free in a unit test. Changes compared to v13: - Reworked patch "block: Preserve the order of requeued zoned writes". - Addressed a performance concern by removing the eh_needs_prepare_resubmit SCSI driver callback and by introducing the SCSI host template flag .needs_prepare_resubmit instead. - Added a patch that adds a 'host' argument to scsi_eh_flush_done_q(). - Made the code in unit tests less repetitive. Changes compared to v12: - Added two new patches: "block: Preserve the order of requeued zoned writes" and "scsi: sd: Add a unit test for sd_cmp_sector()" - Restricted the number of zoned write retries. To my surprise I had to add "&& scmd->retries <= scmd->allowed" in the SCSI error handler to limit the number of retries. - In patch "scsi: ufs: Inform the block layer about write ordering", only set ELEVATOR_F_ZBD_SEQ_WRITE for zoned block devices. Changes compared to v11: - Fixed a NULL pointer dereference that happened when booting from an ATA device by adding an scmd->device != NULL check in scsi_needs_preparation(). - Updated Reviewed-by tags. Changes compared to v10: - Dropped the UFS MediaTek and HiSilicon patches because these are not correct and because it is safe to drop these patches. - Updated Acked-by / Reviewed-by tags. Changes compared to v9: - Introduced an additional scsi_driver callback: .eh_needs_prepare_resubmit(). - Renamed the scsi_debug kernel module parameter 'no_zone_write_lock' into 'preserves_write_order'. - Fixed an out-of-bounds access in the unit scsi_call_prepare_resubmit() unit test. - Wrapped ufshcd_auto_hibern8_update() calls in UFS host drivers with WARN_ON_ONCE() such that a kernel stack appears in case an error code is returned. - Elaborated a comment in the UFSHCI driver. Changes compared to v8: - Fixed handling of 'driver_preserves_write_order' and 'use_zone_write_lock' in blk_stack_limits(). - Added a comment in disk_set_zoned(). - Modified blk_req_needs_zone_write_lock() such that it returns false if q->limits.use_zone_write_lock is false. - Modified disk_clear_zone_settings() such that it clears q->limits.use_zone_write_lock. - Left out one change from the mq-deadline patch that became superfluous due to the blk_req_needs_zone_write_lock() change. - Modified scsi_call_prepare_resubmit() such that it only calls list_sort() if zoned writes have to be resubmitted for which zone write locking is disabled. - Added an additional unit test for scsi_call_prepare_resubmit(). - Modified the sorting code in the sd driver such that only those SCSI commands are sorted for which write locking is disabled. - Modified sd_zbc.c such that ELEVATOR_F_ZBD_SEQ_WRITE is only set if the write order is not preserved. - Included three patches for UFS host drivers that rework code that wrote directly to the auto-hibernation controller register. - Modified the UFS driver such that enabling auto-hibernation is not allowed if a zoned logical unit is present and if the controller operates in legacy mode. - Also in the UFS driver, simplified ufshcd_auto_hibern8_update(). Changes compared to v7: - Split the queue_limits member variable `use_zone_write_lock' into two member variables: `use_zone_write_lock' (set by disk_set_zoned()) and `driver_preserves_write_order' (set by the block driver or SCSI LLD). This should clear up the confusion about the purpose of this variable. - Moved the code for sorting SCSI commands by LBA from the SCSI error handler into the SCSI disk (sd) driver as requested by Christoph. Changes compared to v6: - Removed QUEUE_FLAG_NO_ZONE_WRITE_LOCK and instead introduced a flag in the request queue limits data structure. Changes compared to v5: - Renamed scsi_cmp_lba() into scsi_cmp_sector(). - Improved several source code comments. Changes compared to v4: - Dropped the patch that introduces the REQ_NO_ZONE_WRITE_LOCK flag. - Dropped the null_blk patch and added two scsi_debug patches instead. - Dropped the f2fs patch. - Split the patch for the UFS driver into two patches. - Modified several patch descriptions and source code comments. - Renamed dd_use_write_locking() into dd_use_zone_write_locking(). - Moved the list_sort() call from scsi_unjam_host() into scsi_eh_flush_done_q() such that sorting happens just before reinserting. - Removed the scsi_cmd_retry_allowed() call from scsi_check_sense() to make sure that the retry counter is adjusted once per retry instead of twice. Changes compared to v3: - Restored the patch that introduces QUEUE_FLAG_NO_ZONE_WRITE_LOCK. That patch had accidentally been left out from v2. - In patch "block: Introduce the flag REQ_NO_ZONE_WRITE_LOCK", improved the patch description and added the function blk_no_zone_write_lock(). - In patch "block/mq-deadline: Only use zone locking if necessary", moved the blk_queue_is_zoned() call into dd_use_write_locking(). - In patch "fs/f2fs: Disable zone write locking", set REQ_NO_ZONE_WRITE_LOCK from inside __bio_alloc() instead of in f2fs_submit_write_bio(). Changes compared to v2: - Renamed the request queue flag for disabling zone write locking. - Introduced a new request flag for disabling zone write locking. - Modified the mq-deadline scheduler such that zone write locking is only disabled if both flags are set. - Added an F2FS patch that sets the request flag for disabling zone write locking. - Only disable zone write locking in the UFS driver if auto-hibernation is disabled. Changes compared to v1: - Left out the patches that are already upstream. - Switched the approach in patch "scsi: Retry unaligned zoned writes" from retrying immediately to sending unaligned write commands to the SCSI error handler. Bart Van Assche (26): blk-zoned: Fix a reference count leak blk-zoned: Split disk_zone_wplugs_work() blk-zoned: Split queue_zone_wplugs_show() blk-zoned: Only handle errors after pending zoned writes have completed blk-zoned: Fix a deadlock triggered by unaligned writes blk-zoned: Fix requeuing of zoned writes block: Support block drivers that preserve the order of write requests dm-linear: Report to the block layer that the write order is preserved mq-deadline: Remove a local variable blk-mq: Clean up blk_mq_requeue_work() block: Optimize blk_mq_submit_bio() for the cache hit scenario block: Rework request allocation in blk_mq_submit_bio() block: Support allocating from a specific software queue blk-mq: Restore the zoned write order when requeuing blk-zoned: Document the locking order blk-zoned: Document locking assumptions blk-zoned: Uninline functions that are not in the hot path blk-zoned: Make disk_should_remove_zone_wplug() more robust blk-zoned: Add an argument to blk_zone_plug_bio() blk-zoned: Support pipelining of zoned writes scsi: core: Retry unaligned zoned writes scsi: sd: Increase retry count for zoned writes scsi: scsi_debug: Add the preserves_write_order module parameter scsi: scsi_debug: Support injecting unaligned write errors scsi: scsi_debug: Skip host/bus reset settle delay scsi: ufs: Inform the block layer about write ordering block/bfq-iosched.c | 2 + block/blk-mq.c | 97 ++++++---- block/blk-mq.h | 3 + block/blk-settings.c | 2 + block/blk-zoned.c | 376 +++++++++++++++++++++++++------------- block/blk.h | 33 +++- block/kyber-iosched.c | 2 + block/mq-deadline.c | 10 +- drivers/md/dm-linear.c | 6 + drivers/md/dm.c | 2 +- drivers/scsi/scsi_debug.c | 22 ++- drivers/scsi/scsi_error.c | 16 ++ drivers/scsi/sd.c | 7 + drivers/ufs/core/ufshcd.c | 7 + include/linux/blk-mq.h | 20 +- include/linux/blkdev.h | 11 +- 16 files changed, 444 insertions(+), 172 deletions(-)