Re: [PATCH v7 1/7] block: Introduce the use_zone_write_lock member variable

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/9/23 18:33, Damien Le Moal wrote:
On 8/10/23 05:23, Bart Van Assche wrote:
Writes in sequential write required zones must happen at the write
pointer. Even if the submitter of the write commands (e.g. a filesystem)
submits writes for sequential write required zones in order, the block
layer or the storage controller may reorder these write commands.

The zone locking mechanism in the mq-deadline I/O scheduler serializes
write commands for sequential zones. Some but not all storage controllers
require this serialization. Introduce a new request queue limit member
variable to allow block drivers to indicate that they preserve the order
of write commands and thus do not require serialization of writes per
zone.

Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Ming Lei <ming.lei@xxxxxxxxxx>
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
  block/blk-settings.c   | 6 ++++++
  include/linux/blkdev.h | 1 +
  2 files changed, 7 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 0046b447268f..b75c97971860 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -56,6 +56,7 @@ void blk_set_default_limits(struct queue_limits *lim)
  	lim->alignment_offset = 0;
  	lim->io_opt = 0;
  	lim->misaligned = 0;
+	lim->use_zone_write_lock = true;
  	lim->zoned = BLK_ZONED_NONE;

Given that the default for zoned is BLK_ZONED_NONE, having use_zone_write_lock
default to true is strange. It would be better to set the default to false and
have disk_set_zoned() set it to true if needed, with an additional argument to
specify if it should be the case or not. E.g., for SMR drives, sd.c would call
something like:

disk_set_zoned(sdkp->disk, BLK_ZONED_HM, sdp->use_zone_write_lock);

sd.c would default to sdp->use_zone_write_lock == true and UFS driver can set
it to false. That would be cleaner I think.

Hi Damien,

Thanks for the detailed feedback.

My concerns about the above proposal are as follows:
* The information about whether or not the zone write lock should be used comes
  from the block driver, e.g. a SCSI LLD.
* sdp, the SCSI disk pointer, is owned by the ULD.
* An ULD may be attached and detached multiple times during the lifetime of a
  logical unit without the LLD being informed about this. So how to set
  sdp->use_zone_write_lock without introducing a new callback or member variable
  in a data structure owned by the LLD?

Hence my preference to store use_zone_write_lock in a data structure that has the
same lifetime as the logical unit and not in any data structure controlled by the
ULD.

  	lim->zone_write_granularity = 0;
  	lim->dma_alignment = 511;
@@ -685,6 +686,11 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
  						   b->max_secure_erase_sectors);
  	t->zone_write_granularity = max(t->zone_write_granularity,
  					b->zone_write_granularity);
+	/*
+	 * Whether or not the zone write lock should be used depends on the
+	 * bottom driver only.
+	 */
+	t->use_zone_write_lock = b->use_zone_write_lock;

Given that DM bio targets do not have a scheduler and do not have a zone lock
bitmap allocated, I do not think this is necessary at all. This can remain to
false, thus in sync with the fact that there is no IO scheduler.

How about the request-based dm drivers (dm-mpath and dm-target)? Isn't the dm-mpath
driver request based because that allows the I/O scheduler to be configured on top
of the dm-mpath driver? From https://lwn.net/Articles/274292/: "The basic idea to
resolve the issue is to move multipathing layer down below the I/O scheduler, and it
was proposed from Mike Christie as the block layer (request-based) multipath".

Thanks,

Bart.




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux