On 5/27/23 00:48, Brian Bunker wrote:
> Hello,
>
> One of our customers reported a significant regression in the
> performance of their iSCSI rescan when they upgraded their initiator
> Linux kernel:
>
> https://forum.proxmox.com/threads/kernel-5-15-usr-bin-iscsiadm-mode-session-sid-x-rescan-really-slow.110113/
>
> This was determined not to be an array-side issue, and I chased the
> problem down for the customer. It comes down to this patch:
>
> commit 508aebb805277c541e94ee14daba4191ff02347e
> Author: Damien Le Moal <damien.lemoal@xxxxxxx>
> Date:   Wed Jan 27 20:47:32 2021
>
>     block: introduce blk_queue_clear_zone_settings()
>
> When I connect 255 volumes with 2 paths each and run an iSCSI rescan,
> there is a significant difference in the time it takes. The iscsiadm
> rescan is a parallel sequential scan of the 255 volumes on both
> paths. It comes down to this for each device:
>
> [root@init107-18 boot]# cd /sys/bus/scsi/devices/11\:0\:0\:1
> [root@init107-18 11:0:0:1]# echo 1 > rescan
> [root@init107-18 boot]# cd /sys/bus/scsi/devices/10\:0\:0\:1
> [root@init107-18 10:0:0:1]# echo 1 > rescan
> ...
>
> (As of 5.11.0-rc5+)
> Without this patch:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.04
>
> Just adding this patch on top of the previous:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.45
>
> In the most recent Linux kernel, 6.4.0-rc3+, the regression is not as
> pronounced but is still significant.
>
> With:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.84
>
> Without:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.53
>
> The second result comes from nothing more than this change:
>
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -953,7 +953,7 @@ void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>  		blk_queue_zone_write_granularity(q,
>  						 queue_logical_block_size(q));
>  	} else {
> -		disk_clear_zone_settings(disk);
> +		/* disk_clear_zone_settings(disk); */
>  	}
>  }
>  EXPORT_SYMBOL_GPL(disk_set_zoned);
>
> From what I can tell, this patch is trying to account for a disk's
> zoned model changing to none. It looks like there is no good way to
> tell the difference between a model that changed to none and a disk
> that never reported zoned capabilities at all. The penalty on targets
> that do not support zoned capabilities at all seems pretty steep. Is
> there a better way to get what is needed here without affecting disks
> that are not zoned capable?
>
> Let me know if you need any more details on this.

Can you try this and see if it restores rescan times for your system?

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 896b4654ab00..4dd59059b788 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -915,6 +915,7 @@ static bool disk_has_partitions(struct gendisk *disk)
 void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 {
 	struct request_queue *q = disk->queue;
+	unsigned int old_model = q->limits.zoned;
 
 	switch (model) {
 	case BLK_ZONED_HM:
@@ -952,7 +953,7 @@ void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 		 */
 		blk_queue_zone_write_granularity(q,
 						 queue_logical_block_size(q));
-	} else {
+	} else if (old_model != BLK_ZONED_NONE) {
 		disk_clear_zone_settings(disk);
 	}
 }

> Thanks,
> Brian Bunker
> PURE Storage, Inc.

--
Damien Le Moal
Western Digital Research
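
To make the intent of the proposed patch concrete, here is a minimal
standalone C sketch of the guard it adds: the zone settings are
cleared only when the queue's zoned model actually transitions away
from a zoned mode, so disks that were never zoned no longer pay the
clearing cost on every rescan. The types and helpers below are
simplified stand-ins for illustration, not the kernel implementation.

/*
 * Illustrative sketch (not kernel code) of the old_model guard in
 * the proposed disk_set_zoned() change.
 */
#include <stdio.h>

enum blk_zoned_model { BLK_ZONED_NONE, BLK_ZONED_HA, BLK_ZONED_HM };

struct queue {
	enum blk_zoned_model zoned;
};

/* Stand-in for disk_clear_zone_settings(): the costly path that
 * previously ran for every non-zoned disk on each rescan. */
static void clear_zone_settings(struct queue *q)
{
	(void)q;
	printf("clearing zone settings (expensive on rescan)\n");
}

static void set_zoned(struct queue *q, enum blk_zoned_model model)
{
	enum blk_zoned_model old_model = q->zoned;

	q->zoned = model;
	if (model != BLK_ZONED_NONE) {
		/* zoned setup elided */
	} else if (old_model != BLK_ZONED_NONE) {
		/* Clear only on a real zoned -> none transition. */
		clear_zone_settings(q);
	}
}

int main(void)
{
	struct queue q = { BLK_ZONED_NONE };

	set_zoned(&q, BLK_ZONED_NONE);	/* never zoned: nothing cleared */
	q.zoned = BLK_ZONED_HM;
	set_zoned(&q, BLK_ZONED_NONE);	/* was zoned: settings cleared */
	return 0;
}

In the reporter's scenario, the first call is what happens 255 x 2
times during the rescan; with the guard in place it becomes a no-op
for disks that never reported zoned capabilities.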