On 5/27/23 00:48, Brian Bunker wrote:
> Hello,
>
> One of our customers reported a significant regression in the
> performance of their iSCSI rescan when they upgraded their initiator
> Linux kernel:
>
> https://forum.proxmox.com/threads/kernel-5-15-usr-bin-iscsiadm-mode-session-sid-x-rescan-really-slow.110113/
>
> This was determined not to be an array-side issue, and I chased the
> problem down for the customer. It comes down to this patch:
>
> commit 508aebb805277c541e94ee14daba4191ff02347e
> Author: Damien Le Moal <damien.lemoal@xxxxxxx>
> Date:   Wed Jan 27 20:47:32 2021
>
>     block: introduce blk_queue_clear_zone_settings()
>
> When I connect 255 volumes with 2 paths each and run an iSCSI rescan,
> there is a significant difference in the time it takes. The iscsiadm
> rescan is a parallel sequential scan of the 255 volumes on both
> paths. It comes down to this for each device:
>
> [root@init107-18 boot]# cd /sys/bus/scsi/devices/11\:0\:0\:1
> [root@init107-18 11:0:0:1]# echo 1 > rescan
> [root@init107-18 boot]# cd /sys/bus/scsi/devices/10\:0\:0\:1
> [root@init107-18 10:0:0:1]# echo 1 > rescan
> ...
>
> (As of 5.11.0-rc5+)
> Without this patch:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.04
>
> Just adding this patch on top of the previous:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.45
>
> In the most recent Linux kernel, 6.4.0-rc3+, the regression is not as
> pronounced but is still significant.
>
> With:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.84
>
> Without:
> 	Command being timed: "iscsiadm --mode session --rescan"
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.53
>
> The second result comes from nothing more than this change:
>
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -953,7 +953,7 @@ void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>  		blk_queue_zone_write_granularity(q,
>  						 queue_logical_block_size(q));
>  	} else {
> -		disk_clear_zone_settings(disk);
> +		/* disk_clear_zone_settings(disk); */
>  	}
>  }
>  EXPORT_SYMBOL_GPL(disk_set_zoned);
>
> From what I can tell, this patch is trying to account for a disk's
> zoned model changing to none. It looks like there is no good way to
> tell the difference between a model that changed to none and a disk
> that never reported zoned capabilities at all. The penalty on targets
> that do not support zoned capabilities at all seems pretty steep. Is
> there a better way to get what is needed here without affecting disks
> that are not zoned capable?
>
> Let me know if you need any more details on this.

Can you try this and see if it restores rescan times for your system?

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 896b4654ab00..4dd59059b788 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -915,6 +915,7 @@ static bool disk_has_partitions(struct gendisk *disk)
 void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 {
 	struct request_queue *q = disk->queue;
+	unsigned int old_model = q->limits.zoned;
 
 	switch (model) {
 	case BLK_ZONED_HM:
@@ -952,7 +953,7 @@ void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 		 */
 		blk_queue_zone_write_granularity(q,
 						 queue_logical_block_size(q));
-	} else {
+	} else if (old_model != BLK_ZONED_NONE) {
 		disk_clear_zone_settings(disk);
 	}
 }

> Thanks,
> Brian Bunker
> PURE Storage, Inc.

--
Damien Le Moal
Western Digital Research
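
To make the intent of the proposed patch concrete, here is a minimal
standalone C sketch of the guard it adds: the zone settings are
cleared only when the queue's zoned model actually transitions away
from a zoned mode, so disks that were never zoned no longer pay the
clearing cost on every rescan. The types and helpers below are
simplified stand-ins for illustration, not the kernel implementation.

/*
 * Illustrative sketch (not kernel code) of the old_model guard in
 * the proposed disk_set_zoned() change.
 */
#include <stdio.h>

enum blk_zoned_model { BLK_ZONED_NONE, BLK_ZONED_HA, BLK_ZONED_HM };

struct queue {
	enum blk_zoned_model zoned;
};

/* Stand-in for disk_clear_zone_settings(): the costly path that
 * previously ran for every non-zoned disk on each rescan. */
static void clear_zone_settings(struct queue *q)
{
	(void)q;
	printf("clearing zone settings (expensive on rescan)\n");
}

static void set_zoned(struct queue *q, enum blk_zoned_model model)
{
	enum blk_zoned_model old_model = q->zoned;

	q->zoned = model;
	if (model != BLK_ZONED_NONE) {
		/* zoned setup elided */
	} else if (old_model != BLK_ZONED_NONE) {
		/* Clear only on a real zoned -> none transition. */
		clear_zone_settings(q);
	}
}

int main(void)
{
	struct queue q = { BLK_ZONED_NONE };

	set_zoned(&q, BLK_ZONED_NONE);	/* never zoned: nothing cleared */
	q.zoned = BLK_ZONED_HM;
	set_zoned(&q, BLK_ZONED_NONE);	/* was zoned: settings cleared */
	return 0;
}

In the reporter's scenario, the first call is what happens 255 x 2
times during the rescan; with the guard in place it becomes a no-op
for disks that never reported zoned capabilities.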