> On May 26, 2023, at 4:31 PM, Damien Le Moal <dlemoal@xxxxxxxxxx> wrote:
>
> On 5/27/23 00:48, Brian Bunker wrote:
>> Hello,
>>
>> One of our customers reported a significant regression in the
>> performance of their iSCSI rescan when they upgraded their initiator
>> Linux kernel:
>>
>> https://forum.proxmox.com/threads/kernel-5-15-usr-bin-iscsiadm-mode-session-sid-x-rescan-really-slow.110113/
>>
>> This was determined not to be an array side issue, but I chased the
>> problem for him. The issue comes down to a patch:
>>
>> commit 508aebb805277c541e94ee14daba4191ff02347e
>> Author: Damien Le Moal <damien.lemoal@xxxxxxx>
>> Date:   Wed Jan 27 20:47:32 2021
>>
>>     block: introduce blk_queue_clear_zone_settings()
>>
>> When I connect 255 volumes with 2 paths to each and run an iSCSI
>> rescan there is a significant difference in the time it takes. The
>> iscsiadm rescan is a parallel sequential scan of the 255 volumes on
>> both paths. It comes down to this for each device:
>>
>> [root@init107-18 boot]# cd /sys/bus/scsi/devices/11\:0\:0\:1
>> [root@init107-18 11:0:0:1]# echo 1 > rescan
>> [root@init107-18 boot]# cd /sys/bus/scsi/devices/10\:0\:0\:1
>> [root@init107-18 10:0:0:1]# echo 1 > rescan
>> ...
>>
>> (As 5.11.0-rc5+)
>> Without this patch:
>> Command being timed: "iscsiadm --mode session --rescan"
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.04
>>
>> Just adding this patch on the previous:
>> Command being timed: "iscsiadm --mode session --rescan"
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.45
>>
>> In the most recent Linux kernel, 6.4.0-rc3+, the regression is not as
>> pronounced but is still significant.
>>
>> With:
>> Command being timed: "iscsiadm --mode session --rescan"
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.84
>>
>> Without:
>> Command being timed: "iscsiadm --mode session --rescan"
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.53
>>
>> With the second being only the result of:
>> --- a/block/blk-settings.c
>> +++ b/block/blk-settings.c
>> @@ -953,7 +953,7 @@ void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>>  		blk_queue_zone_write_granularity(q,
>>  						queue_logical_block_size(q));
>>  	} else {
>> -		disk_clear_zone_settings(disk);
>> +		/* disk_clear_zone_settings(disk); */
>>  	}
>>  }
>>  EXPORT_SYMBOL_GPL(disk_set_zoned);
>>
>> From what I can tell this patch is trying to account for a change in
>> zoned behavior moving to none. It looks like it is saying that there
>> is no good way to tell between this moving to none and never reporting
>> block zoned capabilities at all. The penalty on targets which don't
>> support zoned capabilities at all seems pretty steep. Is there a
>> better way to get what is needed here without affecting disks which
>> are not zoned capable?
>>
>> Let me know if you need any more details on this.
>
> Can you try this and see if that restores rescan times for your system ?
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 896b4654ab00..4dd59059b788 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -915,6 +915,7 @@ static bool disk_has_partitions(struct gendisk *disk)
>  void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>  {
>  	struct request_queue *q = disk->queue;
> +	unsigned int old_model = q->limits.zoned;
>
>  	switch (model) {
>  	case BLK_ZONED_HM:
> @@ -952,7 +953,7 @@ void disk_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>  		 */
>  		blk_queue_zone_write_granularity(q,
>  						queue_logical_block_size(q));
> -	} else {
> +	} else if (old_model != BLK_ZONED_NONE) {
>  		disk_clear_zone_settings(disk);
>  	}
>  }

Yes. This works to eliminate the delay since it doesn't penalize the device
that came in as BLK_ZONED_NONE.

Command being timed: "iscsiadm --mode session --rescan"
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.53

Thanks,
Brian

>
>>
>> Thanks,
>> Brian Bunker
>> PURE Storage, Inc.
>
> --
> Damien Le Moal
> Western Digital Research
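
For reference, a minimal sketch of how the timing numbers quoted above can be
reproduced (assumptions: run as root, iSCSI sessions already logged in, and
GNU time installed at /usr/bin/time, whose -v verbose output produces the
"Command being timed" / "Elapsed (wall clock) time" lines; the per-LUN loop is
a rough hypothetical equivalent of what the iscsiadm rescan drives via sysfs):

    # Time the full rescan across all logged-in sessions.
    /usr/bin/time -v iscsiadm --mode session --rescan

    # Rough per-LUN equivalent: trigger a rescan of every SCSI device
    # (H:C:T:L directories) through sysfs, as shown for 11:0:0:1 and
    # 10:0:0:1 in the original report.
    for dev in /sys/bus/scsi/devices/*:*:*:*; do
        echo 1 > "$dev/rescan"
    done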