Re: [PATCH] blk-mq: don't queue passthrough request into scheduler

Hi,
With your patch applied I can no longer reproduce the original issue,
but I hit a different one. Please have a look.

[  136.002469] qla2xxx [0000:af:00.6]-800e:2: DEVICE RESET SUCCEEDED
nexus:2:0:0 cmd=000000008d9455f0.
[  136.011591] qla2xxx [0000:af:00.6]-8009:2: TARGET RESET ISSUED
nexus=2:0 cmd=000000008d9455f0.
[  136.206701] qla2xxx [0000:af:00.6]-800e:2: TARGET RESET SUCCEEDED
nexus:2:0 cmd=000000008d9455f0.
[  168.768518] qla2xxx [0000:af:00.7]-801c:4: Abort command issued
nexus=4:0:0 -- 2003.
[  168.776324] qla2xxx [0000:af:00.7]-8009:4: DEVICE RESET ISSUED
nexus=4:0:0 cmd=000000005642c8ad.
[  168.836726] qla2xxx [0000:af:00.7]-5032:4: ABT_IOCB: Invalid
completion handle (21) -- timed-out.
[  168.845758] qla2xxx [0000:af:00.7]-800e:4: DEVICE RESET SUCCEEDED
nexus:4:0:0 cmd=000000005642c8ad.
[  168.854881] qla2xxx [0000:af:00.7]-8018:4: ADAPTER RESET ISSUED nexus=4:0:0.
[  168.861994] qla2xxx [0000:af:00.7]-00af:4: Performing ISP error
recovery - ha=00000000f7f90a13.
[  170.944477] qla2xxx [0000:af:00.7]-00b4:4: Done chip reset cleanup.
[  170.960311] qla2xxx [0000:af:00.7]-00d2:4: Init Firmware **** FAILED ****.
[  170.967241] qla2xxx [0000:af:00.7]-8016:4: qla82xx_restart_isp ****
FAILED ****.
[  170.974693] qla2xxx [0000:af:00.7]-b02f:4: HW State: NEED RESET
[  170.980671] qla2xxx [0000:af:00.7]-009b:4: Device state is 0x4 = Need Reset.
[  170.987784] qla2xxx [0000:af:00.7]-009d:4: Device state is 0x4 = Need Reset.
[  171.968416] qla2xxx [0000:af:00.6]-6001:2: Adapter reset needed.
[  171.974549] qla2xxx [0000:af:00.6]-b031:2: Device state is 0x4 = Need Reset.
[  171.981680] qla2xxx [0000:af:00.6]-009b:2: Device state is 0x4 = Need Reset.
[  171.988776] qla2xxx [0000:af:00.6]-009d:2: Device state is 0x4 = Need Reset.
[  171.995872] qla2xxx [0000:af:00.6]-00af:2: Performing ISP error
recovery - ha=0000000071956e30.
[  174.080409] qla2xxx [0000:af:00.6]-00b4:2: Done chip reset cleanup.
[  174.086822] qla2xxx [0000:af:00.6]-801c:2: Abort command issued
nexus=2:0:0 -- 2002.
[  174.094635] qla2xxx [0000:af:00.6]-8018:2: ADAPTER RESET ISSUED nexus=2:0:0.
[  174.101728] qla2xxx [0000:af:00.6]-8017:2: ADAPTER RESET FAILED nexus=2:0:0.
[  174.108910] scsi 2:0:0:0: rejecting I/O to offline device
[  175.104377] qla2xxx [0000:af:00.7]-00b6:4: Device state is 0x4 = Need Reset.
[  175.111480] qla2xxx [0000:af:00.7]-00b7:4: HW State: COLD/RE-INIT.
[  177.184324] ------------[ cut here ]------------
[  177.188972] WARNING: CPU: 7 PID: 813 at
drivers/scsi/qla2xxx/qla_nx.c:507
qla82xx_need_reset_handler+0x1a4/0x470 [qla2xxx]
[  177.200131] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
dns_resolver nfs lockd grace fscache netfs rfkill sunrpc vfat fat
dm_multipath intel_rapl_msr intel_rapl_common intel_uncore_frequency
intel_uncore_frequency_common isst_if_common nfit libnvdimm mgag200
x86_pkg_temp_thermal i2c_algo_bit intel_powerclamp ipmi_ssif coretemp
drm_shmem_helper drm_kms_helper rapl syscopyarea mei_me sysfillrect
ses sysimgblt intel_cstate enclosure acpi_ipmi ioatdma acpi_tad
ipmi_si lpc_ich intel_pch_thermal intel_uncore acpi_power_meter mei
ipmi_devintf pcspkr dca hpilo ipmi_msghandler drm fuse xfs sd_mod sg
qla2xxx qla4xxx bnx2x nvme_fc nvme nvme_fabrics crct10dif_pclmul
crc32_pclmul nvme_core libiscsi smartpqi ghash_clmulni_intel
scsi_transport_iscsi nvme_common uas scsi_transport_fc t10_pi mdio
iscsi_boot_sysfs libcrc32c usb_storage scsi_transport_sas crc32c_intel
tg3 hpwdt wmi dm_mirror dm_region_hash dm_log dm_mod
[  177.281274] CPU: 7 PID: 813 Comm: qla2xxx_2_dpc Kdump: loaded Not
tainted 6.4.0-rc1+ #1
[  177.289328] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380
Gen10, BIOS U30 03/09/2020
[  177.297903] RIP: 0010:qla82xx_need_reset_handler+0x1a4/0x470 [qla2xxx]
[  177.304513] Code: a6 cd e8 8f 49 4f ce eb 0a bf 64 00 00 00 e8 63
0d a6 cd be 28 c0 11 06 48 89 ef e8 16 a9 ff ff 83 f8 01 74 07 83 eb
01 75 df <0f> 0b be 44 21 20 08 48 89 ef e8 fd a8 ff ff be 38 21 20 08
48 89
[  177.323398] RSP: 0018:ffffaf0b0508be48 EFLAGS: 00010246
[  177.328656] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000130000
[  177.335834] RDX: ffffaf0b0508be00 RSI: ffffaf0b0593c028 RDI: ffff9b2420022000
[  177.343013] RBP: ffff9b2420022000 R08: 0000000000120000 R09: ffff9b272fff2878
[  177.350193] R10: 0000000000000162 R11: 0000000003c23c49 R12: 00000000fffe3bdb
[  177.357373] R13: 0000000000000000 R14: ffff9b274fca6840 R15: 0000000001000000
[  177.364554] FS:  0000000000000000(0000) GS:ffff9b272ffc0000(0000)
knlGS:0000000000000000
[  177.372695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  177.378476] CR2: 000055b7803d43d8 CR3: 000000079c220001 CR4: 00000000007706e0
[  177.385656] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  177.392836] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  177.400015] PKRU: 55555554
[  177.402738] Call Trace:
[  177.405201]  <TASK>
[  177.407316]  qla82xx_device_state_handler+0x285/0x4a0 [qla2xxx]
[  177.413303]  ? __pfx_qla2x00_do_dpc+0x10/0x10 [qla2xxx]
[  177.418588]  qla82xx_abort_isp+0x190/0x290 [qla2xxx]
[  177.423613]  qla2x00_do_dpc+0x743/0xa60 [qla2xxx]
[  177.428374]  ? __pfx_qla2x00_do_dpc+0x10/0x10 [qla2xxx]
[  177.433655]  kthread+0xdf/0x110
[  177.436821]  ? __pfx_kthread+0x10/0x10
[  177.440595]  ret_from_fork+0x29/0x50
[  177.444201]  </TASK>
[  177.446402] ---[ end trace 0000000000000000 ]---
[  177.451052] qla2xxx [0000:af:00.6]-00b6:2: Device state is 0x1 =
Cold/Re-init.
[  177.458319] qla2xxx [0000:af:00.6]-009d:2: Device state is 0x1 =
Cold/Re-init.
[  177.465588] qla2xxx [0000:af:00.6]-009e:2: HW State: INITIALIZING.
[    0.020959] APIC: Disabling requested cpu. Processor 0/0x0 ignored.

On Fri, May 12, 2023 at 23:06, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> Passthrough (pt) requests shouldn't be queued to the scheduler, especially
> since some schedulers (such as bfq) assume that req->bio is always available
> and that blk-cgroup can be retrieved via the bio.
>
> Sometimes a pt request can be part of error handling, so it is better to
> always queue it into hctx->dispatch directly.
>
> Fix this issue by queuing pt request from plug list to hctx->dispatch
> directly.
>
> Reported-by: Guangwu Zhang <guazhang@xxxxxxxxxx>
> Investigated-by: Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>
> Fixes: 1c2d2fff6dc0 ("block: wire-up support for passthrough plugging")
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> ---
> Guang Wu, please test this patch and provide us the result.
>
>  block/blk-mq.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f6dad0886a2f..11efaefa26c3 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2711,6 +2711,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
>         struct request *requeue_list = NULL;
>         struct request **requeue_lastp = &requeue_list;
>         unsigned int depth = 0;
> +       bool pt = false;
>         LIST_HEAD(list);
>
>         do {
> @@ -2719,7 +2720,9 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
>                 if (!this_hctx) {
>                         this_hctx = rq->mq_hctx;
>                         this_ctx = rq->mq_ctx;
> -               } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx) {
> +                       pt = blk_rq_is_passthrough(rq);
> +               } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx ||
> +                               pt != blk_rq_is_passthrough(rq)) {
>                         rq_list_add_tail(&requeue_lastp, rq);
>                         continue;
>                 }
> @@ -2731,10 +2734,15 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
>         trace_block_unplug(this_hctx->queue, depth, !from_sched);
>
>         percpu_ref_get(&this_hctx->queue->q_usage_counter);
> -       if (this_hctx->queue->elevator) {
> +       if (this_hctx->queue->elevator && !pt) {
>                 this_hctx->queue->elevator->type->ops.insert_requests(this_hctx,
>                                 &list, 0);
>                 blk_mq_run_hw_queue(this_hctx, from_sched);
> +       } else if (pt) {
> +               spin_lock(&this_hctx->lock);
> +               list_splice_tail_init(&list, &this_hctx->dispatch);
> +               spin_unlock(&this_hctx->lock);
> +               blk_mq_run_hw_queue(this_hctx, from_sched);
>         } else {
>                 blk_mq_insert_requests(this_hctx, this_ctx, &list, from_sched);
>         }
> --
> 2.38.1
>
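
For anyone following the thread, here is a rough userspace sketch of what
the two hunks above do, as I read them. It is only a model under
assumptions: the fake_* names and the int-based hctx/ctx fields are made up
for illustration and are not kernel symbols, and it does not try to
reproduce the kernel's list ordering or locking. The first helper mirrors
the batching rule (a request joins the current batch only if hctx, ctx and
the passthrough flag all match the first request), and the second mirrors
the three-way routing decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct request; not the kernel type. */
struct fake_rq {
    int hctx;          /* stand-in for rq->mq_hctx */
    int ctx;           /* stand-in for rq->mq_ctx */
    bool passthrough;  /* stand-in for blk_rq_is_passthrough(rq) */
};

/* Where a batch ends up after the patch (labels are illustrative only). */
enum fake_target {
    TARGET_SCHEDULER,  /* elevator insert_requests() path */
    TARGET_DISPATCH,   /* spliced onto hctx->dispatch */
    TARGET_SW_QUEUE,   /* blk_mq_insert_requests() path */
};

/*
 * Split rqs[0..n) into a batch that matches rqs[0] on hctx, ctx and the
 * passthrough flag, and a requeue remainder for a later pass.
 * batch[] and requeue[] must each have room for n entries.
 * Returns the batch length and stores the remainder length in *nr_requeue.
 */
static size_t fake_take_batch(const struct fake_rq *rqs, size_t n,
                              struct fake_rq *batch, struct fake_rq *requeue,
                              size_t *nr_requeue)
{
    size_t depth = 0;

    assert(batch && requeue && nr_requeue);
    *nr_requeue = 0;

    for (size_t i = 0; i < n; i++) {
        if (rqs[i].hctx != rqs[0].hctx || rqs[i].ctx != rqs[0].ctx ||
            rqs[i].passthrough != rqs[0].passthrough)
            requeue[(*nr_requeue)++] = rqs[i];  /* handled in a later pass */
        else
            batch[depth++] = rqs[i];
    }
    return depth;
}

/* Routing decision from the second hunk: pt batches never reach the scheduler. */
static enum fake_target fake_pick_target(bool has_elevator, bool pt)
{
    if (has_elevator && !pt)
        return TARGET_SCHEDULER;
    if (pt)
        return TARGET_DISPATCH;
    return TARGET_SW_QUEUE;
}
```

The point of adding the passthrough flag to the batching condition is that a
batch is then either all pt or all non-pt, so one routing decision per batch
is enough.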


-- 

Guangwu Zhang, RHCE, ISTQB, ITIL

Quality Engineer, Kernel Storage QE

Red Hat




