Re: [PATCH] block: Revert v5.0 blk_mq_request_issue_directly() changes

Bart Van Assche <bvanassche@xxxxxxx> writes:
> blk_mq_try_issue_directly() can return BLK_STS*_RESOURCE for requests that
> have been queued. If that happens when blk_mq_try_issue_directly() is called
> by the dm-mpath driver then dm-mpath will try to resubmit a request that is
> already queued and a kernel crash follows. Since it is nontrivial to fix
> blk_mq_request_issue_directly(), revert the blk_mq_request_issue_directly()
> changes that went into kernel v5.0.
>
> This patch reverts the following commits:
> * d6a51a97c0b2 ("blk-mq: replace and kill blk_mq_request_issue_directly") # v5.0.
> * 5b7a6f128aad ("blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests") # v5.0.
> * 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0.
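
The failure mode described in the quoted commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions: every name below (toy_status, toy_request, toy_issue_directly, toy_caller_submit) is hypothetical and is not kernel code. It only shows why a "resource" return for a request that was in fact queued leads a caller to queue the same request twice.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are NOT the kernel's types or status values. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

struct toy_request {
	int times_queued;	/* how many times the request was put on a queue */
};

/*
 * Hypothetical model of the problematic direct-issue path: when the
 * driver is busy, the request is queued internally as a fallback, yet
 * the caller is still told that the request was not accepted.
 */
static enum toy_status toy_issue_directly(struct toy_request *rq, bool driver_busy)
{
	rq->times_queued++;	/* the request ends up queued either way */
	return driver_busy ? TOY_STS_RESOURCE : TOY_STS_OK;
}

/*
 * Hypothetical caller: on a resource status it assumes it still owns
 * the request and resubmits it, as the commit message describes for
 * dm-mpath. Returns how many times the request ended up queued.
 */
static int toy_caller_submit(struct toy_request *rq, bool driver_busy)
{
	if (toy_issue_directly(rq, driver_busy) == TOY_STS_RESOURCE)
		rq->times_queued++;	/* resubmission of an already queued request */
	return rq->times_queued;
}
```

In this model a request submitted while the driver is busy ends up queued twice, which corresponds to the double submission that the commit message says precedes the crash.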

When this patch is cherry-picked onto 5.0.7, it doesn't seem to fully
fix the problem for us (on ppc64le). We hit the crash below at some
point while eudev is discovering disks and one of our processes is
mounting them to see what's there:

cpu 0x4a: Vector: 400 (Instruction Access) at [c000203ff398b100]
    pc: c0000000021fe700
    lr: c0000000002001d8: blk_mq_complete_request+0x34/0x138
    sp: c000203ff398b390
   msr: 9000000010009033
  current = 0xc000203ff3901b00
  paca    = 0xc000203fff7f3680	 irqmask: 0x03	 irq_happened: 0x01
    pid   = 658, comm = kworker/u257:2
Linux version 5.0.7-openpower1 (hostboot@xxxxxxxxxxxxxxxxxxxxxxxxxx) (gcc version 6.5.0 (Buildroot 2019.02.1-00016-ge01dcd0)) #2 SMP Mon Apr 8 09:14:53 CDT 2019

[link register   ] c0000000002001d8 blk_mq_complete_request+0x34/0x138
[c000203ff398b390] c000203ff398b3d0 (unreliable)
[c000203ff398b3d0] c0000000002d0428 scsi_mq_done+0x48/0x6c
[c000203ff398b410] c0000000002ee8d8 ata_scsi_simulate+0x78/0x29c
[c000203ff398b460] c0000000002eee74 ata_scsi_queuecmd+0xa8/0x2cc
[c000203ff398b4a0] c0000000002d2ac4 scsi_queue_rq+0x7bc/0x810
[c000203ff398b540] c000000000202570 blk_mq_dispatch_rq_list+0x474/0x5c0
[c000203ff398b600] c000000000207560 blk_mq_sched_dispatch_requests+0x114/0x18c
[c000203ff398b660] c0000000002009e4 __blk_mq_run_hw_queue+0xe0/0xf8
[c000203ff398b6e0] c000000000200a5c __blk_mq_delay_run_hw_queue+0x60/0x184
[c000203ff398b740] c000000000200c20 blk_mq_run_hw_queue+0x70/0xe4
[c000203ff398b790] c0000000002077ec blk_mq_sched_insert_request+0x68/0x194
[c000203ff398b7f0] c0000000001fc1d8 blk_execute_rq_nowait+0x78/0x8c
[c000203ff398b810] c0000000001fc234 blk_execute_rq+0x48/0x90
[c000203ff398b860] c0000000002cfa10 __scsi_execute+0xd8/0x1ac
[c000203ff398b8d0] c0000000002cc31c ioctl_internal_command.constprop.2+0x50/0x144
[c000203ff398b980] c0000000002cc474 scsi_set_medium_removal+0x64/0x9c
[c000203ff398b9c0] c0080000101f2754 sd_open+0xe8/0x148 [sd_mod]
[c000203ff398ba00] c00000000018b730 __blkdev_get+0x198/0x3e4
[c000203ff398ba70] c00000000018b9cc blkdev_get+0x50/0x35c
[c000203ff398bb10] c00000000020b140 __device_add_disk+0x468/0x488
[c000203ff398bbd0] c0080000101f5484 sd_probe_async+0xd4/0x180 [sd_mod]
[c000203ff398bc60] c00000000009a128 async_run_entry_fn+0x68/0x138
[c000203ff398bca0] c000000000090858 process_one_work+0x204/0x334
[c000203ff398bd30] c000000000090ca4 worker_thread+0x2d0/0x394
[c000203ff398bdb0] c000000000096cb4 kthread+0x14c/0x154
[c000203ff398be20] c00000000000b72c ret_from_kernel_thread+0x5c/0x70

4a:mon> r

R00 = c0000000002d0428   R16 = 0000000000000001
R01 = c000203ff398b390   R17 = 0000000000000000
R02 = c000000001d5af00   R18 = 0000000000000004
R03 = c000003feb215600   R19 = 0000000000000000
R04 = c000003feb1a1900   R20 = c000003feb215600
R05 = 0000000000000005   R21 = c000003feb660000
R06 = 0000000000000020   R22 = 0000000000000402
R07 = 0000000000000000   R23 = 0000000000000000
R08 = c000003fecf86300   R24 = c000003fecf22800
R09 = 0000000000000001   R25 = c000003fecf2e800
R10 = 0000000000000000   R26 = c000003feb0f5000
R11 = c0080000101f5b20   R27 = c000003feb215720
R12 = c000203ffc1ff500   R28 = c000003fecf2e800
R13 = c000203fff7f3680   R29 = c000003feb2158b0
R14 = c000000000096b70   R30 = c000003feb660000
R15 = c000203ff3c10000   R31 = c000003feb215720
pc  = c0000000021fe700
cfar= c0000000002001d4 blk_mq_complete_request+0x30/0x138
lr  = c0000000002001d8 blk_mq_complete_request+0x34/0x138
msr = 9000000010009033   cr  = 24022222
ctr = c0000000002d03e0   xer = 0000000000000000   trap =  400
4a:mon> 
 special_registers=
S

msr    = 9000000000001033  sprg0 = 0000000000000000
pvr    = 00000000004e1202  sprg1 = 0000000000000000
dec    = ffffffe807402ddf  sprg2 = 0000000000000000
sp     = c000203ff398ab60  sprg3 = 000000000008004a
toc    = c000000001d5af00  dar   = 00000000100be2af
srr0   = c0000000021fe700  srr1  = 9000000010009033 dsisr  = 40000000
dscr   = 0000000000000010  ppr   = 0010000000000000 pir    = 0000080a
amr    = 0000000000000000  uamor = 0000000000000000
sdr1   = 0000000048022224  hdar  = 0000000000000000 hdsisr = 00000000
hsrr0  = c000000000051708 hsrr1  = 9000000000001033 hdec   = ffffffe109d0f8da
lpcr   = 0040400001d2f012  pcr   = 0000000000000000 lpidr  = 00000000
hsprg0 = c000203fff7f3680 hsprg1 = c000203fff7f3680 amor   = 0000000000000000
dabr   = 0000000048022224 dabrx  = c000000000051708
dpdes  = 0000000000000000  tir   = 0000000000000002 cir    = 00000000
fscr   = 0000000000000180  tar   = 0000000000000000 pspb   = 00000000
mmcr0  = 0000000000000000  mmcr1 = 0000000000000000 mmcr2  = 0000000000000000
pmc1   = 00000000 pmc2 = 00000000  pmc3 = 00000000  pmc4   = 00000000
mmcra  = 0000000000000000   siar = 0000000000000000 pmc5   = 80000031
sdar   = 0000000000000000   sier = 0000000000000000 pmc6   = 80000001
ebbhr  = 0000000000000000  ebbrr = 0000000000000000 bescr  = 0000000000000000
iamr   = 0000000000000000
hfscr  = 000000000000059f  dhdes = c000000000051708 rpr    = 00000103070f1f3f
dawr   = 0000000000000000  dawrx = 0000000000000000 ciabr  = 0000000000000000
pidr   = 000000000000001a  tidr  = 0000000000000000
asdr   = 0000000000000000  psscr = 2000000000300332
ptcr   = 0000203fff7e0004



For reference, this is the backtrace we got with a nearly pure 5.0.6
(it carries one xhci patch):

cpu 0x27: Vector: 380 (Data Access Out of Range) at [c000001ff537b2d0]
    pc: c0000000001fe754: blk_add_timer+0x2c/0xa4
    lr: c0000000002001d4: blk_mq_start_request+0xa8/0xe0
    sp: c000001ff537b560
   msr: 9000000000009033
   dar: 43f900c0000020d3
  current = 0xc000001ff52dd880
  paca    = 0xc000001ffffe1f80     irqmask: 0x03     irq_happened: 0x01
    pid   = 812, comm = kworker/u321:1
Linux version 5.0.6-openpower1 (hostboot@xxxxxxxxxxxxxxxxxxxxxxxxxx) (gcc version 6.5.0 (Buildroot 2019.02.1-00015-gc5f183f)) #2 SMP Wed Apr 3 04:45:56 CDT 2019
enter ? for help
[c000001ff537b590] c0000000002001d4 blk_mq_start_request+0xa8/0xe0
[c000001ff537b5c0] c0000000002d25b0 scsi_queue_rq+0x508/0x810
[c000001ff537b660] c0000000002023c0 blk_mq_dispatch_rq_list+0x474/0x5c0
[c000001ff537b720] c000000000207360 blk_mq_sched_dispatch_requests+0x114/0x18c
[c000001ff537b780] c000000000200834 __blk_mq_run_hw_queue+0xe0/0xf8
[c000001ff537b800] c0000000002008ac __blk_mq_delay_run_hw_queue+0x60/0x184
[c000001ff537b860] c000000000200a70 blk_mq_run_hw_queue+0x70/0xe4
[c000001ff537b8b0] c0000000002075ec blk_mq_sched_insert_request+0x68/0x194
[c000001ff537b910] c0000000001fc028 blk_execute_rq_nowait+0x78/0x8c
[c000001ff537b930] c0000000001fc084 blk_execute_rq+0x48/0x90
[c000001ff537b980] c0000000002cf7b0 __scsi_execute+0xd8/0x1ac
[c000001ff537b9f0] c0000000002d3bd0 scsi_probe_and_add_lun+0x1ec/0xa48
[c000001ff537bb40] c0000000002d4778 __scsi_add_device+0xf4/0xf8
[c000001ff537bba0] c0000000002ef068 ata_scsi_scan_host+0xf4/0x1fc
[c000001ff537bc30] c0000000002e94b4 async_port_probe+0x6c/0x78
[c000001ff537bc60] c00000000009a0b0 async_run_entry_fn+0x68/0x138
[c000001ff537bca0] c0000000000907e0 process_one_work+0x204/0x334
[c000001ff537bd30] c000000000090c2c worker_thread+0x2d0/0x394
[c000001ff537bdb0] c000000000096c3c kthread+0x14c/0x154
[c000001ff537be20] c00000000000b72c ret_from_kernel_thread+0x5c/0x70
27:mon>

-- 
Stewart Smith
OPAL Architect, IBM.



