Re: [bisected] bfq regression on latest linux-block/for-next

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 4/2/21 9:39 PM, Paolo Valente wrote:

Il giorno 1 apr 2021, alle ore 03:27, Yi Zhang <yi.zhang@xxxxxxxxxx> ha scritto:

Hi
Hi

We reproduced this bfq regression[3] on ppc64le with blktests[2] on the latest linux-block/for-next branch, seems it was introduced with [1] from my bisecting, pls help check it. Let me know if you need any testing for it, thanks.

Thanks for reporting this bug and finding the candidate offending commit. Could you try this test with my dev kernel, which might provide more information? The kernel is here:
https://github.com/Algodev-github/bfq-mq

Alternatively, I could try to provide you with patches to instrument your kernel.
HI Paolo
I tried your dev kernel, but with no luck to reproduce it, could you provide the debug patch based on latest linux-block/for-next?

The first option may be quicker.

In both cases, having KASAN active could be rather helpful too.

Looking forward to your feedback,
Paolo

[1]
commit 430a67f9d6169a7b3e328bceb2ef9542e4153c7c (HEAD, refs/bisect/bad)
Author: Paolo Valente <paolo.valente@xxxxxxxxxx>
Date:   Thu Mar 4 18:46:27 2021 +0100

    block, bfq: merge bursts of newly-created queues

[2] blktests: nvme_trtype=tcp ./check nvme/011

[3]
[  109.342525] run blktests nvme/011 at 2021-03-31 20:58:58
[  109.497429] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[  109.512868] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[  109.584932] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:9b73e8531afa4320963cc96571e0acb1.
[  109.585809] nvme nvme0: creating 128 I/O queues.
[  109.596570] nvme nvme0: mapped 128/0/0 default/read/poll queues.
[  109.654170] nvme nvme0: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420
[  155.366535] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[  155.366568] nvmet: ctrl 1 fatal error occurred!
[  155.366608] nvme nvme0: starting error recovery
[  155.374861] block nvme0n1: no usable path - requeuing I/O
[  155.374891] block nvme0n1: no usable path - requeuing I/O
[  155.374911] block nvme0n1: no usable path - requeuing I/O
[  155.374923] block nvme0n1: no usable path - requeuing I/O
[  155.374934] block nvme0n1: no usable path - requeuing I/O
[  155.374954] block nvme0n1: no usable path - requeuing I/O
[  155.374973] block nvme0n1: no usable path - requeuing I/O
[  155.374984] block nvme0n1: no usable path - requeuing I/O
[  155.375004] block nvme0n1: no usable path - requeuing I/O
[  155.375024] block nvme0n1: no usable path - requeuing I/O
[  155.375674] nvme nvme0: Reconnecting in 10 seconds...
[  180.967462] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
[  180.967486] nvmet: ctrl 2 fatal error occurred!
[  193.427906] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  193.427934] rcu: 	40-...0: (1 GPs behind) idle=f3e/1/0x4000000000000000 softirq=535/535 fqs=2839
[  193.427966] rcu: 	42-...0: (1 GPs behind) idle=792/1/0x4000000000000000 softirq=160/160 fqs=2839
[  193.427995] 	(detected by 5, t=6002 jiffies, g=7961, q=7219)
[  193.428030] Sending NMI from CPU 5 to CPUs 40:
[  199.235530] CPU 40 didn't respond to backtrace IPI, inspecting paca.
[  199.235551] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 217 (kworker/40:0H)
[  199.235579] Back trace of paca->saved_r1 (0xc000000012973380) (possibly stale):
[  199.235594] Call Trace:
[  199.235601] [c000000012973380] [c0000000129733e0] 0xc0000000129733e0 (unreliable)
[  199.235633] [c000000012973420] [c000000000933008] bfq_allow_bio_merge+0x78/0x110
[  199.235673] [c000000012973460] [c0000000008d7470] elv_bio_merge_ok+0x90/0xb0
[  199.235711] [c000000012973490] [c0000000008d811c] elv_merge+0x6c/0x1b0
[  199.235747] [c0000000129734e0] [c0000000008ea1d8] blk_mq_sched_try_merge+0x48/0x250
[  199.235785] [c000000012973540] [c00000000092d29c] bfq_bio_merge+0xfc/0x230
[  199.235820] [c0000000129735c0] [c0000000008f9e24] __blk_mq_sched_bio_merge+0xa4/0x210
[  199.235848] [c000000012973620] [c0000000008f288c] blk_mq_submit_bio+0xec/0x710
[  199.235877] [c0000000129736c0] [c0000000008dfad4] submit_bio_noacct+0x534/0x680
[  199.235915] [c000000012973760] [c0000000005e0308] iomap_dio_submit_bio+0xd8/0x100
[  199.235953] [c000000012973790] [c0000000005e0ed4] iomap_dio_bio_actor+0x3c4/0x570
[  199.235991] [c000000012973830] [c0000000005db284] iomap_apply+0x1f4/0x3e0
[  199.236017] [c000000012973920] [c0000000005e06e4] __iomap_dio_rw+0x204/0x5b0
[  199.236054] [c0000000129739f0] [c0000000005e0ab0] iomap_dio_rw+0x20/0x80
[  199.236090] [c000000012973a10] [c00800000e4ea660] xfs_file_dio_write_aligned+0xb8/0x1b0 [xfs]
[  199.236230] [c000000012973a60] [c00800000e4eabfc] xfs_file_write_iter+0x104/0x190 [xfs]
[  199.236368] [c000000012973a90] [c008000010440724] nvmet_file_submit_bvec+0xfc/0x160 [nvmet]
[  199.236412] [c000000012973b10] [c008000010440b54] nvmet_file_execute_io+0x2cc/0x3b0 [nvmet]
[  199.236455] [c000000012973b90] [c00800000fa142d8] nvmet_tcp_io_work+0xce0/0xd10 [nvmet_tcp]
[  199.236493] [c000000012973c70] [c000000000179534] process_one_work+0x294/0x580
[  199.236533] [c000000012973d10] [c0000000001798c8] worker_thread+0xa8/0x650
[  199.236568] [c000000012973da0] [c000000000184180] kthread+0x190/0x1a0
[  199.236595] [c000000012973e10] [c00000000000d4ec] ret_from_kernel_thread+0x5c/0x70
[  199.236710] Sending NMI from CPU 5 to CPUs 42:
[  205.044364] CPU 42 didn't respond to backtrace IPI, inspecting paca.
[  205.044381] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 3004 (kworker/42:2H)
[  205.044408] Back trace of paca->saved_r1 (0xc00000085194b520) (possibly stale):
[  205.044423] Call Trace:
[  205.044440] [c00000085194b520] [c00000079ac7f700] 0xc00000079ac7f700 (unreliable)
[  205.044468] [c00000085194b540] [c00000000092d248] bfq_bio_merge+0xa8/0x230
[  205.044494] [c00000085194b5c0] [c0000000008f9e24] __blk_mq_sched_bio_merge+0xa4/0x210
[  205.044531] [c00000085194b620] [c0000000008f288c] blk_mq_submit_bio+0xec/0x710
[  205.044569] [c00000085194b6c0] [c0000000008dfad4] submit_bio_noacct+0x534/0x680
[  205.044607] [c00000085194b760] [c0000000005e0308] iomap_dio_submit_bio+0xd8/0x100
[  205.044645] [c00000085194b790] [c0000000005e0ed4] iomap_dio_bio_actor+0x3c4/0x570
[  205.044672] [c00000085194b830] [c0000000005db284] iomap_apply+0x1f4/0x3e0
[  205.044698] [c00000085194b920] [c0000000005e06e4] __iomap_dio_rw+0x204/0x5b0
[  205.044735] [c00000085194b9f0] [c0000000005e0ab0] iomap_dio_rw+0x20/0x80
[  205.044771] [c00000085194ba10] [c00800000e4ea660] xfs_file_dio_write_aligned+0xb8/0x1b0 [xfs]
[  205.044897] [c00000085194ba60] [c00800000e4eabfc] xfs_file_write_iter+0x104/0x190 [xfs]
[  205.045021] [c00000085194ba90] [c008000010440724] nvmet_file_submit_bvec+0xfc/0x160 [nvmet]
[  205.045064] [c00000085194bb10] [c008000010440b54] nvmet_file_execute_io+0x2cc/0x3b0 [nvmet]
[  205.045106] [c00000085194bb90] [c00800000fa142d8] nvmet_tcp_io_work+0xce0/0xd10 [nvmet_tcp]
[  205.045144] [c00000085194bc70] [c000000000179534] process_one_work+0x294/0x580
[  205.045181] [c00000085194bd10] [c0000000001798c8] worker_thread+0xa8/0x650
[  205.045216] [c00000085194bda0] [c000000000184180] kthread+0x190/0x1a0
[  205.045242] [c00000085194be10] [c00000000000d4ec] ret_from_kernel_thread+0x5c/0x70




Best Regards,
  Yi Zhang






[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux