Re: [PATCHv3 2/2] nvme: cancel requests for real

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



It did come back.  Just took a little longer.

Alan

[  394.319378] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  394.334228] nvme nvme0: new error during reset
[  394.340125] nvme nvme0: Shutdown timeout set to 10 seconds
[  394.343452] nvme nvme0: 56/0/0 default/read/poll queues
[  394.343557] FAULT_INJECTION: forcing a failure.
               name fail_io_timeout, interval 1, probability 100, space 0, times 995
[  394.343559] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  394.343559] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  394.343560] Call Trace:
[  394.343561]  <IRQ>
[  394.343563]  dump_stack+0x6d/0x9a
[  394.343564]  should_fail.cold.5+0x32/0x42
[  394.343566]  blk_should_fake_timeout+0x26/0x30
[  394.343566]  blk_mq_complete_request+0x15/0x30
[  394.343569]  nvme_irq+0xd9/0x1f0 [nvme]
[  394.343571]  __handle_irq_event_percpu+0x44/0x190
[  394.343573]  handle_irq_event_percpu+0x32/0x80
[  394.343574]  handle_irq_event+0x3b/0x5a
[  394.343575]  handle_edge_irq+0x87/0x190
[  394.343577]  do_IRQ+0x54/0xe0
[  394.343578]  common_interrupt+0xf/0xf
[  394.343579]  </IRQ>
[  394.343581] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  394.343582] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00 [  394.343583] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd [  394.343584] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f [  394.343585] RDX: 0000005bd0b51f04 RSI: 0000000031573862 RDI: 0000000000000000 [  394.343585] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0 [  394.343586] R10: 011c65ca3dee2b0c R11: ffff9b537fc6bb44 R12: 0000000000000002 [  394.343586] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  394.343588]  cpuidle_enter+0x2e/0x40
[  394.343590]  call_cpuidle+0x23/0x40
[  394.343591]  do_idle+0x230/0x270
[  394.343592]  cpu_startup_entry+0x1d/0x20
[  394.343594]  start_secondary+0x170/0x1c0
[  394.343596]  secondary_startup_64+0xb6/0xc0
[  424.524341] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  424.524389] nvme nvme0: Abort status: 0x0
[  454.729622] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  454.740442] nvme nvme0: new error during reset
[  454.750403] nvme nvme0: Shutdown timeout set to 10 seconds
[  454.753508] nvme nvme0: 56/0/0 default/read/poll queues
[  454.753596] FAULT_INJECTION: forcing a failure.
               name fail_io_timeout, interval 1, probability 100, space 0, times 994
[  454.753598] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  454.753599] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  454.753599] Call Trace:
[  454.753600]  <IRQ>
[  454.753603]  dump_stack+0x6d/0x9a
[  454.753604]  should_fail.cold.5+0x32/0x42
[  454.753605]  blk_should_fake_timeout+0x26/0x30
[  454.753606]  blk_mq_complete_request+0x15/0x30
[  454.753609]  nvme_irq+0xd9/0x1f0 [nvme]
[  454.753612]  __handle_irq_event_percpu+0x44/0x190
[  454.753613]  handle_irq_event_percpu+0x32/0x80
[  454.753615]  handle_irq_event+0x3b/0x5a
[  454.753616]  handle_edge_irq+0x87/0x190
[  454.753617]  do_IRQ+0x54/0xe0
[  454.753619]  common_interrupt+0xf/0xf
[  454.753620]  </IRQ>
[  454.753622] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  454.753623] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00 [  454.753624] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd [  454.753625] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f [  454.753626] RDX: 00000069e16d295b RSI: 0000000031573862 RDI: 0000000000000000 [  454.753626] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0 [  454.753627] R10: 011c65eebaf0c79e R11: ffff9b537fc6bb44 R12: 0000000000000002 [  454.753627] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  454.753630]  cpuidle_enter+0x2e/0x40
[  454.753632]  call_cpuidle+0x23/0x40
[  454.753633]  do_idle+0x230/0x270
[  454.753634]  cpu_startup_entry+0x1d/0x20
[  454.753636]  start_secondary+0x170/0x1c0
[  454.753637]  secondary_startup_64+0xb6/0xc0
[  484.934587] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  484.934641] nvme nvme0: Abort status: 0x0
[  492.613861] INFO: task kworker/u114:2:1182 blocked for more than 245 seconds.
[  492.613884]       Not tainted 5.7.0-rc7+ #2
[  492.613912] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  492.613948] kworker/u114:2  D    0  1182      2 0x80004000
[  492.613953] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[  492.613954] Call Trace:
[  492.613956]  __schedule+0x2dc/0x710
[  492.613958]  ? __kfifo_to_user_r+0x90/0x90
[  492.613960]  schedule+0x44/0xb0
[  492.613961]  blk_mq_freeze_queue_wait+0x4b/0xb0
[  492.613962]  ? finish_wait+0x80/0x80
[  492.613963]  blk_mq_freeze_queue+0x1a/0x20
[  492.613966]  nvme_update_disk_info+0x62/0x3b0 [nvme_core]
[  492.613968]  __nvme_revalidate_disk+0x8d/0x140 [nvme_core]
[  492.613970]  nvme_revalidate_disk+0xa4/0x140 [nvme_core]
[  492.613971]  ? blk_mq_run_hw_queue+0xba/0x100
[  492.613973]  revalidate_disk+0x2b/0xa0
[  492.613975]  nvme_validate_ns+0x46/0x5b0 [nvme_core]
[  492.613978]  ? __nvme_submit_sync_cmd+0xe0/0x1b0 [nvme_core]
[  492.613980]  nvme_scan_work+0x25a/0x310 [nvme_core]
[  492.613983]  process_one_work+0x1ab/0x380
[  492.613984]  worker_thread+0x37/0x3b0
[  492.613985]  kthread+0x120/0x140
[  492.613990]  ? create_worker+0x1b0/0x1b0
[  492.613992]  ? kthread_park+0x90/0x90
[  492.614011]  ret_from_fork+0x35/0x40
[  515.139879] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  515.150634] nvme nvme0: new error during reset
[  515.158310] blk_update_request: I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [  515.158346] Buffer I/O error on dev nvme0n1, logical block 0, async page read
[  515.160800] nvme nvme0: Shutdown timeout set to 10 seconds
[  515.163954] nvme nvme0: 56/0/0 default/read/poll queues
[root@localhost ~]#



On 5/28/20 12:14 PM, Keith Busch wrote:
On Thu, May 28, 2020 at 12:11:18PM -0700, Keith Busch wrote:
You can expect your fake timeout with 100% probablility to take 150
seconds minimum to return a retryable request.
Sorry, I should have said 300 seconds. We restart the timer for aborts,
so 5 requeues * 30 seconds * 2 timer starts = 300 seconds.



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux