On Mon, Nov 28, 2022 at 05:08:31PM +0000, Luiz Capitulino wrote:
> Hi,
>
> [ Marc, can you help reviewing? Esp. the first patch? ]
>
> This series of backports from upstream to stable 5.15 and 5.10 fixes an issue
> we're seeing on AWS ARM instances where attaching an EBS volume (which is an
> nvme device) to the instance after offlining CPUs causes the device to take
> several minutes to show up, and eventually nvme kworkers and other threads
> start getting stuck.
>
> This series fixes the issue for 5.15.79 and 5.10.155. I can't reproduce it
> on 5.4. Also, I couldn't reproduce this on x86 even with affected kernels.
>
> An easy reproducer is:
>
> 1. Start an ARM instance with 32 CPUs
> 2. Once the instance is booted, offline all CPUs but CPU 0. E.g.:
>    # for i in $(seq 1 32); do chcpu -d $i; done
> 3. Once the CPUs are offline, attach an EBS volume
> 4. Watch lsblk and dmesg in the instance
>
> Eventually, you get this stack trace:
>
> [   71.842974] pci 0000:00:1f.0: [1d0f:8061] type 00 class 0x010802
> [   71.843966] pci 0000:00:1f.0: reg 0x10: [mem 0x00000000-0x00003fff]
> [   71.845149] pci 0000:00:1f.0: PME# supported from D0 D1 D2 D3hot D3cold
> [   71.846694] pci 0000:00:1f.0: BAR 0: assigned [mem 0x8011c000-0x8011ffff]
> [   71.848458] ACPI: \_SB_.PCI0.GSI3: Enabled at IRQ 38
> [   71.850852] nvme nvme1: pci function 0000:00:1f.0
> [   71.851611] nvme 0000:00:1f.0: enabling device (0000 -> 0002)
> [  135.887787] nvme nvme1: I/O 22 QID 0 timeout, completion polled
> [  197.328276] nvme nvme1: I/O 23 QID 0 timeout, completion polled
> [  197.329221] nvme nvme1: 1/0/0 default/read/poll queues
> [  243.408619] INFO: task kworker/u64:2:275 blocked for more than 122 seconds.
> [  243.409674]       Not tainted 5.15.79 #1
> [  243.410270] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  243.411389] task:kworker/u64:2   state:D stack:    0 pid:  275 ppid:     2 flags:0x00000008
> [  243.412602] Workqueue: events_unbound async_run_entry_fn
> [  243.413417] Call trace:
> [  243.413797]  __switch_to+0x15c/0x1a4
> [  243.414335]  __schedule+0x2bc/0x990
> [  243.414849]  schedule+0x68/0xf8
> [  243.415334]  schedule_timeout+0x184/0x340
> [  243.415946]  wait_for_completion+0xc8/0x220
> [  243.416543]  __flush_work.isra.43+0x240/0x2f0
> [  243.417179]  flush_work+0x20/0x2c
> [  243.417666]  nvme_async_probe+0x20/0x3c
> [  243.418228]  async_run_entry_fn+0x3c/0x1e0
> [  243.418858]  process_one_work+0x1bc/0x460
> [  243.419437]  worker_thread+0x164/0x528
> [  243.420030]  kthread+0x118/0x124
> [  243.420517]  ret_from_fork+0x10/0x20
> [  258.768771] nvme nvme1: I/O 20 QID 0 timeout, completion polled
> [  320.209266] nvme nvme1: I/O 21 QID 0 timeout, completion polled
>
> For completeness, I tested the same test case on x86 with this series applied
> on 5.15.79 and 5.10.155 as well. It works as expected.

All now queued up, thanks.

greg k-h
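
The reproducer quoted above can be condensed into a small script; this is only a
sketch, assuming a 32-vCPU arm64 instance (CPUs numbered 0-31), root privileges,
and that the attached EBS volume enumerates as a new nvme block device:

  #!/bin/sh
  # Offline every CPU except CPU 0 (CPU numbering assumed to be 0-31).
  for i in $(seq 1 31); do chcpu -d "$i"; done

  # Attach the EBS volume from the EC2 console or CLI at this point, then
  # watch for the new nvme device and any hung-task reports in dmesg.
  watch -n 1 'lsblk; dmesg | tail -n 20'

On an affected kernel the new device takes minutes to appear and the
"I/O ... QID 0 timeout" and hung-task messages show up; with the series
applied it should appear within a few seconds.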