On 8/3/2022 8:51 PM, Dexuan Cui wrote:
Jeffrey's 4 recent patches added Multi-MSI support to the pci-hyperv driver. Unfortunately, one of the patches, i.e. b4b77778ecc5, causes a regression in a fio test for the Azure VM SKU Standard L64s v2 (64 AMD vCPUs, 8 NVMe drives): when fio runs against all 8 NVMe drives, it runs fine with a low io-depth (e.g., 2 or 4); with a high io-depth (e.g., 256), queue-29 of each NVMe drive suddenly stops receiving interrupts, and the NVMe core code has to abort the queue after a 30-second timeout. Queue-29 then receives interrupts again for several seconds before going quiet once more, and this pattern repeats:

[ 223.891249] nvme nvme2: I/O 320 QID 29 timeout, aborting
[ 223.896231] nvme nvme0: I/O 320 QID 29 timeout, aborting
[ 223.898340] nvme nvme4: I/O 832 QID 29 timeout, aborting
[ 259.471309] nvme nvme2: I/O 320 QID 29 timeout, aborting
[ 259.476493] nvme nvme0: I/O 321 QID 29 timeout, aborting
[ 259.482967] nvme nvme0: I/O 322 QID 29 timeout, aborting

Some other symptoms: the throughput of the NVMe drives drops due to commit b4b77778ecc5, and while the fio test is running the kernel prints soft lock-up messages from time to time.

Commit b4b77778ecc5 itself looks good, and at the moment it's unclear where the issue is. While the issue is being investigated, restore the old behavior in hv_compose_msi_msg(), i.e., don't reuse the existing IRTE allocation for single-MSI and MSI-X. This is a stopgap for the above NVMe issue.

Fixes: b4b77778ecc5 ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()")
Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
Cc: Jeffrey Hugo <quic_jhugo@xxxxxxxxxxx>
Cc: Carl Vanderlip <quic_carlv@xxxxxxxxxxx>
---
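For context, a minimal sketch of what the stopgap conceptually changes in hv_compose_msi_msg(): the early-return path added by b4b77778ecc5, which reuses an already-allocated IRTE, is gated so it only applies to multi-MSI, while single MSI and MSI-X fall through to a fresh allocation as before. The helper name hv_msi_can_reuse_irte and the exact msi_desc field paths below are assumptions for illustration, not a quote of the actual patch.

/*
 * Conceptual sketch only -- not the actual diff.  Gate the "reuse the
 * previous IRTE allocation" fast path so it fires only for multi-MSI;
 * single MSI and MSI-X fall back to allocating a new interrupt, as
 * they did before commit b4b77778ecc5.
 */
static bool hv_msi_can_reuse_irte(struct irq_data *data)
{
	struct msi_desc *msi_desc = irq_data_get_msi_desc(data);
	bool multi_msi;

	/* Multi-MSI: a (non-MSI-X) MSI block using more than one vector. */
	multi_msi = !msi_desc->pci.msi_attrib.is_msix &&
		    msi_desc->nvec_used > 1;

	/* Reuse only if an allocation already exists and this is multi-MSI. */
	return data->chip_data && multi_msi;
}

In the driver, such a predicate would presumably replace the unconditional "if (data->chip_data)" reuse check near the top of hv_compose_msi_msg(), so that single-MSI and MSI-X interrupts again go through the full allocation path on every compose.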
I'm sorry a regression has been discovered. Right now, the issue doesn't make sense to me. I'd love to know what you find out.
This stopgap solution appears reasonable to me.

Reviewed-by: Jeffrey Hugo <quic_jhugo@xxxxxxxxxxx>