> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Wednesday, July 21, 2021 2:17 PM
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>; Saeed Mahameed
>
> On Mon, Jul 19 2021 at 20:33, Dexuan Cui wrote:
> > This is a bare metal x86-64 host with Intel CPUs. Yes, I believe the
> > issue is in the IOMMU Interrupt Remapping mechanism rather in the
> > NIC driver. I just don't understand why bringing the CPUs online and
> > offline can work around the issue. I'm trying to dump the IOMMU IR
> > table entries to look for any error.
>
> can you please enable GENERIC_IRQ_DEBUGFS and provide the output of
>
> cat /sys/kernel/debug/irq/irqs/$THENICIRQS
>
> Thanks,
>
>         tglx

Sorry for the late response! I checked the below sys file, and the output is
exactly the same in the good/bad cases -- in both cases, I use maxcpus=8;
the only difference in the good case is that I online and then offline CPU 8~31:

for i in `seq 8 31`; do echo 1 > /sys/devices/system/cpu/cpu$i/online; done
for i in `seq 8 31`; do echo 0 > /sys/devices/system/cpu/cpu$i/online; done

# cat /sys/kernel/debug/irq/irqs/209
handler:  handle_edge_irq
device:   0000:d8:00.0
status:   0x00004000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x35409200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_MOVE_PCNTXT
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_ON_ACTIVATE
            IRQD_CAN_RESERVE
            IRQD_HANDLE_ENFORCE_IRQCTX
node:     1
affinity: 0-7
effectiv: 5
pending:
domain:  INTEL-IR-MSI-3-3
 hwirq:   0x6c00000
 chip:    IR-PCI-MSI
  flags:   0x30
             IRQCHIP_SKIP_SET_WAKE
             IRQCHIP_ONESHOT_SAFE
 parent:
    domain:  INTEL-IR-3
     hwirq:   0x20000
     chip:    INTEL-IR
      flags:   0x0
     parent:
        domain:  VECTOR
         hwirq:   0xd1
         chip:    APIC
          flags:   0x0
         Vector:    42
         Target:     5
         move_in_progress: 0
         is_managed:       0
         can_reserve:      1
         has_reserved:     0
         cleanup_pending:  0

Thanks,
Dexuan