Jason,

(trimmed CC list a bit)

On Thu, Nov 12 2020 at 08:55, Jason Gunthorpe wrote:
> On Wed, Aug 26, 2020 at 01:16:28PM +0200, Thomas Gleixner wrote:
> They were unable to bisect further into the series because some of the
> interior commits don't boot :(
>
> When we try to load the mlx5 driver on a bare metal VF it gets this:
>
> [Thu Oct 22 08:54:51 2020] DMAR: DRHD: handling fault status reg 2
> [Thu Oct 22 08:54:51 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
> [Thu Oct 22 08:55:04 2020] mlx5_core 0000:42:00.1 eth4: Link down
> [Thu Oct 22 08:55:11 2020] mlx5_core 0000:42:00.1 eth4: Link up
> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: mlx5_cmd_eq_recover:264:(pid 3390): Recovered 1 EQEs on cmd_eq
> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: wait_func_handle_exec_timeout:1051:(pid 3390): cmd0: CREATE_EQ(0x301) recovered after timeout
> [Thu Oct 22 08:55:54 2020] DMAR: DRHD: handling fault status reg 102
> [Thu Oct 22 08:55:54 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
>
> If you have any idea Ziyad and Itay can run any debugging you like.
>
> I suppose it is because this series is handing out compatibility
> addr/data pairs while the IOMMU is setup to only accept remap ones
> from SRIOV VFs?

So the issue seems to be that the VF device has the default irq domain
assigned and not the remapping domain. Let me stare into the code to see
how these VF devices are set up and registered with the IOMMU/remap
unit.

Thanks,

        tglx