On Thu, Nov 12 2020 at 15:15, Thomas Gleixner wrote:
> On Thu, Nov 12 2020 at 08:55, Jason Gunthorpe wrote:
>> On Wed, Aug 26, 2020 at 01:16:28PM +0200, Thomas Gleixner wrote:
>> They were unable to bisect further into the series because some of the
>> interior commits don't boot :(
>>
>> When we try to load the mlx5 driver on a bare metal VF it gets this:
>>
>> [Thu Oct 22 08:54:51 2020] DMAR: DRHD: handling fault status reg 2
>> [Thu Oct 22 08:54:51 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
>> [Thu Oct 22 08:55:04 2020] mlx5_core 0000:42:00.1 eth4: Link down
>> [Thu Oct 22 08:55:11 2020] mlx5_core 0000:42:00.1 eth4: Link up
>> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: mlx5_cmd_eq_recover:264:(pid 3390): Recovered 1 EQEs on cmd_eq
>> [Thu Oct 22 08:55:54 2020] mlx5_core 0000:42:00.2: wait_func_handle_exec_timeout:1051:(pid 3390): cmd0: CREATE_EQ(0x301) recovered after timeout
>> [Thu Oct 22 08:55:54 2020] DMAR: DRHD: handling fault status reg 102
>> [Thu Oct 22 08:55:54 2020] DMAR: [INTR-REMAP] Request device [42:00.2] fault index 1600 [fault reason 37] Blocked a compatibility format interrupt request
>>
>> If you have any idea Ziyad and Itay can run any debugging you like.
>>
>> I suppose it is because this series is handing out compatability
>> addr/data pairs while the IOMMU is setup to only accept remap ones
>> from SRIOV VFs?
>
> So the issue seems to be that the VF device has the default irq domain
> assigned and not the remapping domain. Let me stare into the code to see
> how these VF devices are set up and registered with the IOMMU/remap
> unit.

Found the reason. Will fix it after walking the dogs. Brain needs some
fresh air.

Thanks,

        tglx